Predicting water use in U.S. cities

Objective
As the U.S. climate becomes more volatile and the population continues to rise, informed water use is as important as ever. A friend of mine, Nathan Jeffries, and I set out to explore urban water use after realizing how little city-level water use data is publicly available across the United States. Our motivation was to fill geographic data gaps, which may help inform policy decisions and improve the use of two vital resources: water and energy.

We aimed to build a predictive model for urban water use, as well as a model that predicts the amount of electricity needed to process and distribute water, an important part of the energy-water nexus. These models fall under the category of land use regression, a form of spatial prediction. We explored four types of models: k-nearest neighbors (KNN), linear regression, ridge regression, and lasso regression.
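To make the spatial-prediction idea concrete, here is a minimal sketch of k-nearest-neighbors regression, one of the four model types above. It is not our project code: the function name `knn_predict`, the use of plain (x, y) coordinates as the only predictors, and the numbers in `cities` are all made up for illustration. The other three model types are not shown.

```python
import math

def knn_predict(known, query, k=3):
    """Estimate a value at `query` as the mean of the k nearest known points.

    known: list of ((x, y), value) pairs; query: an (x, y) tuple.
    Coordinates as the only predictors is an assumption made for this sketch.
    """
    if not 1 <= k <= len(known):
        raise ValueError("k must be between 1 and the number of known points")
    # Rank the known points by Euclidean distance to the query location.
    by_distance = sorted(known, key=lambda item: math.dist(item[0], query))
    nearest = by_distance[:k]
    # Unweighted average of the neighbors' values.
    return sum(value for _, value in nearest) / k

# Hypothetical data: (x, y) position and water use in arbitrary units.
cities = [
    ((0.0, 0.0), 100.0),
    ((1.0, 0.0), 120.0),
    ((0.0, 1.0), 110.0),
    ((5.0, 5.0), 300.0),
    ((6.0, 5.0), 320.0),
]

# A location with no data borrows the average of its three closest neighbors.
estimate = knn_predict(cities, (0.5, 0.5), k=3)
print(estimate)  # 110.0
```

The choice of k trades smoothness against local detail: a small k follows nearby cities closely, while a larger k averages over a wider region.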

Conclusions
We found that the KNN and ridge models for water use prediction performed best overall, and that they performed better for larger cities with higher water consumption. None of our models for processing electricity performed well, predominantly because cities vary in the types of energy they use to process their drinking water, be that natural gas, coal, electricity, or others.

The main takeaway from this project is that more primary data is needed to develop accurate prediction models for urban water use and for processing energy use. These findings may motivate agencies to direct resources toward better collection of water use data, in order to illuminate our society's consumption of this precious, declining resource.


Here you can also check out a poster of this project presented at an Energy and Resources department forum hosted by UC Berkeley. Please reach out to me if you have any feedback about this project or would like to collaborate.


