Python	NumPy	pandas	statsmodels	scikit-learn

Slides live here. Clean notebooks coming soon

Southern California Reservoir Capacity Classifier

Classified drought levels at summer's end (2000-2013) for reservoirs in Southern California using storage data and exogenous variables.

Problem Statement

California’s recent drought has placed unprecedented demands on our freshwater resources, renewing enthusiasm for surface water infrastructure investments such as raising dams to capture more water in wet years.

Reservoir improvements would need to consider the frequency and the extent to which these dams are depleted.

My analysis used time-series reservoir storage data from around Los Angeles County to classify storage levels at summer’s end (September 1), given data from earlier in the year.

Guiding Questions

Does climate serve as a valid predictor in classifying water availability?
Can water availability be predicted using earlier monthly storage measurements?

Data

Storage data sourced from the Department of Water Resources California Data Exchange Center (CDEC)
Climate data sourced from Berkeley Earth
- Monthly average temperature around Los Angeles (average temperature and average error, 2000-2013)
Additional information sourced from Wikipedia: elevation of the dam, year completed, dam type (material), heights (in feet and meters), capacity (in feet and meters).

Analytical Approach

Project Notes:

Adjusted storage measurements as proportions of the reservoir's capacity
Made predictor features out of the storage af dataframe (1/1 - 6/1)
Included climate data
- Created dummy variables
  - Missing values (monthly storage measurements) were filled with the means of their adjacent neighbors
Made classes out of the storage af dataframe (9/1)
- Multiclass variables for four different reservoir conditions
Train/Test Splits (climate data only goes to 2013)
- Train set (2000-2010)
- Test set (2011-2012)
- Holdout set (2013)
Classification model
- Feature engineering
- Parameter optimization
Future work:
- Do I download more historical data (pre-2000)?
- Do I incorporate population data?
Visualization
- Flask & D3.js
  - https://www.dashingd3js.com/table-of-contents
  - The source: https://github.com/uwdata
- What I really wanted:
  - http://bl.ocks.org/lokesh005/7640d9b562bf59b561d6
  - https://www.ucas.com/corporate/data-and-analysis/ucas-undergraduate-releases/equality-he-reports
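
The preprocessing steps in the project notes (storage as a proportion of capacity, filling a missing monthly reading with the mean of its neighbors, and turning the proportion into one of four classes) might look roughly like the sketch below. This is not the project's actual code: the function names, the sample readings, the 100,000 capacity figure, and above all the class cut points (0.25, 0.50, 0.75) are assumptions made for illustration, since the notes do not state how the four reservoir conditions were defined.

```python
import pandas as pd


def to_capacity_fraction(storage: pd.Series, capacity: float) -> pd.Series:
    """Express storage readings as a fraction of the reservoir's capacity."""
    return storage / capacity


def fill_with_neighbor_mean(values: pd.Series) -> pd.Series:
    """Fill each missing reading with the mean of the readings on either side."""
    neighbor_mean = (values.shift(1) + values.shift(-1)) / 2
    # A gap stays missing if either neighbor is missing as well.
    return values.fillna(neighbor_mean)


def label_condition(fraction: float) -> int:
    """Map a capacity fraction to one of four classes (0 = lowest storage)."""
    # Hypothetical cut points; the project notes do not give the real ones.
    edges = (0.25, 0.50, 0.75)
    return int(sum(fraction >= edge for edge in edges))


# Hypothetical monthly readings (same units as capacity) with one gap.
storage = pd.Series(
    [50_000.0, None, 70_000.0, 65_000.0],
    index=["01-01", "02-01", "03-01", "04-01"],
)
fractions = fill_with_neighbor_mean(to_capacity_fraction(storage, 100_000.0))
labels = fractions.map(label_condition)
```

Filling from the two adjacent months keeps every reservoir-year usable when a single reading is missing, at the cost of smoothing over whatever actually happened that month.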

Tool Stack

AWS t2.micro EC2 instance with a PostgreSQL database
Jupyter notebook
Python 3.5
Pandas, Matplotlib, Seaborn
scikit-learn
Plotly

Might include a step-by-step series of examples that tell you how to get a development environment running

Visuals [In Development]

Conclusions

More to come. Will explain insights gleaned, model evaluation, or patterns in visualization.

Best Performer: Random Forest Classifier

Max depth: 3
Number of estimators: 3
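
For reference, a minimal sketch of fitting a scikit-learn random forest with these two settings on a year-based split (train 2000-2010, test 2011-2012, holdout 2013) could look like the following. The data here is synthetic and the column names (`jan_fraction`, `jun_fraction`, `avg_temp`, `condition`) are hypothetical stand-ins; the real feature set and class definitions are not reproduced here, so the score this produces says nothing about the project's results.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data: 14 years x 5 hypothetical reservoirs.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(2000, 2014), 5)
frame = pd.DataFrame({
    "year": years,
    "jan_fraction": rng.uniform(0, 1, years.size),
    "jun_fraction": rng.uniform(0, 1, years.size),
    "avg_temp": rng.uniform(10, 25, years.size),
})
# Synthetic four-class target; the cut points are an assumption.
frame["condition"] = np.digitize(frame["jun_fraction"], [0.25, 0.50, 0.75])

features = ["jan_fraction", "jun_fraction", "avg_temp"]

# Split by year rather than at random, so later years never inform earlier ones.
train = frame[frame["year"] <= 2010]
test = frame[frame["year"].between(2011, 2012)]
holdout = frame[frame["year"] == 2013]

# Settings reported above; random_state is added only for repeatability.
model = RandomForestClassifier(n_estimators=3, max_depth=3, random_state=0)
model.fit(train[features], train["condition"])

test_accuracy = model.score(test[features], test["condition"])
```

The holdout year is left untouched here on purpose: it would only be scored once, after feature engineering and parameter optimization are finished.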

More to come

Limitations

Assumes business-as-usual water demand
- No natural disasters (wildfires and earthquakes)
- A static population size
- Unchanging urban, agricultural, and environmental uses
Limited to reservoirs that had data available on CDEC
- Storage is the most complete predictor variable, with public storage data available for most reservoirs
- Only reservoirs with recent data (2000-2017) were used in this analysis, since recent years reflect contemporary population size, consumption, water demand, etc.
- Some reservoirs had storage data running from the 1980s to 2001
- It is unclear why CDEC stopped recording monthly storage data for some reservoirs

Future Work

Explain what next steps could involve

Author

Andrew Tom

License

This project is licensed under the MIT License - see the LICENSE.md file for details

Acknowledgments

Problem & Visual Inspiration: http://ww2.kqed.org/lowdown/2015/09/21/now-that-summers-over-what-do-californias-reservoirs-look-like-a-real-time-visualization/
etc
