I am a policy school graduate and Flatiron Data Science Bootcamp alum with a passion for data, tech, and policy analysis. I currently work at the New York City Taxi and Limousine Commission, at the intersection of data analytics and policy research, supporting the agency's policy initiatives for the for-hire transportation sector.
- Use machine learning to classify whether a civil ticket issued in NYC will result in fine collection
- Analyzed over 210,000 civil tickets issued in New York City using 25 associated variables
- Automated download of over 200 files from the Census Bureau using Selenium
- Ran iterations of logistic regression, random forest, decision tree, and XGBoost models to select the best model
- Selected random forest as the preferred model, with a 71% accuracy score and a 66% precision score
- Classify tweets about Google and Apple products as positive emotion, negative emotion, or no emotion detected
- Trained 4 different machine learning classification models
- Tested different text vectorization tools and used machine learning tools to address class imbalance
- Identified Multinomial Naive Bayes as the best model, with a 77% macro precision score
- Multiclass classification modeling to predict risk of injury resulting from traffic crashes in Chicago
- Merged 3 datasets with over 550,000 observations for analysis
- Ran multiple machine learning models and used grid search to identify the best hyperparameters
- Selected XGBoost as the preferred model, with a 95% accuracy score and an 87% macro precision score
- Linear regression modeling to analyze the impact of a range of housing factors on house prices in King County
- Used home sale data on 15,000 houses and 23 variables to conduct linear regression modeling
- Analyzed and interpreted variables’ coefficients to identify important features associated with house price