Utsav Chaudhary's Projects
The Amazon Vine program is a service that allows manufacturers and publishers to receive reviews for their products. We had access to approximately 50 datasets, each containing reviews of a specific product category, from clothing apparel to wireless products. We picked one of these datasets, video games. We used PySpark for the ETL process: we extracted the dataset, transformed the data, connected to an AWS RDS instance, and loaded the transformed data into the database, accessed through pgAdmin. Next, we used Pandas to determine whether there is any bias toward favorable reviews from Vine members in the dataset. We summarized the analysis for Jennifer to submit to the SellBy stakeholders.
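The bias check described above comes down to comparing the share of five-star reviews between Vine and non-Vine reviewers. The following is a minimal sketch of that comparison in plain Python rather than Pandas; the field names `vine` and `star_rating` and the sample records are assumptions made up for illustration, not data from the project.

```python
def five_star_share(reviews, vine_flag):
    """Return the percentage of five-star reviews for one reviewer group.

    `reviews` is a list of dicts with hypothetical keys "vine" ("Y"/"N")
    and "star_rating" (1-5). Returns None when the group is empty.
    """
    group = [r for r in reviews if r["vine"] == vine_flag]
    if not group:
        return None
    five_star = sum(1 for r in group if r["star_rating"] == 5)
    return 100.0 * five_star / len(group)


# Illustrative records only; not taken from the real dataset.
sample_reviews = [
    {"vine": "Y", "star_rating": 5},
    {"vine": "Y", "star_rating": 3},
    {"vine": "N", "star_rating": 5},
    {"vine": "N", "star_rating": 4},
    {"vine": "N", "star_rating": 2},
    {"vine": "N", "star_rating": 5},
]

vine_share = five_star_share(sample_reviews, "Y")
non_vine_share = five_star_share(sample_reviews, "N")
```

A noticeably higher share for the Vine group than for the non-Vine group would hint at a bias toward favorable reviews, although a real analysis would also need to account for group sizes.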
Citibike: We created a dashboard and story in Tableau to convince investors that a bike-sharing program in Des Moines is a solid business proposal. To solidify the proposal, one of the key stakeholders wanted to see a bike trip analysis, so we created a new DataFrame from the Citi Bike system data page.
A collaborative project looking into the likelihood of Covid-19 infection in the United States.
Credit risk is an inherently unbalanced classification problem, as good loans easily outnumber risky loans. Therefore, we needed to employ different techniques to train and evaluate models with unbalanced classes. Jill asked us to use the imbalanced-learn and scikit-learn libraries to build and evaluate models using resampling.
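To illustrate what resampling does to unbalanced classes, here is a minimal sketch of random oversampling written in plain Python. It is a stand-in for the idea only and is not the imbalanced-learn code used in the project; the function name, labels, and sample data are assumptions made up for illustration.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_features, out_labels = list(features), list(labels)
    for label, count in counts.items():
        # Row positions that belong to this class in the original data.
        positions = [i for i, y in enumerate(labels) if y == label]
        for _ in range(target - count):
            i = rng.choice(positions)
            out_features.append(features[i])
            out_labels.append(label)
    return out_features, out_labels


# Hypothetical toy data: four good loans and one risky loan.
X = [[700], [720], [680], [710], [540]]
y = ["good", "good", "good", "good", "risky"]

X_resampled, y_resampled = random_oversample(X, y)
balanced_counts = Counter(y_resampled)
```

After resampling, both classes have the same number of rows, so a model trained on the result no longer sees the risky class as rare. Resampling should be applied to the training split only, so that evaluation still reflects the real class balance.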
We created a report that includes what cryptocurrencies are on the trading market and how they could be grouped to create a classification system for this new investment. The data Martha provided was not ideal, so we processed it to fit the machine learning models. Since there is no known output for what Martha is looking for, we decided to use unsupervised learning. To group the cryptocurrencies, we and Martha decided on a clustering algorithm. We used data visualizations to share our findings with the board.
Software: Python 3.6.1, Visual Studio Code. Project Overview: A Colorado Board of Elections employee gave our company tasks to complete the election audit of a recent local congressional election.
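An election audit of this kind typically involves counting total votes, votes per candidate, and the winner. The sketch below shows one way to do that with the standard library; the function name and the candidate names are hypothetical and are not taken from the project.

```python
from collections import Counter


def tally_votes(ballots):
    """Return (total votes, {candidate: (votes, percent)}, winner)."""
    if not ballots:
        raise ValueError("no ballots to tally")
    counts = Counter(ballots)
    total = sum(counts.values())
    results = {name: (n, 100.0 * n / total) for name, n in counts.items()}
    # most_common(1) returns the candidate with the highest vote count.
    winner = counts.most_common(1)[0][0]
    return total, results, winner


# Hypothetical ballots, one candidate name per vote cast.
ballots = ["Candidate A", "Candidate B", "Candidate A", "Candidate C", "Candidate A"]
total_votes, results, winner = tally_votes(ballots)
```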
Performing analysis on Kickstarter data to uncover trends. The purpose of this analysis is to find outcomes based on the given criteria: launch date, goals, pledged amount, and successful and canceled shows.
Our company helped Basil and Sadhana create a map showing the earthquake data in relation to the tectonic plates' location on the earth, and a second map showing all the earthquakes with a magnitude greater than 4.5. We also came up with a solution to show the data on a third map. We used JavaScript and the D3.js library to retrieve the coordinates and magnitudes of the earthquakes from the GeoJSON data. We also used the Leaflet library to plot the data on a Mapbox map through an API request and added interactivity for the earthquake data. Resources used: Leaflet library, JavaScript, D3.js library, API.
MechaCar prototypes: Collected summary statistics on the pounds per square inch (PSI) of the suspension coils from the manufacturing lots. Ran t-tests to determine if the manufacturing lots are statistically different from the population mean. Designed a statistical study to compare the performance of MechaCar vehicles against vehicles from other manufacturers.
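The t-tests mentioned above compare a lot's sample mean with a population mean. As a minimal sketch, the one-sample t statistic is t = (sample mean - population mean) / (s / sqrt(n)), where s is the sample standard deviation. The function name, the PSI readings, and the population mean of 1500 below are illustrative assumptions, not values from the project, and the sketch stops at the statistic without computing a p-value.

```python
import math
import statistics


def one_sample_t_statistic(sample, population_mean):
    """Return the one-sample t statistic for `sample` against `population_mean`."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    sample_std = statistics.stdev(sample)  # uses the n - 1 denominator
    if sample_std == 0:
        raise ValueError("sample has zero variance")
    standard_error = sample_std / math.sqrt(n)
    return (statistics.mean(sample) - population_mean) / standard_error


# Hypothetical PSI readings for one manufacturing lot.
lot_psi = [1498, 1502, 1500, 1496, 1504]
t_lot = one_sample_t_statistic(lot_psi, 1500)

# A small hand-checkable case: mean 4, s = 2, n = 3, so t = sqrt(3).
t_small = one_sample_t_statistic([2, 4, 6], 2)
```

A t statistic near zero, as for the hypothetical lot here, means the lot mean sits close to the population mean; turning the statistic into a p-value requires the t distribution with n - 1 degrees of freedom.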
Used web scraping methods to extract data and identify HTML components, with Beautiful Soup and Splinter to automate the scrape, MongoDB to store the data, and Flask to display the data.
The company Pewlett-Hackard wants to know the number of employees and their titles. The company also asked us to determine the number of employees eligible for the mentorship program, as well as the number of retirees, their titles, and their departments.
Do visit my website: https://utsavchaudharygithub.github.io/ We created a website for Roza and her volunteers to identify the top 10 bacterial species in their belly buttons. That way, if Improbable Beef identifies a species as a candidate to manufacture synthetic beef, Roza's volunteers will be able to identify whether that species is in their navel.
Visualized the data with several charts, including box-and-whisker plots, pie charts, and line charts. We compared urban, suburban, and rural areas based on driver count, fares, and ride count.
Exploring data, handling NaN values, and combining different DataFrames to build clean data. Filtered the DataFrames to create more useful subsets.
Assigned to retrieve data on Oahu, Hawaii for our client Surf n' Shake. Our data source was hawaii.sqlite, which allowed our client to make forecasts for his ice cream store. Tools: Code, Jupyter Notebook, Pandas, SQLite, Flask, Python.
Provides a more in-depth analysis of UFO sightings by allowing users to filter for multiple criteria at the same time. We added table filters for city, state, country, and shape.
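Filtering on several criteria at once means keeping only the rows that match every filter the user filled in, while ignoring the empty ones. The sketch below shows that logic in Python; it is an illustration of the idea only, and the function name, field names, and sample sightings are assumptions rather than the project's own code or data.

```python
def filter_sightings(sightings, **criteria):
    """Keep sightings that match every non-empty criterion, ignoring case."""
    # Drop filters the user left blank so they do not exclude any rows.
    active = {key: value.lower() for key, value in criteria.items() if value}
    return [
        s for s in sightings
        if all(str(s.get(key, "")).lower() == value for key, value in active.items())
    ]


# Hypothetical sightings table.
sightings = [
    {"city": "benton", "state": "ar", "country": "us", "shape": "circle"},
    {"city": "el cajon", "state": "ca", "country": "us", "shape": "triangle"},
    {"city": "fresno", "state": "ca", "country": "us", "shape": "circle"},
]

california_circles = filter_sightings(sightings, state="CA", shape="circle", city="")
```

Here the blank `city` filter is skipped, so only the state and shape filters apply and a single sighting remains.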
Config files for my GitHub profile.
API and Weather Data Visualization: data visualization based on weather data, vacation itinerary, and vacation search.