Elliott Einstein's Projects

big-data-challenge--amazon-shoppers-product-reviews

In this assignment I will put my ETL skills to the test. Many of Amazon's shoppers depend on product reviews to make a purchase. Amazon makes these datasets publicly available. However, they are quite large and can exceed the capacity of local machines to handle. One dataset alone contains over 1.5 million rows; with over 40 datasets, this can be quite taxing on the average local computer. My first goal for this project will be to perform the ETL process completely in the cloud and upload a DataFrame to an RDS instance. The second goal will be to use PySpark or SQL to perform a statistical analysis of selected data.

dataframes_operations

I will import a CSV file into a DataFrame, use the melt function to reshape the DataFrame in order to analyze and visualize the data, and then I'll export the DataFrame as a CSV file.
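A minimal sketch of the reshape step described above, assuming pandas and a hypothetical wide table with one column per year. The column names and values are illustrative only and are not taken from the project's dataset.

```python
import io

import pandas as pd

# Hypothetical wide-format CSV: one row per city, one column per year.
csv_text = "city,2019,2020\nAustin,10,12\nBoston,7,9\n"
wide = pd.read_csv(io.StringIO(csv_text))

# melt turns the year columns into (year, value) pairs, one row per pair.
tidy = wide.melt(id_vars="city", var_name="year", value_name="value")

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_out = tidy.to_csv(index=False)
```

The long format has one observation per row, which is usually easier to group and plot than the wide original.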

deep-learning-charity-funding-predictor

The non-profit foundation Alphabet Soup wants to create an algorithm to predict whether or not applicants for funding will be successful. With my knowledge of machine learning and neural networks, I'll use the features in the provided dataset to create a binary classifier that is capable of predicting whether applicants will be successful if funded by Alphabet Soup.

etl-with-pandas-project

This project uses Pandas to create multiple DataFrames from CSV files containing Disneyland Reviews and Chocolate Reviews. I cleaned those DataFrames, then loaded them into PostgreSQL to create a relational database that joins everything together.

exploring-pandas

An analysis of a Yelp review dataset of coffee shops in Austin, Texas. I practice importing a dataset and retrieving a subset of values based on a set of provided conditionals.

heroes-of-pymoli-video-game-analysis-

Heroes Of Pymoli: Like many others in its genre, the game is free-to-play, but players are encouraged to purchase optional items that enhance their playing experience. As a first task, the company would like you to generate a report that breaks down the game's purchasing data into meaningful insights.

indicators-of-heart-disease-analysis

This project statistically analyzes risk factors for heart disease, using A/B testing and descriptive and inferential statistics to inform health care plans and strategies. The goal is to better understand the risk factors associated with heart disease and to identify which factors contribute most and least heavily to its development.

machine-learning-cryptocurrency-clusters

I am on the Advisory Services Team of a financial consultancy. One of my clients, a prominent investment bank, is interested in offering a new cryptocurrency investment portfolio for its customers. The company, however, is lost in the vast universe of cryptocurrencies. They’ve asked me to create a report that includes which cryptocurrencies are on the trading market and to determine whether they can be grouped to create a classification system for this new investment.

maplotlib-pymaceuticals-inc.

Provided a summary statistics table covering measures of central tendency and spread, including variance, standard deviation, and SEM, of the tumor volume for each drug regimen. Generated plots that show the total number of mice for each treatment regimen throughout the course of the study and the distribution of female and male mice. Calculated the final tumor volume of each mouse across four of the most promising treatment regimens: Capomulin, Ramicane, Infubinol, and Ceftamin. Generated a box and whisker plot of the final tumor volume for all four treatment regimens and highlighted any potential outliers. Selected a mouse that was treated with Capomulin and generated a line plot of tumor volume vs. time point for that mouse. Generated a scatter plot of mouse weight versus average tumor volume for the Capomulin treatment regimen. Calculated the correlation coefficient and linear regression model between mouse weight and average tumor volume for the Capomulin treatment. Put together all of the tables and figures needed for the technical report and provided a top-level summary of the study results.
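The per-regimen summary table could be sketched as follows. This is an illustration under stated assumptions, not the project's code: it uses only the Python standard library for brevity, and the tumor volumes are made-up values rather than study data.

```python
import math
import statistics

# Hypothetical tumor volumes (mm3) keyed by drug regimen; not the study's data.
volumes = {
    "Capomulin": [40.0, 42.0, 44.0],
    "Ramicane": [38.0, 41.0, 44.0],
}


def summarize(values):
    """Return mean, median, sample variance, sample std dev, and SEM."""
    std = statistics.stdev(values)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "variance": statistics.variance(values),
        "std": std,
        "sem": std / math.sqrt(len(values)),  # standard error of the mean
    }


summary = {regimen: summarize(vals) for regimen, vals in volumes.items()}
```

SEM is the sample standard deviation divided by the square root of the sample size, so it shrinks as more mice are measured per regimen.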

netflix-data-science-midterm-project

Project name: Analysis of Video Games Sales. Project description: This project statistically analyzes platform, genre, game rating, user score, and regional user preferences across 11563 video games released from 1984 to 2016 to support effective marketing strategies. We use descriptive statistics to understand user trends, which is necessary to target our audiences and appeal to their preferences.

nyc_bike_counts_retrospective_analysis

I perform a retrospective analysis on the linear regression analysis that I previously performed on the NYC Bike Counts dataset. Specifically, I analyze my linear regression analysis to identify anything that I could have done differently.

nyc_bike_linear_regression-

I used the New York Bike Counts dataset to formulate a hypothesis about the number of bikes crossing the Brooklyn Bridge. This dataset contains the number of bikes that crossed each bridge during each day. I first used this dataset to formulate a hypothesis and then used linear regression to test if my hypothesis was correct.
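A minimal sketch of fitting a line by ordinary least squares, the kind of model used to test such a hypothesis. The predictor (daily high temperature) and all numbers below are assumptions for illustration; the description does not say which variable the hypothesis involved.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: daily high temperature (F) and Brooklyn Bridge bike counts.
temps = [50.0, 60.0, 70.0, 80.0]
counts = [1500.0, 2000.0, 2500.0, 3000.0]
slope, intercept = fit_line(temps, counts)
```

A positive slope would be consistent with a hypothesis that warmer days see more bikes, though a real analysis would also check the fit quality and residuals.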

programming-with-python-

Call to action: use Python to import a time series CSV dataset that contains S&P 500 stock data from 2016 to 2020.

pymaceutical_anova_tukey

For this project, I first applied an analysis of variance (ANOVA) model to the Pymaceutical dataset and then did a post-hoc analysis of the results by using Tukey's Honest Significant Difference (HSD) test to determine which drug treatments in the dataset significantly reduce tumor volume and metastasis. I then wrote a summary of my findings.
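To make the ANOVA step concrete, here is a small sketch that computes the one-way F statistic by hand. The groups are made-up numbers, not Pymaceutical data, and the sketch stops at the F statistic: the p-value and the Tukey HSD comparison need the F and studentized range distributions, for which a statistics library such as SciPy (`scipy.stats.f_oneway`, and `scipy.stats.tukey_hsd` in recent versions) would typically be used.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical tumor volume changes for three treatments.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)
```

A large F statistic says the group means differ by more than within-group noise would explain; the post-hoc test then identifies which pairs of treatments differ.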

pymaceuticals

In this study, 249 mice identified with SCC tumor growth were treated through a variety of drug regimens. Over the course of 45 days, tumor development was observed and measured. The purpose of this study was to compare the performance of Pymaceuticals' drug of interest, Capomulin, with the other treatment regimens. You have been tasked by the senior scientist team to generate an initial drug regimen comparison and a summary of your findings.

pymaceuticals-continued-making-matplotlib-magic

It has been a few days since you sent your boxplot to the senior scientist at Pymaceuticals, and today they finally got back to you with feedback. They said your initial ... For this, I will leverage the same drug regimen data from last class and use subplots to create an advanced visualization that is packed with insightful information!

python-api-weather-project

A weather analysis that randomly selects more than 500 cities across the globe and pulls data from the OpenWeatherMap API for each city. The analysis of the weather and of the perfect vacation spot is viewable in my Jupyter Notebook.

python-challenge

In this challenge, I am tasked with creating a Python script for analyzing the financial records of my company. I am given a set of financial data called budget_data.csv. The dataset is composed of two columns: Date and Profit/Losses. (Thankfully, my company has rather lax standards for accounting, so the records are simple.) I am also tasked with helping a small, rural town modernize its vote counting process.
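For the financial records task, a script along these lines could read the two columns and summarize them. This is a sketch under assumptions: the metrics chosen (month count, net total, average month-to-month change) and the sample rows are hypothetical, and an in-memory string stands in for budget_data.csv.

```python
import csv
import io

# Hypothetical stand-in for budget_data.csv with the two described columns.
SAMPLE = """Date,Profit/Losses
Jan-2010,100
Feb-2010,150
Mar-2010,90
"""


def summarize(csv_file):
    """Summarize a Date / Profit/Losses CSV read from a file-like object."""
    values = [int(row["Profit/Losses"]) for row in csv.DictReader(csv_file)]
    # Month-to-month changes between consecutive rows.
    changes = [b - a for a, b in zip(values, values[1:])]
    return {
        "months": len(values),
        "net_total": sum(values),
        "average_change": sum(changes) / len(changes) if changes else 0.0,
    }


summary = summarize(io.StringIO(SAMPLE))
```

To run it on a real file, the same function could be called with `open("budget_data.csv", newline="")` in place of the `io.StringIO` object.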
