sohat7
Name: Soha Tariq
Type: User
Company: Columbia Engineering - Columbia University
Bio: Data Analyst | SQL, Excel, Python, Tableau, R | Mathematics & Statistics, and Economics | Columbia Engineering | University of Virginia
Analysis to determine whether there is a positivity bias in product reviews written by Amazon Vine members. The ETL process and analysis use PySpark, Python (Pandas), SQL, Amazon Web Services (AWS), and pgAdmin.
Creates a dashboard to analyze the NYC Citi Bike sharing data, in order to convince investors to start a similar bike sharing program in another city. Uses Tableau and Python (Pandas library).
Runs queries on the 59 million records in the BigQuery public dataset New York Citibike and builds data visualizations on Google Cloud Platform (GCP), using Cloud SQL (MySQL), Vertex AI, Cloud Shell, and Cloud Storage buckets.
Creates an interactive dashboard which allows the individual to visualize their bellybutton bacterial biodiversity data. Uses JavaScript (Plotly.js and D3.js libraries) and HTML.
Supervised machine learning models built and evaluated to predict credit loan risk. Resampling and ensemble techniques applied to the logistic regression classifier models using Scikit-learn, Imbalanced-learn, Pandas, and NumPy libraries in Python.
Unsupervised machine learning algorithms (PCA dimensionality reduction and K-means clustering algorithm) report on tradable cryptocurrencies and create a classification system for them, using Scikit-learn, Pandas, Plotly, and hvPlot in Python.
Conducts an election audit of a local congressional election, and computes the county and candidate with the highest number of votes. Uses Python.
Computer vision analysis (image classification) on GCP which builds and evaluates machine learning models - Deep Neural Network (DNN), Convolutional Neural Network (CNN), and Deep Convolutional Neural Network (DCNN) - using 60,000 labeled images of handwritten 0-9 digits in the MNIST dataset. Makes predictions with 99.3% accuracy.
Analyzes a dataset consisting of 4,000 crowdfunding projects to discover hidden trends (campaign performance based on launch dates and funding goals). Uses Excel.
Creates an interactive world map for earthquakes, with multiple layers added for different features and view modes, using JavaScript (D3 library), Leaflet, and Mapbox.
Uses R to run multiple linear regression and t-tests and to generate summary statistics, in order to help an automotive company identify the production troubles that are hindering the manufacture of its prototype car.
Creates a web app that scrapes and displays the most recently published data on Mars and the Mars mission. Uses Python (html5lib and lxml libraries), MongoDB, Flask-PyMongo, Splinter, BeautifulSoup, and WebDriver Manager.
Creates an automated ETL (Extract, Transform, Load) pipeline that extracts (from three data files), transforms, and loads data into a movies database. Uses Python (Pandas), PostgreSQL, and SQL.
Deep-learning neural network (binary classifier) to determine which organizations are worth donating to and which ones are high-risk. Uses Python (TensorFlow, Pandas, and Scikit-Learn libraries).
Data visualization of NYC restaurant data, and data analysis to gauge whether restaurants located in high-income areas receive higher health inspection grades. Uses Python (Pandas, Scikit-learn, Imbalanced-learn), PostgreSQL, SQLAlchemy, Tableau, JavaScript (Plotly.js library), HTML, CSS, and Bootstrap.
Determines how many employees at the company will soon be retiring, and how many of those are eligible to mentor the new hires. Uses SQL, PostgreSQL, and pgAdmin.
Check out my projects, all in one place, here at: https://sohat7.github.io/Portfolio/
This project uses BigQuery to explore the Google Analytics dataset and builds a machine learning model to predict whether a website visitor will make a purchase.
Analysis and visualization of ride-sharing data to determine how total weekly fares differ by city type. Uses Python (Pandas and Matplotlib libraries).
Produces district-level and school-level summaries of the math, reading, and overall passing percentages in the schools, and repeats the analysis after the ninth-grade math and reading scores have been replaced. Uses Python.
Config files for my GitHub profile.
Analyzes the performance of 12 different stocks through the years 2017 and 2018 in order to enable the client to make informed decisions when it comes to investing in stocks. Uses VBA (Visual Basic for Applications).
Analyzes climate data to determine whether opening a surf shop in the location would be a viable investment. Uses Python (Pandas), SQLAlchemy, SQLite, and Flask.
Creates a dynamic table, where the user can filter UFO sightings on multiple criteria simultaneously. Uses JavaScript, HTML, CSS, and Bootstrap.
Creates a travel itinerary map based on the customer's weather preferences. Uses Python (Pandas, Matplotlib, and SciPy libraries) and APIs.