allan34kirwa
Name: Allan Kirwa
Type: User
Bio: 🚀 Data Science Enthusiast | Code Cruncher | 📊 Let's turn data into decisions! [Python, SQL, NoSQL (MongoDB), Power BI, Excel]
Location: Nairobi, Kenya
This repo showcases a data preprocessing and feature selection pipeline for crop yield prediction. It includes dummy variable encoding, variance thresholding, and comparison of linear regression models using all features versus selected features.
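As a rough illustration of the variance thresholding step mentioned above, the sketch below drops columns whose variance does not exceed a cutoff. It uses only the standard library; the repo itself may well rely on a library implementation instead. The function name `select_by_variance`, the column names, and the values are all hypothetical and are not taken from the repo's data.

```python
from statistics import pvariance


def select_by_variance(columns, threshold=0.0):
    """Return the names of columns whose population variance exceeds threshold."""
    return [
        name
        for name, values in columns.items()
        if pvariance(values) > threshold
    ]


# Hypothetical dummy-encoded features (illustrative only, not the repo's data).
features = {
    "rainfall_mm": [510.0, 640.0, 580.0, 720.0],
    "soil_type_clay": [1, 0, 1, 0],
    "region_north": [1, 1, 1, 1],  # constant column, carries no information
}

selected = select_by_variance(features, threshold=0.1)
print(selected)
```

In practice, features on very different scales are usually normalized before their variances are compared against a single threshold.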
This repo describes an analysis of the mtcars dataset using Python libraries such as pandas, numpy, matplotlib, seaborn, and statsmodels. It includes data preprocessing, visualization for linearity and multicollinearity, and fitting a linear regression model for predicting mpg.
This repo includes a regression analysis of the mtcars dataset using statsmodels.OLS in Python. It covers model fitting, residual analysis for independence, homoscedasticity, normality, and outlier detection using Cook's distance.
This repo contains Python code for data cleaning and visualization of football player statistics. Using pandas, matplotlib, and seaborn, it cleans the data, handles anomalies, and explores player attributes like age and overall rating. Visualizations include histograms, joint plots, box plots, violin plots, and facet grids.
This repository contains a comprehensive project focused on air quality prediction in Dar es Salaam. It includes data extraction from MongoDB, exploratory data analysis, time series data cleaning, autoregression model development, and hyperparameter tuning.
This repo showcases MongoDB data retrieval and analysis for air quality monitoring in Nairobi. Utilizing pymongo, it fetches PM2.5 readings from specific sites and aggregates data. Comprehensive examples cover querying, counting, and retrieving distinct values, facilitating robust data exploration.
This repo implements ARMA models for forecasting PM2.5 levels in Nairobi. It includes data preprocessing, model training with grid search, forecasting using Walk Forward Validation, and visualization.
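The walk-forward validation mentioned above can be sketched in a few lines: forecast one step ahead, record the prediction, then add the true observation to the history before forecasting the next step. To stay dependency-free, this sketch swaps the ARMA model for a simple moving-average stand-in; `walk_forward_forecast`, `mean_of_last_three`, and the readings are hypothetical and not taken from the repo.

```python
def walk_forward_forecast(train, test, fit_predict):
    """Forecast each test point one step ahead, growing the history as we go."""
    history = list(train)
    predictions = []
    for actual in test:
        predictions.append(fit_predict(history))
        history.append(actual)  # the model only ever sees past observations
    return predictions


def mean_of_last_three(history):
    """Stand-in model (not ARMA): average of the three most recent readings."""
    window = history[-3:]
    return sum(window) / len(window)


# Hypothetical PM2.5 readings (illustrative only).
train = [10.0, 12.0, 14.0]
test = [16.0, 18.0]

predictions = walk_forward_forecast(train, test, mean_of_last_three)
mae = sum(abs(p - a) for p, a in zip(predictions, test)) / len(test)
print(predictions, mae)
```

Any model that can be refit on the growing history, such as an ARMA fit, could be passed in place of `mean_of_last_three`.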
This repo implements AR models for air quality forecasting in Nairobi, Kenya. It includes data wrangling, model training, forecasting, and evaluation. Key features comprise time series analysis, model validation, and visualization using various Python libraries.
This repo analyzes Brazilian real estate data, visualizing property prices and sizes across regions. It includes data cleaning, exploration, and insights through descriptive statistics and visualizations.
This repo contains a Python project that analyzes real estate data from Buenos Aires. It cleans, explores, and models the data to predict apartment prices based on location coordinates (latitude and longitude). The project showcases data preprocessing, exploratory data analysis (EDA), and machine learning techniques such as linear regression.
This repo contains a predictive model for apartment prices in Buenos Aires, focusing on properties under $400,000 USD. It includes data wrangling, exploratory data analysis, model building with linear regression, and evaluation. The model predicts prices based on apartment size, offering insights into real estate trends in the city.
This repo presents a housing price analysis of Buenos Aires neighborhoods. It imports CSV files, extracts neighborhood data, and uses one-hot encoding. Model building tackles overfitting with regularization and addresses the curse of dimensionality. Visualizations, such as horizontal bar charts, offer insights into how housing prices vary across neighborhoods.
This repo implements customer segmentation using KMeans clustering on financial data. It explores feature variance, builds & evaluates a 4-cluster model to identify customer groups.
This repo contains a data analysis project using K-Means clustering on consumer finance data. It explores household debt and home values, identifying clusters and providing insights for targeted financial products. Findings are visualized and discussed for practical application.
This repo analyzes household financial behaviors using the Survey of Consumer Finances dataset. Exploring demographics, income, debt, and assets, it highlights differences between credit-fearful and non-fearful groups, offering insights valuable for financial institutions.
This repository contains functions to manage farm vehicles. Users can create new tractor instances and add them to a list of vehicles. Functions are provided to assist in handling farm vehicle data efficiently.
This repo contains a multiple linear regression model built to understand factors influencing biodiversity index across countries. It includes data exploration, correlation analysis, model building, and diagnostic tests such as homoscedasticity and Cook's distance for outlier detection.
This repo hosts scripts for time series analysis and forecasting of air quality data, specifically focusing on PM2.5 readings in Nairobi. It includes data wrangling, model training (linear regression), evaluation, and visualization techniques using Python libraries like pandas and scikit-learn.
This repo contains a data pipeline for agricultural data validation. Using Python, SQLAlchemy, and Pandas, it streamlines data ingestion, cleaning, and integration with weather station data, and it checks the results through hypothesis testing and statistical analysis.
This repo implements a machine learning model to predict real estate prices in Mexico City. It preprocesses data, incorporates one-hot encoding, imputation, and Ridge regression, achieving accurate price approximations.
This repo provides a comprehensive model for predicting house prices in Buenos Aires based on key factors such as size, location, and neighborhood. It includes data preprocessing steps, model building using Ridge regression, and an interactive dashboard for exploring how different parameters influence predicted prices.
This repo hosts Python code for agricultural data analysis. It includes tasks such as identifying unique crop types, calculating yield and rainfall statistics, and conducting hypothesis testing. Explore insightful analyses and visualizations tailored for agricultural insights.
This repo showcases model serialization in Python using Pickle for saving and restoring machine learning models. It includes examples of training a linear regression model on diabetes and crop yield datasets, as well as making predictions with the saved model.
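A minimal sketch of the save-and-restore round trip described above, assuming a deliberately tiny model: a one-feature least-squares fit stored as a plain dict rather than the scikit-learn estimator the repo presumably uses. The helper names `fit_line` and `predict`, the file name, and the data are hypothetical.

```python
import os
import pickle
import tempfile


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns a dict of coefficients."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, x):
    """Apply the fitted line to a single input value."""
    return model["intercept"] + model["slope"] * x


# Hypothetical training data lying exactly on y = 2x + 1.
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Save the fitted model to disk, then restore it from the file.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)
with open(path, "rb") as f:
    restored = pickle.load(f)

print(predict(restored, 5.0))
```

Only unpickle files from sources you trust, since `pickle.load` can execute arbitrary code embedded in a malicious file.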
This repo showcases interactive visualizations and analyses of football player data using Python. Explore top nationalities and player attributes' impact.
This repo showcases a data analysis project using Python, focusing on modeling the South African Rand (ZAR) to US Dollar (USD) exchange rate using linear regression. It includes data visualization, linear regression implementation, and error analysis for model evaluation.
This repo contains a Python project that preprocesses the Crop_yield dataset using dummy variable encoding and variance thresholding. It then trains two linear regression models, one using all features and another using only selected features.
This repo showcases a machine learning project on predicting loan amounts using the personal_loans dataset. It covers data preprocessing, dummy variable encoding, correlation analysis, and building an OLS regression model to predict loan sizes based on customer attributes.