
Allan Kirwa's Projects

-exercise-variables-and-variable-selection

This repo showcases a data preprocessing and feature selection pipeline for crop yield prediction. It includes dummy variable encoding, variance thresholding, and comparison of linear regression models using all features versus selected features.
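The variance-thresholding step can be sketched in plain Python. This is a minimal illustration, not the repo's code: the column names, the values, and the 0.01 threshold are hypothetical, and the helper `variance_threshold` is an assumed name.

```python
from statistics import pvariance


def variance_threshold(columns, threshold):
    """Keep only the columns whose population variance exceeds the threshold."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }


# Hypothetical toy features, not taken from the repo's dataset.
columns = {
    "rainfall": [100.0, 120.0, 90.0, 110.0],   # clearly varies
    "constant_flag": [1.0, 1.0, 1.0, 1.0],     # zero variance
    "soil_ph": [6.5, 6.6, 6.5, 6.4],           # varies very little
}

selected = variance_threshold(columns, threshold=0.01)
print(list(selected))
```

Features that barely change across rows carry little information for a linear model, which is why a low-variance filter is a cheap first pass before comparing models on all features versus the selected ones.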

-multiple-linear-regression-advanced-regression-analysis

This repo describes an analysis of the mtcars dataset using Python libraries such as pandas, numpy, matplotlib, seaborn, and statsmodels. It includes data preprocessing, visualization for linearity and multicollinearity, and fitting a linear regression model for predicting mpg.

advanced-multiple-regression-analysis-2

This repo includes a regression analysis of the mtcars dataset using statsmodels.OLS in Python. It covers model fitting, residual analysis for independence, homoscedasticity, normality, and outlier detection using Cook's distance.
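Cook's distance for the one-predictor case can be sketched in plain Python. This is an illustration under stated assumptions, not the repo's statsmodels code: the data are made up, the function name `cooks_distance` is hypothetical, and the 4/n cutoff is only a common rule of thumb.

```python
def cooks_distance(x, y):
    """Cook's distance for each point of a simple (one-predictor) OLS fit."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage of each observation in simple linear regression.
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    return [
        (e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
        for e, h in zip(residuals, leverage)
    ]


# Hypothetical data: the last observation is deliberately far off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 20.0]

distances = cooks_distance(x, y)
flagged = [i for i, d in enumerate(distances) if d > 4 / len(x)]
print(flagged)
```

A point gets a large Cook's distance when it combines a large residual with high leverage, so the measure flags observations that would noticeably move the fitted line if removed.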

advanced-visualization-python

This repo contains Python code for data cleaning and visualization of football player statistics. Using pandas, matplotlib, and seaborn, it cleans the data, handles anomalies, and explores player attributes like age and overall rating. Visualizations include histograms, joint plots, box plots, violin plots, and facet grids.

air-quality-in-dar-es-salaam

This repository contains a comprehensive project focused on air quality prediction in Dar es Salaam. It includes data extraction from MongoDB, exploratory data analysis, time series data cleaning, autoregression model development, and hyperparameter tuning.

air-quality-nairobi-part-wrangling-data-mongodb-

This repo showcases MongoDB data retrieval and analysis for air quality monitoring in Nairobi. Utilizing pymongo, it fetches PM2.5 readings from specific sites and aggregates data. Comprehensive examples cover querying, counting, and retrieving distinct values, facilitating robust data exploration.

arma-models

This repo implements ARMA models for forecasting PM2.5 levels in Nairobi. It includes data preprocessing, model training with grid search, forecasting using walk-forward validation, and visualization.

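The walk-forward validation named in the arma-models description can be sketched as a loop that predicts one step, then reveals the true value before predicting the next. This is a minimal sketch: the stand-in forecaster `mean_of_last_three` is a hypothetical substitute for an ARMA model, and the readings are made-up numbers.

```python
def walk_forward(train, test, forecast_one):
    """Forecast each test point using only the data observed before it."""
    history = list(train)
    predictions = []
    for actual in test:
        predictions.append(forecast_one(history))
        history.append(actual)  # reveal the true value before the next step
    return predictions


def mean_of_last_three(history):
    """Hypothetical stand-in model: average of the three latest readings."""
    return sum(history[-3:]) / 3


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up PM2.5-like readings, not taken from the repo.
train = [10.0, 12.0, 11.0, 13.0]
test = [12.0, 14.0]

predictions = walk_forward(train, test, mean_of_last_three)
mae = mean_absolute_error(test, predictions)
print(predictions, mae)
```

In a real ARMA workflow the model would typically be refit (or updated) on `history` at each step, which is what makes walk-forward validation more expensive than a single train/test split.
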
autoregressive-models-

This repo implements AR models for air quality forecasting in Nairobi, Kenya. It includes data wrangling, model training, forecasting, and evaluation. Key features comprise time series analysis, model validation, and visualization using various Python libraries.
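An autoregressive model of order 1 can be sketched as a least-squares regression of each value on the previous one. This is an illustrative sketch only: the repo's lag order and libraries are not stated here, the function names `fit_ar1` and `forecast` are hypothetical, and the series is synthetic.

```python
def fit_ar1(series):
    """Fit y_t = c + phi * y_{t-1} by ordinary least squares."""
    x = series[:-1]  # lagged values
    y = series[1:]   # current values
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = y_mean - phi * x_mean
    return c, phi


def forecast(series, c, phi, steps):
    """Iterate the fitted recurrence forward from the last observed value."""
    last = series[-1]
    out = []
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


# Synthetic series generated exactly by y_t = 2 + 0.5 * y_{t-1}.
series = [10.0, 7.0, 5.5, 4.75, 4.375]

c, phi = fit_ar1(series)
next_values = forecast(series, c, phi, steps=2)
print(c, phi, next_values)
```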

brazil-housing-project-4

This repo analyzes Brazilian real estate data and visualizes property prices and sizes across regions. It includes data cleaning, exploration, and insights from descriptive statistics and visualizations.

buenos-aires-houses-prices-with-location

This repo contains a Python project that analyzes real estate data from Buenos Aires. It cleans, explores, and models the data to predict apartment prices based on location coordinates (latitude and longitude). The project showcases data preprocessing, exploratory data analysis (EDA), and machine learning techniques such as linear regression.

buenos-aires-housing-predicting-price-size-

This repo contains a predictive model for apartment prices in Buenos Aires, focusing on properties under $400,000 USD. It includes data wrangling, exploratory data analysis, model building with linear regression, and evaluation. The model predicts prices based on apartment size, offering insights into real estate trends in the city.

buenos-aires-predicting-price-with-neighborhood

This repo presents a housing price analysis of Buenos Aires neighborhoods. It imports CSV files, extracts neighborhood data, and uses one-hot encoding. Model building tackles overfitting with regularization and addresses the curse of dimensionality. Visualizations, such as horizontal bar charts, offer insights into how housing prices vary across neighborhoods.
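One-hot encoding of a neighborhood column can be sketched in plain Python. This is a minimal sketch with hypothetical sample values and an assumed `neighborhood_` column prefix; it is not the repo's encoder.

```python
def one_hot_encode(values, prefix):
    """Turn a categorical column into one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [
        {f"{prefix}_{category}": int(value == category) for category in categories}
        for value in values
    ]
    return categories, rows


# Hypothetical sample values for illustration.
neighborhoods = ["Palermo", "Recoleta", "Palermo", "Belgrano"]

categories, rows = one_hot_encode(neighborhoods, prefix="neighborhood")
print(categories)
print(rows[0])
```

Each distinct neighborhood becomes its own column, so a city with many neighborhoods produces a wide, sparse feature matrix. That growth in columns is the dimensionality problem the description refers to, and it is why a regularized model is a natural companion to this encoding.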

clustering-with-multiple-features

This repo implements customer segmentation using KMeans clustering on financial data. It explores feature variance, builds & evaluates a 4-cluster model to identify customer groups.
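The k-means procedure behind this kind of segmentation alternates between assigning points to the nearest centroid and recomputing centroids. The sketch below is a plain-Python illustration with made-up 2-D points, explicit starting centroids, and two clusters for brevity (the repo's model uses four); the function name `kmeans` is hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: assign to nearest centroid, then update centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return labels, centroids


# Made-up points forming two obvious groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(labels, centroids)
```

With real financial features the columns usually need scaling first, since k-means relies on Euclidean distance and a feature measured in large units would otherwise dominate the clustering.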

clustering-with-two-features

This repo contains a data analysis project using K-Means clustering on consumer finance data. It explores household debt and home values, identifying clusters and providing insights for targeted financial products. Findings are visualized and discussed for practical application.

customer-segmentation-usa-eda-

This repo analyzes household financial behaviors using the Survey of Consumer Finances dataset. Exploring demographics, income, debt, and assets, it highlights differences between credit-fearful and non-fearful groups, offering insights valuable for financial institutions.

data-structures-code-challenge

This repository contains functions to manage farm vehicles. Users can create new tractor instances and add them to a list of vehicles. Functions are provided to assist in handling farm vehicle data efficiently.

exercise-multiple-linear-regression

This repo contains a multiple linear regression model built to understand factors influencing biodiversity index across countries. It includes data exploration, correlation analysis, model building, and diagnostic tests such as homoscedasticity and Cook's distance for outlier detection.

linear-regression-with-time-series-data

This repo hosts scripts for time series analysis and forecasting of air quality data, specifically focusing on PM2.5 readings in Nairobi. It includes data wrangling, model training (linear regression), evaluation, and visualization techniques using Python libraries like pandas and scikit-learn.

maji-ndogo-project-validating-data

This repo contains a data pipeline for agricultural data validation. Using Python, SQLAlchemy, and pandas, it streamlines data ingestion, cleaning, and integration with weather station data, and it checks the results through hypothesis testing and statistical analysis.

predicting-apartment-prices-in-mexico-city-mx

This repo implements a machine learning model to predict real estate prices in Mexico City. It preprocesses data, incorporates one-hot encoding, imputation, and Ridge regression, achieving accurate price approximations.
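Mean imputation followed by Ridge regression can be sketched for a single feature in plain Python. This is only an illustration of the two ideas: the numbers are made up, the helper names are hypothetical, and the closed form below (intercept left unpenalized) is an assumption about how the penalty is applied.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in values]


def fit_ridge_1d(x, y, alpha):
    """One-feature ridge fit: slope = Sxy / (Sxx + alpha), intercept unpenalized."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / (sxx + alpha)
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Made-up apartment sizes (one missing) and prices.
sizes = impute_mean([50.0, None, 70.0, 90.0])
prices = [100.0, 150.0, 140.0, 180.0]

ols_intercept, ols_slope = fit_ridge_1d(sizes, prices, alpha=0.0)
ridge_intercept, ridge_slope = fit_ridge_1d(sizes, prices, alpha=800.0)
print(ols_slope, ridge_slope)
```

Setting `alpha` to zero recovers ordinary least squares, while a larger `alpha` shrinks the slope toward zero. That shrinkage is what keeps coefficients stable when one-hot encoding adds many sparse columns.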

predicting-price-with-size-location-and-neighbor

This repo provides a comprehensive model for predicting house prices in Buenos Aires based on key factors such as size, location, and neighborhood. It includes data preprocessing steps, model building using Ridge regression, and an interactive dashboard for exploring how different parameters influence predicted prices.

python-exams-practical

This repo hosts Python code for agricultural data analysis. It includes tasks such as identifying unique crop types, calculating yield and rainfall statistics, and conducting hypothesis testing, along with visualizations of the results.

saving-and-restoring-models-in-python

This repo showcases model serialization in Python using Pickle for saving and restoring machine learning models. It includes examples of training a linear regression model on diabetes and crop yield datasets, as well as making predictions with the saved model.
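Saving and restoring with Pickle can be sketched using only the standard library. To stay self-contained, the "model" here is a hypothetical dictionary of linear-regression parameters rather than a fitted estimator object; the file name `model.pkl` and the helper `predict` are likewise assumptions.

```python
import os
import pickle
import tempfile

# Hypothetical fitted parameters standing in for a trained model object.
model = {"intercept": 2.5, "slope": 2.0}


def predict(params, x):
    """Apply the stored linear model to a single input."""
    return params["intercept"] + params["slope"] * x


# Save the model to disk in binary mode.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Restore it later, possibly in another session, and predict with it.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(predict(restored, 10.0))
```

The same `pickle.dump` and `pickle.load` calls apply to a trained estimator object, provided the library that defines its class is importable when the file is loaded.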

simple-linear-regression-1

This repo showcases a data analysis project using Python, focusing on modeling the South African Rand (ZAR) to US Dollar (USD) exchange rate using linear regression. It includes data visualization, linear regression implementation, and error analysis for model evaluation.

variables-and-variable-selection

This repo contains a Python project that preprocesses the Crop_yield dataset using dummy variable encoding and variance thresholding. It then trains two linear regression models, one using all features and another using only selected features.

variables-and-variable-selection-part-1

This repo showcases a machine learning project on predicting loan amounts using the personal_loans dataset. It covers data preprocessing, dummy variable encoding, correlation analysis, and building an OLS regression model to predict loan sizes based on customer attributes.
