
drroad's Projects

data-explorer

'Easy' web-enabled Exploratory Data Analysis with Shiny :mag_right:

data-quality-explorer

Web-based tool to monitor information about sensor quality and framework reliability. In the CityPulse Framework it provides a visualisation of the Monitoring Component.

data-science

EDA and Machine Learning Models in R (Regression, Classification, SVM, Decision Tree, Random Forest, Time-Series Analysis, Recommender System, XGBoost)

data-science-live-book

An open source book to learn data science, data analysis and machine learning, suitable for all ages!

data-science-machine-learning-ai-big-data-resources

A curated set of resources for data science, machine learning, artificial intelligence (AI), big data, and more. Includes links, notebooks, articles, books, cheat sheets, whitepapers, lists, and technical papers

data-summariser

A Shiny app for dragging and dropping CSV files to get fast summaries of the data set.

data4people-women-s-health-risk-assessment

Summary

This article presents the solution, ranked among the top 5%, for the Women’s Health Risk Assessment data science competition on Microsoft’s Cortana Intelligence platform. This page provides the published Azure ML Studio experiment, a description of the data science process used, and a link to the R code (on GitHub).

Competition

Here is the description from the Microsoft Cortana Competition: “To help achieve the goal of improving women's reproductive health outcomes in underdeveloped regions, this competition calls for optimized machine learning solutions so that a patient can be accurately categorized into different health risk segments and subgroups. Based on the categories that a patient falls in, healthcare providers can offer an appropriate education and training program to patients. Such customized programs have a better chance to help reduce the reproductive health risk of patients. This dataset used in this competition was collected via survey in 2015 as part of a Bill & Melinda Gates Foundation funded project exploring the wants, needs, and behaviors of women and girls with regards to their sexual and reproductive health in nine geographies. The objective of this machine learning competition is to build machine learning models to assign a young woman subject (15-30 years old) in one of the 9 underdeveloped regions into a risk segment, and a subgroup within the segment.” https://gallery.cortanaintelligence.com/Competition/Womens-Health-Risk-Assessment-1

Dataset

The dataset contains 9000 observations. The original training dataset is in CSV format and can be found in the competition’s description. To submit a solution, two options are possible: build it in Azure ML Studio, or build it locally in R and then submit it through Azure ML Studio. An Azure ML solution and an R script were given as examples. Both sample solutions use a Generalized Linear Model, which is automatically downloaded.

You can find a detailed description of the dataset, the R sample code, and a tutorial using Azure ML and R on the competition page.

Solution

I started by following the R tutorial for this competition, then submitted the exact same R solution. The sample model has 77% accuracy.

Pre-processing & Cleaning

The first thing I did was replace the initial multinomial model (nnet package) with a random forest model (randomForest package). All missing values were replaced by 0.

Feature selection

Features were selected using the varImpPlot function from the randomForest package.

Parameter tuning

For educational purposes, I chose to use the Tune Model Hyperparameters module in Azure ML Studio. I could also have used the R package caret.

Evaluation

The final model has an accuracy of 86.36% (18th place out of almost 500 participants). You can download the R code here.
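
As a rough illustration of two steps mentioned in this write-up (replacing missing values with 0 and reporting accuracy), here is a minimal sketch. It is written in Python with only the standard library, not the author's R code, and it does not reproduce the random forest model or the Azure ML modules. The column names (`age`, `visits`, `label`), the sample rows, and the helper names are hypothetical and are not taken from the competition dataset.

```python
import csv
import io


def fill_missing_with_zero(rows, numeric_columns):
    """Return copies of the rows where empty or 'NA' numeric fields become 0.0."""
    cleaned = []
    for row in rows:
        new_row = dict(row)
        for col in numeric_columns:
            value = (row.get(col) or "").strip()
            new_row[col] = 0.0 if value in ("", "NA") else float(value)
        cleaned.append(new_row)
    return cleaned


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical sample data standing in for the competition CSV.
SAMPLE_CSV = """age,visits,label
21,3,A
,2,B
19,NA,A
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
cleaned = fill_missing_with_zero(rows, ["age", "visits"])
```

Filling with 0 is the simplest imputation choice and matches what the write-up describes; whether 0 is a sensible stand-in depends on what each survey field encodes.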

datacollector

StreamSets Data Collector - Continuous big data and cloud platform ingest infrastructure

dataease

An open-source data visualization and analysis tool that anyone can use.

datapackager

An R package to enable reproducible data processing, packaging and sharing.

dataprep_app

A Shiny app for data preparation, meant primarily for use with ONDRI data as part of the ONDRI NIBS standards and outliers pipeline.

datarobot

:exclamation: This is a read-only mirror of the CRAN R package repository. datarobot — 'DataRobot' Predictive Modeling API

datascienceapp

Deploying pre-trained NLP and computer vision models in a single application
