machine-learning-resource-box's Introduction

Machine Learning Resource Box

Here's the repository you always wanted! We've compiled a list of resources that we think will help you most in learning Machine Learning.

What is Machine Learning?

It's the cutting edge of computer programming. You write programs that allow software applications to become more accurate at predicting outcomes without being explicitly programmed. Fun, isn't it?
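To make "without being explicitly programmed" concrete, here is a minimal sketch in plain Python. The example numbers and the names `fit_line`, `hours` and `scores` are made up for illustration; the program is only given examples and works out the relationship itself with a least-squares line fit.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to the examples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up examples; the program is never told the rule behind them.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(hours, scores)
prediction = slope * 10.0 + intercept  # predict for an unseen input
print(slope, intercept, prediction)
```

Nothing in the code states the rule behind the examples; it is recovered from the data, which is the core idea the courses below build on.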

Get an understanding here:

  • Andrew Ng's Machine Learning Course Link
    OR
  • Google's Machine Learning Crash Course Link

Software Requirements:

  • Anaconda Prompt Link
  • After installing Anaconda, open the prompt and enter the following commands:
    • pip install pandas
    • pip install numpy
    • pip install scipy
    • pip install seaborn
    • pip install matplotlib
    • pip install scikit-learn
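
Once the installs finish, a quick check like the one below can confirm that everything is importable. This is only a sketch: the helper name `find_missing` is made up, and the list assumes the import names pandas, numpy, scipy, seaborn, matplotlib and sklearn (the import name of scikit-learn).

```python
import importlib.util

# Import names for the packages listed above. The import name can differ
# from the pip name: scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["pandas", "numpy", "scipy", "seaborn", "matplotlib", "sklearn"]


def find_missing(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If a module is reported missing, rerun the matching pip command from the same prompt.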

What can you do after learning ML?

Once you have the basics down, ramp up your skills by applying ML techniques to big datasets in real-world applications. Other paths take you further into data science and into innovative ML approaches like Deep Learning, Reinforcement Learning and Generative Neural Networks.

Any of these:

  • Deep Learning Specialization Link
  • Introduction to Reinforcement Learning (Google DeepMind) Link
  • Stanford Lectures Link

Which books can help you move forward?

Most of the courses use research software such as R or Octave. These books can help you do the same in Python, which is a more popular and commonly used language for machine learning.

Any of these:

  • Python Machine Learning by Sebastian Raschka - This is one of the most frequently referenced books for Machine Learning concepts and implementation.

Which Datasets will help you test your Machine Learning Skills?

Kaggle is a great place for hunting datasets, but you obviously cannot start working with just any dataset right off the bat. Here are some particular datasets that can help you test your Machine Learning skills if you are just starting out.

Any of these:

  • Titanic Dataset (Classification Problem) Link
  • Iris Dataset (Classification Problem) Link
  • MNIST Dataset (Image Classification Problem) Link
  • House Prices (Regression Problem) Link

How to approach any Machine Learning problem (for beginners trying out their first normal dataset)?

There will surely be a time when you feel lost looking at your first dataset. After all, such a large pile of numbers would bewilder anyone. So here are the steps to follow when you are faced with a heap of numbers and categories, so that you don't feel totally clueless. Consider the steps listed below your harpoon for taming any given dataset:

  1. Data Exploration - Play around with the dataset, make some graphs and find relations between different columns. Overall, try to draw some conclusions from it. You also need to check whether it has missing values and make some statistical observations, such as the scale of each column. This process is called Data Exploration.

  2. Data Preparation - Now that you've had fun playing with the dataset, it's time to get down to business. The missing values need to be filled in, as most algorithms are very bad at dealing with missing data, just like any person faced with incomplete information. A common way to fill them is with the mean of the column, which often works well. However, a single mean over the whole column is a blunt tool. Use what you learned while exploring the data to form relevant groups, and fill each missing value with the mean of its group. Along with this, most algorithms cannot work with words, so convert them by mapping each category to a number. For example, if there is a gender column whose values are either male or female, map male to 1 and female to 0. Seems simple, doesn't it? This process is known as Data Preparation.

  3. Data Cleaning - Now that you have explored and prepared the dataset, it's time to give it the final touches so that the machine learning algorithm works well on it. The final step is to check whether the values of each column are on the same scale. For example, if one column ranges from 1 to 2 while another ranges from 100 to 150, they are not on the same scale, and many algorithms will tend to be biased towards the column with the larger range. To prevent this, bring the columns to a common range, either by scaling them yourself or by using this link. Along with this, extreme values in a column, known in statistical terms as outliers (link), are removed. This marks the final preparation of the dataset. There are a few more steps that could be considered, but as this is your first dataset, you are good to go. This whole process is called Data Cleaning. It's the most important part of the workflow.

  4. Algorithm Selection - Now that you have prepared the dataset, it's time for the best part. Feed the dataset into the algorithm you have decided to use and boom! You have made your first Machine Learning model. Repeat this step again and again. Play with it until you are satisfied with the results and think they cannot be improved further.
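
Steps 1 to 3 above can be sketched in plain Python as follows. The rows, the column names (`gender`, `age`, `fare`) and the helper functions are all made up for illustration. In practice you would likely reach for pandas (for example `fillna` and `groupby`) and scikit-learn (for example `MinMaxScaler`) instead of writing these by hand.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; None marks a missing value.
rows = [
    {"gender": "male", "age": 30.0, "fare": 100.0},
    {"gender": "male", "age": None, "fare": 150.0},
    {"gender": "male", "age": 40.0, "fare": 125.0},
    {"gender": "female", "age": 22.0, "fare": 100.0},
    {"gender": "female", "age": None, "fare": 150.0},
    {"gender": "female", "age": 28.0, "fare": 125.0},
]


def count_missing(rows, column):
    """Exploration: count how many rows lack a value in the column."""
    return sum(1 for row in rows if row[column] is None)


def fill_with_group_mean(rows, group_col, value_col):
    """Preparation: replace missing values with the mean of the row's group.

    Assumes every group has at least one known value.
    """
    groups = defaultdict(list)
    for row in rows:
        if row[value_col] is not None:
            groups[row[group_col]].append(row[value_col])
    group_means = {key: mean(values) for key, values in groups.items()}
    return [
        {**row, value_col: group_means[row[group_col]]}
        if row[value_col] is None else dict(row)
        for row in rows
    ]


def encode(rows, column, mapping):
    """Preparation: map category labels to numbers."""
    return [{**row, column: mapping[row[column]]} for row in rows]


def min_max_scale(rows, column):
    """Cleaning: rescale a numeric column to the range 0 to 1."""
    values = [row[column] for row in rows]
    low, high = min(values), max(values)
    span = high - low
    return [
        {**row, column: (row[column] - low) / span if span else 0.0}
        for row in rows
    ]


missing_before = count_missing(rows, "age")
prepared = fill_with_group_mean(rows, "gender", "age")
prepared = encode(prepared, "gender", {"male": 1, "female": 0})
for column in ("age", "fare"):
    prepared = min_max_scale(prepared, column)

print(missing_before, count_missing(prepared, "age"))
print(prepared[1])
```

Each helper returns new rows instead of changing the input, so you can compare the dataset before and after every step while you experiment.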

These are the main steps to follow when approaching a normal dataset. Other kinds of datasets, such as images and text, follow the same process but need a separate way of handling each step, which will be added later.
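
As a sketch of step 4, the snippet below feeds a tiny dataset into a k-nearest-neighbours classifier and repeats the experiment with several settings. The data points, the labels and the choice of k-nearest neighbours are assumptions made for illustration, not a recommendation for any particular dataset. It uses `math.dist`, which needs Python 3.8 or newer.

```python
from collections import Counter
from math import dist

# Made-up, already scaled feature vectors with class labels (illustration only).
# The last training point is deliberately noisy: it sits in the "a" cluster
# but carries the label "b".
train = [
    ((0.10, 0.20), "a"), ((0.20, 0.10), "a"), ((0.15, 0.25), "a"),
    ((0.80, 0.90), "b"), ((0.90, 0.80), "b"), ((0.85, 0.75), "b"),
    ((0.25, 0.20), "b"),
]
validation = [((0.25, 0.22), "a"), ((0.80, 0.80), "b"), ((0.12, 0.15), "a")]


def predict(train, point, k):
    """Predict the majority label among the k nearest training points."""
    nearest = sorted(train, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, held_out, k):
    """Fraction of held-out points whose predicted label matches the true one."""
    correct = sum(predict(train, features, k) == label for features, label in held_out)
    return correct / len(held_out)


# Iterate: try several settings and keep the one that scores best.
scores = {k: accuracy(train, validation, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

With k = 1 the noisy point misleads the model on one validation example, while larger values of k outvote it. That is the kind of difference you look for when repeating step 4.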

More information coming soon.

machine-learning-resource-box's People

Contributors

kolharsam, rahulmoorthy19, atharva-bhagwat

machine-learning-resource-box's Issues

Addition of Simple Code Samples

Shouldn't we be adding simple code samples on this repo?

Just to give everyone a brief overview of what happens when a certain piece of code is written.

To do or not to do?

I was thinking we could open a GitBook project for this repository. That way we could support its purpose more completely by providing reading material of our own, along with problem sets or exercises that might challenge and engage the people who use our repo.

Comment with what you think.

  • I've already created the organization on GitBook; you can join using this link (do not click it yet)

  • The book is live here
