Machine Learning Resource Box

Here's the repository you always wanted! We've compiled a list of resources that we think will help you most in learning Machine Learning.

What is Machine Learning?

It's the cutting edge of computer programming. You write programs that let software applications become more accurate at predicting outcomes without being explicitly programmed. Fun, isn't it?
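
To make "without being explicitly programmed" concrete, here is a minimal sketch using made-up data. The rule y = 2x + 1 is never written into the program; the hypothetical helper `fit_line` estimates it from example pairs with a least-squares fit, which is only one of many possible learning methods.

```python
# Toy illustration: the program is never told the rule y = 2x + 1.
# It estimates the slope and intercept from example (x, y) pairs.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training examples (they happen to follow y = 2x + 1).
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

slope, intercept = fit_line(xs, ys)

# Use the learned parameters to predict an input that was never seen.
prediction = slope * 10 + intercept
print(slope, intercept, prediction)
```

The same idea scales up: more data, more parameters, and more flexible models, but the program still learns the rule from examples.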

Get an understanding here:

Andrew Ng's Machine Learning Course Link
OR
Google's Machine Learning Crash Course Link

Software Requirements:

Anaconda Prompt Link
After installing it, open the prompt and enter the following commands:
- pip install pandas
- pip install numpy
- pip install scipy
- pip install seaborn
- pip install matplotlib
- pip install scikit-learn
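
Once the commands have finished, a quick check like the sketch below can confirm that Python is able to find each package. The helper `missing_packages` is a hypothetical name introduced here; it relies only on `importlib.util.find_spec` from the standard library. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names for the packages this guide relies on.
# scikit-learn is imported as "sklearn".
REQUIRED = ["pandas", "numpy", "scipy", "seaborn", "matplotlib", "sklearn"]


def missing_packages(names):
    """Return the names from `names` that Python cannot find for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed yet:", ", ".join(missing))
else:
    print("All packages are available.")
```

If a name is reported as missing, re-run the matching `pip install` command in the same prompt and check again.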

What can you do after learning ML?

Once you have the basics down, ramp up your skills by applying ML techniques to large datasets in real-world applications. Other paths take you further into data science and into newer ML approaches such as deep learning, reinforcement learning, and generative neural networks.

Any of these:

Deep Learning Specialization Link
Introduction to Reinforcement Learning (Google DeepMind) Link
Stanford Lectures Link

Which books can help you move forward?

Most of the courses use research software such as R or Octave. These books can help you do the same in Python, which is a more popular and commonly used language for machine learning.

Any of these:

Python Machine Learning by Sebastian Raschka: one of the most frequently referenced books for Machine Learning concepts and implementation.

Which Datasets will help you test your Machine Learning Skills?

Kaggle is a great place to hunt for datasets, but you obviously cannot start working with just any dataset right off the bat. Here are some datasets that can help you test your Machine Learning skills if you are just starting out.

Any of these:

Titanic Dataset (Classification Problem) Link
Iris Dataset (Classification Problem) Link
MNIST Dataset (Image Classification Problem) Link
House Prices (Regression Problem) Link

How to approach any Machine Learning problem (for beginners trying out their first normal dataset)?

There will surely be a time when you feel lost looking at your first dataset. After all, that many numbers would bewilder anyone. So here are the steps to follow when you face a heap of numbers and categories, so that you don't feel totally clueless. Consider the steps below your harpoon for taming any given dataset:

Data Exploration - Play around with the dataset, make some graphs, and find relations between different columns. Overall, try to draw some conclusions from it. You also need to check whether it has missing values and make some statistical observations, such as the range and scale of each column. This process is called Data Exploration.
Data Preparation - Now that you've had fun playing with the dataset, it's time to get down to business. The missing values need to be filled, because most algorithms are very bad at dealing with missing data, just like any person faced with incomplete information. A common way to fill them is with the mean of the column, which often works well. However, filling every gap with a single overall mean is usually not enough: use what you learned while exploring the data to form relevant groupings, and fill each gap with the mean of its group. In addition, an algorithm cannot deal with words, so convert words into numbers by mapping each one to a number. For example, if there is a gender column whose values are male or female, map male to 1 and female to 0. Seems simple, doesn't it? This process is known as Data Preparation.
Data Cleaning - Now that you have explored and prepared the dataset, it's time to give it the final touches so that the machine learning algorithm works well on it. The final step is to check whether the values of each column are on the same scale. For example, if one column is in the range 1-2 while another is in the range 100-150, they are not on the same scale, and many algorithms will tend to be biased towards the column with the larger range. Since this should not happen, bring the columns to the same range, either by scaling them yourself or by using this link. Along with this, extreme values in a column, known as outliers (link) in statistical terms, are also removed. This marks the final preparation of the dataset. There are a few more steps that could be considered, but as this is your first dataset, you are good to go. This whole process is called Data Cleaning. It's the most important part of the workflow.
Algorithm Selection - Now that you have prepared the dataset, it's time for the best part. Feed the dataset into the algorithm you have decided to use and boom! You have made your first Machine Learning model. Repeat this step again and again: play with it until you are satisfied with the results and think they cannot be improved further.
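
The preparation and cleaning steps above can be sketched in plain Python, so that each step stays visible. Everything here is an assumption made for illustration: the rows are made up, the column names (`gender`, `age`, `fare`) and the helper functions are hypothetical, the outlier rule (1.5 times the interquartile range) is one common choice among several, and min-max scaling stands in for "bringing columns to the same range". pandas and scikit-learn offer shorter ways to do the same work.

```python
from statistics import mean, quantiles

# Made-up rows; None marks a missing value.
rows = [
    {"gender": "male", "age": 20.0, "fare": 10.0},
    {"gender": "male", "age": None, "fare": 12.0},
    {"gender": "male", "age": 40.0, "fare": 14.0},
    {"gender": "female", "age": 30.0, "fare": 16.0},
    {"gender": "female", "age": None, "fare": 18.0},
    {"gender": "female", "age": 50.0, "fare": 500.0},  # extreme fare
]


def fill_with_group_mean(rows, column, group_by):
    """Fill None in `column` with the mean of that column within each group.

    Assumes every group has at least one known value.
    """
    known = {}
    for row in rows:
        if row[column] is not None:
            known.setdefault(row[group_by], []).append(row[column])
    group_means = {key: mean(values) for key, values in known.items()}
    return [
        {**row, column: group_means[row[group_by]]} if row[column] is None else dict(row)
        for row in rows
    ]


def encode(rows, column, mapping):
    """Replace each word in `column` with its number from `mapping`."""
    return [{**row, column: mapping[row[column]]} for row in rows]


def drop_outliers(rows, column, k=1.5):
    """Keep rows whose value lies within k * IQR of the quartiles."""
    q1, _, q3 = quantiles([row[column] for row in rows], n=4)
    spread = k * (q3 - q1)
    return [row for row in rows if q1 - spread <= row[column] <= q3 + spread]


def min_max_scale(rows, column):
    """Rescale `column` to the range 0..1 (assumes max > min)."""
    values = [row[column] for row in rows]
    low, high = min(values), max(values)
    return [{**row, column: (row[column] - low) / (high - low)} for row in rows]


# Exploration: count the gaps before deciding how to fill them.
missing_ages = sum(row["age"] is None for row in rows)

# Preparation: fill gaps per group, then turn words into numbers.
prepared = fill_with_group_mean(rows, "age", "gender")
prepared = encode(prepared, "gender", {"male": 1, "female": 0})

# Cleaning: outliers are dropped first so they do not distort the scaling.
prepared = drop_outliers(prepared, "fare")
for column in ("age", "fare"):
    prepared = min_max_scale(prepared, column)

for row in prepared:
    print(row)
```

Dropping outliers before scaling is a design choice in this sketch: with min-max scaling, a single extreme value would otherwise squeeze all the other values into a narrow band near zero.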

These are the main steps to follow when approaching a normal dataset. Other kinds of datasets, such as images and text, follow the same process but need a separate way of handling each step, which will be covered in a later update.
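
For the Algorithm Selection step, the sketch below shows the "feed it in, check the result, try again" loop on made-up, already-scaled data. The algorithm used here, k-nearest neighbours written in plain Python, is chosen only because it fits in a few lines; it is an illustrative assumption, not a recommendation from this guide. The helpers `knn_predict` and `accuracy` are hypothetical names.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, point, k):
    """Label `point` by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], point))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, test_X, test_y, k):
    """Fraction of held-out points whose predicted label is correct."""
    hits = sum(
        knn_predict(train_X, train_y, x, k) == y for x, y in zip(test_X, test_y)
    )
    return hits / len(test_y)


# Made-up, already-scaled features with two classes (0 and 1).
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.8, 0.9), (0.9, 0.8), (0.85, 0.85)]
train_y = [0, 0, 0, 1, 1, 1]

# Held-out points the model never saw during training.
test_X = [(0.0, 0.1), (0.95, 0.9)]
test_y = [0, 1]

# Iterate: try a few settings and compare the results.
scores = {k: accuracy(train_X, train_y, test_X, test_y, k) for k in (1, 3)}
print(scores)
```

Keeping some data aside for testing matters: a score measured on the training rows themselves says little about how the model handles new data.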

More information coming soon.

kolharsam / machine-learning-resource-box