Here's the repository you always wanted! We've compiled a list of resources that we think will help you most in learning Machine Learning.
It's at the cutting edge of computer programming. You write programs that let software applications become more accurate at predicting outcomes without being explicitly programmed. Fun, isn't it?
- Anaconda Prompt Link
- After installing it, open Anaconda Prompt and enter the following commands:
pip install pandas
pip install numpy
pip install scipy
pip install seaborn
pip install matplotlib
pip install scikit-learn
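A quick way to confirm that the installs worked is to check that each package can be found by Python. The snippet below is only a sketch: the list of import names is an assumption about what the commands above are meant to install (note that scikit-learn is imported as `sklearn`), and `check_packages` is a helper name made up for this example.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages installed above.
# scikit-learn is imported as "sklearn".
PACKAGES = ["pandas", "numpy", "scipy", "seaborn", "matplotlib", "sklearn"]


def check_packages(names):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_packages(PACKAGES).items():
        print(f"{name:<12} {'ok' if found else 'MISSING'}")
```

Any package reported as `MISSING` can be installed again with the matching `pip install` command.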
Once you have the basics down, ramp up your skills by applying ML techniques to big datasets in real-world applications. Other paths take you further into data science and into innovative ML approaches like deep learning, reinforcement learning and generative neural networks.
- Deep Learning Specialization Link
- Introduction to Reinforcement Learning (Google DeepMind) Link
- Stanford Lectures Link
Most of the courses use research software such as R or Octave. These books can help you do the same in Python, which is a more popular and more commonly used language for machine learning.
- Python Machine Learning by Sebastian Raschka - This is one of the most frequently referenced books for Machine Learning concepts and implementation.
Kaggle is a great place to hunt for datasets, but you obviously cannot start working with just any dataset right off the bat. Here are some particular datasets that can help you test your Machine Learning skills if you are just starting out.
- Titanic Dataset (Classification Problem) Link
- Iris Dataset (Classification Problem) Link
- MNIST Dataset (Image Classification Problem) Link
- House Prices (Regression Problem) Link
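Of the datasets listed, Iris is the easiest to get hold of, because scikit-learn ships a copy of it and no download is needed. The following is a minimal sketch of loading it, assuming scikit-learn and pandas are installed; the variable names are just for illustration. The other datasets usually have to be downloaded first.

```python
from sklearn.datasets import load_iris

# as_frame=True returns pandas objects instead of NumPy arrays.
iris = load_iris(as_frame=True)
X = iris.data      # DataFrame with four numeric measurement columns
y = iris.target    # Series of integer class labels (0, 1, 2)

print(X.shape)                      # (150, 4)
print(iris.target_names.tolist())   # ['setosa', 'versicolor', 'virginica']
print(X.head())                     # first five rows of measurements
```

Each row is one flower, and the task is to predict its species from the four measurements.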
There will surely be a moment when you feel lost looking at your first dataset. After all, that many numbers would bewilder anyone. So here are the steps to follow when you face a heap of numbers and categories, so that you don't feel totally clueless about it. Consider the steps listed below as your harpoon for taming any given dataset:
- Data Exploration - Play around with the dataset, make some graphs and look for relationships between different columns. Try to draw a few conclusions from it. Also check whether it has missing values, and look at basic statistics such as the range (scale) of each column. This process is called Data Exploration.
- Data Preparation - Now that you've had fun playing with the dataset, it's time to get down to business. Missing values need to be filled, because most algorithms are very bad at dealing with missing data, just like any person faced with incomplete information. A common way to fill them is with the mean of the column, which often works well for numeric columns. A single overall mean can be too crude, though. Use what you learned while exploring the data to form relevant groups, and fill each gap with the mean of its own group. Along with this, an algorithm cannot deal with words, so convert them by mapping each word to a number. For example, if there is a gender column whose values are either male or female, map male to 1 and female to 0. Seems simple, doesn't it? This process is known as Data Preparation.
- Data Cleaning - Now that you have explored and prepared the dataset, it's time to give it the final touches so that the machine learning algorithm works well on it. The final step is to check whether the values of each column are on the same scale. For example, if one column is in the range 1-2 while another is in the range 100-150, they are not on the same scale. Many algorithms then tend to be biased towards the column with the larger range. As this should not happen, you should bring the columns to the same range, either by scaling them yourself or by using this link. Along with this, extreme values in a column, known in statistics as outliers (link), are usually removed. This marks the final preparation of the dataset. There are a few more steps that could be considered, but as this is your first dataset you are good to go. Taken together, this process is called Data Cleaning. It's the most important part of the whole process.
- Algorithm Selection - Now that you have prepared the dataset, it's time for the best part. Feed the dataset into the algorithm you have decided to use and boom! You have made your first Machine Learning model. Repeat this step again and again. Play with it until you are satisfied with the results and think they cannot be improved further.
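The data-handling steps above can be sketched in a few lines of pandas. This is only an illustration under stated assumptions: the toy table is invented for the example, and the 1.5 * IQR rule for outliers and min-max scaling are common conventions rather than the only options.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, invented purely for illustration.
df = pd.DataFrame({
    "gender": ["male", "female", "female", "male", "female", "male"],
    "age":    [22.0, np.nan, 30.0, np.nan, 34.0, 40.0],
    "fare":   [7.0, 70.0, 75.0, 8.0, 80.0, 900.0],  # 900 is an obvious outlier
})

# 1. Data exploration: summary statistics and missing values per column.
print(df.describe())
print(df.isna().sum())

# 2. Data preparation: fill each missing age with the mean age of the same
#    gender group, then map the text column to numbers.
group_mean_age = df.groupby("gender")["age"].transform("mean")
df["age"] = df["age"].fillna(group_mean_age)
df["gender"] = df["gender"].map({"male": 1, "female": 0})

# 3. Data cleaning: drop fare outliers with the 1.5 * IQR rule, then
#    min-max scale every column to the range [0, 1].
q1, q3 = df["fare"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["fare"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

scaled = (df - df.min()) / (df.max() - df.min())
print(scaled)
```

On a real project, the scaling minimum and maximum are normally computed from the training data only and then reused on the test data, so that no information leaks from the test set.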
These are the main steps to follow when approaching a typical dataset. Other kinds of data, such as images and text, follow the same process but need a different way of handling each step; those will be covered in a later update.
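The last step, trying an algorithm and iterating, might look like the sketch below. It assumes scikit-learn is installed and uses its bundled Iris data. The three candidate models and their settings are arbitrary picks for illustration, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate algorithms; the choice and settings are illustrative only.
# Scaling is placed inside the pipeline so it is re-fitted on each fold.
candidates = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "k-nearest neighbours": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in candidates.items():
    # 5-fold cross-validation: train on four fifths, score on the rest.
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name:<22} mean accuracy = {scores[name]:.3f}")
```

Comparing cross-validated scores like this gives a fairer picture than scoring a model on the same rows it was trained on. From here you can adjust settings or add candidates and rerun the loop.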
More information coming soon.