intro-to-sklearn's Introduction

What's in this tutorial

The notebooks are a modular introduction to machine learning in Python using scikit-learn, with examples and tips.

The material is in Jupyter notebook format and was designed to be compatible with Python >= 2.6 or >= 3.3. To use these notebooks interactively (the intended use), you will need a Jupyter/IPython notebook install (see below).
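As a quick sanity check before opening the notebooks, the version range stated above can be tested from Python. This is a minimal sketch, not part of the tutorial itself; the helper name `is_supported` is hypothetical and simply encodes the ">= 2.6 or >= 3.3" rule.

```python
import sys


def is_supported(version_info):
    """Return True if (major, minor) is in the range the notebooks target.

    Hypothetical helper: encodes the stated rule "Python >= 2.6 or >= 3.3".
    """
    version = tuple(version_info[:2])
    # Python 2 line: 2.6 and later; Python 3 line: 3.3 and later.
    return (2, 6) <= version < (3, 0) or version >= (3, 3)


# Report on the interpreter that is running this snippet.
print("Python %d.%d supported: %s"
      % (sys.version_info[0], sys.version_info[1],
         is_supported(sys.version_info)))
```

Running this inside the environment you plan to use for the notebooks shows whether that interpreter falls in the supported range.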

A brief introductory guide to Jupyter notebooks is also included in the Notebook_anatomy notebook. If you are unfamiliar with Jupyter/IPython notebooks, please take some time to look at this file.

Installation Notes

For a quick deployment, simply click the launch binder link at the bottom of this page. However, we recommend a local install for a more customizable and flexible setup.

Setting up a development environment

Note: the requirements.txt file above is a snapshot of the pip-installed packages from a working ML setup. conda should resolve the appropriate dependencies for the scikit-learn version it installs, so its package versions may differ from that snapshot.

It is generally best practice to have a distinct development environment for each Python project. There are multiple options for this, such as virtualenv and Conda. For this project, we will use a Conda environment.

To get started, you can install miniconda3, which provides Python 3 and can also create Python 2 environments.

If you already have Python installed, you can install Conda via pip:

pip install auxlib conda

Initializing a Conda environment

  • To set up a Python 2.7 development environment in addition to your Python 3 conda install for this project (done after installing miniconda3), you can run:

    • conda create --name sklearn python=2
    • On Windows, this installs into C:\Miniconda3\envs\python2\, which I added to the system path.
    • On Linux and OS X, the location depends on where the Python framework is installed. On OS X using Homebrew, this installs into /usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/envs/python2/bin
    • See here for more detailed instructions
  • To activate the development environment, from the bin folder of your conda environment, run

    • Windows: activate sklearn
    • Linux/OSX: source activate sklearn
  • Ensure ipython/ipython2 is installed in the Python environment, then register its kernel:

    • Windows: c:\Miniconda3\envs\python2\Scripts\ipython2.exe kernel install --name python2 --display-name "Python 2"
    • Linux/OSX: ipython2 kernel install --name python2 --display-name "Python 2" (may need sudo)
  • To exit the development environment at any point, run:

    • Windows: deactivate
    • Linux/OSX: source deactivate
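To confirm from Python whether the `sklearn` environment created above is currently active, one option is to read the `CONDA_DEFAULT_ENV` environment variable, which conda sets when an environment is activated. The sketch below is illustrative only; the helper names `active_conda_env` and `is_env_active` are hypothetical, and the check assumes the environment was activated with the commands shown above.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active conda environment, or None.

    Hypothetical helper: reads CONDA_DEFAULT_ENV, which conda sets on
    activation. Pass a dict to inspect something other than os.environ.
    """
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_env_active(name, environ=None):
    """Return True if the named environment is the active one."""
    return active_conda_env(environ) == name


# Show what is active in the current shell session (None if nothing is).
print("Active conda environment: %s" % active_conda_env())
```

If this prints `None` or a different name, activate the environment again before starting the notebook server.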

Installing jupyter notebook locally

The easiest way to install Jupyter notebook is via conda install:

  • Run conda install jupyter from your terminal. Linux/OSX may require sudo permissions.
  • Navigate to the directory containing this repository, and execute jupyter notebook. This will start a notebook service locally for accessing notebooks in your browser. Drill down on the home page to your notebook of interest.

For a notebook primer, go to Notebook_anatomy.ipynb in this repo. The very short story is: to execute a cell, just hit Shift-Enter. There are many more shortcuts in the primer.

Installing python packages

This tutorial requires the following packages:

You can use your development environment of choice, but if you used conda as described above, simply run:

	$ conda install numpy scipy matplotlib scikit-learn jupyter

We have also provided a requirements.txt file above for use with pip.
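After installing, it can help to confirm that the packages import cleanly in the environment you will run the notebooks from. The following is a minimal sketch rather than an official check: the helper `installed_versions` and the `REQUIRED` list are hypothetical names, the list covers only the numerical packages from the conda command above, and `importlib.import_module` assumes Python 2.7 or later. Note that scikit-learn is imported under the name `sklearn`.

```python
import importlib

# Import names for packages from the conda command above.
# scikit-learn is imported as "sklearn", not "scikit-learn".
REQUIRED = ["numpy", "scipy", "matplotlib", "sklearn"]


def installed_versions(module_names):
    """Map each module name to its version string, or None if missing.

    Hypothetical helper: modules that import but expose no __version__
    attribute are reported as "unknown".
    """
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
            continue
        versions[name] = getattr(module, "__version__", "unknown")
    return versions


# Print a small report for the current environment.
for name, version in sorted(installed_versions(REQUIRED).items()):
    print("%-12s %s" % (name, version if version is not None else "NOT INSTALLED"))
```

Any package reported as not installed can be added with the conda command above or with pip and the requirements.txt file.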

Other install options

There are many different ways to install Python and the package ecosystem for machine learning. Not all of them are covered here, but essentially you have the following choices:

  1. anaconda/miniconda aka conda (shown above)
  2. download python and pip install packages
  3. use a docker image (this is one for jupyter+sklearn+skflow+tensorflow)
  4. Google cloud platform has a jupyter notebook service called Datalab (quickstart here). It has tensorflow pre-installed (needed for next tutorial).
  5. Click the Binder link at the bottom of this page to deploy a notebook setup.

Or a combination of the above.

A quick tip: if you are installing in a non-conda way with pip on Windows, many of the data analysis packages are tricky to install because of their compiled dependencies. Christoph Gohlke created and maintains a nice "unofficial" repository of binaries for packages like numpy and a myriad of others. This site is here.

What's next

The next tutorial in this workshop is on tensorflow, and the installation instructions are in this README.

Binder

Contributors

giricme, karthikrangarajan, michhar
