Git Product home page Git Product logo

amia2019_w07's Introduction

AMIA 2019 Informatics Summit

Workshop W07: Hands-On Full Life Cycle Data Science Workshop

Presenters: Lisiane Pruinelli, Steve Johnson and Tamara Winden

Note: These instructions are also available (with screenshots) in the "Workshop W07 - Instructions to Install Jupyter Notebook.pdf"

Thank you for your interest in the workshop. This is a “hands-on” workshop which means that you will need to bring a laptop so that you can follow along, execute code and do short exercises as we work through aspects of a data science project. You can either 1) run it in the cloud or 2) install the environment on your laptop and execute it locally.

  1. The advantage to running the environment in the cloud is that there is nothing to install! We will be using the Google Colab environment. One disadvantage is that the environment will shut down after 12 hours of use. That is fine for the purpose of this workshop, but probably is not what you’d want to use for a production data science environment.

  2. The advantage to installing it locally is that everything you need for a data science project is available on your computer. The disadvantage is that it is sometimes difficult and time consuming to install all of the elements of the environment. To install it locally, we will use the Anaconda Python distribution which has all of the packages that we will be using. It should install equally well on with Windows or Mac machines.

1. Access Jupyter notebooks in the cloud using Google Colab

This is probably the easiest way to get started. You can access Jupyter notebooks in a shared cloud environment using a Google research project called Colab. This has the advantage of letting you try out a notebook without having to install anything on your local computer. But the disadvantage is that the session will timeout after 12 hours. It is not suitable for doing real data science work, but it is a good option for this workshop.

To access the system, go to the following URL in a browser (Chrome works best): https://colab.research.google.com/ This will launch a new virtual environment where you can play with the workshop notebooks. Make sure you have a Google login and are logged into the Colab service. Check to make sure the environment is working by doing the following:

  1. When you go https://colab.research.google.com, a pop-up screen will ask you which notebook to open. Click the “GITHUB” tab and then put in joh06288 for the Github URL and find the AMIA2019_W07 repository and select the “Check_Environment” notebook:

  2. Navigate to the main menu and select “Runtime -> Run all”. You will be warned that the notebook is not from Google and that your runtimes will be reset. Pick “Run anyway” and “Yes” to run the notebook. If

  3. The final message should say “Congratulations, your environment is setup correctly!”

  4. If you have problems, there will be time at the Workshop to help you troubleshoot. Please come to the workshop meeting room 30 minutes early on the day of the workshop and we can help resolve the problem.

  5. If you are interested, you can take a look at the “Intro_to_Jupyter” notebook which is a quick overview of some of the libraries that we will be using during the workshop.

2. Install locally using the Anaconda distribution

If you don’t want to use the cloud, follow the instructions to install the full Anaconda Distribution (https://www.anaconda.com/download), which includes Python, the Jupyter Notebook system, and other commonly used packages for scientific computing and data science. Make sure you install the latest Python 3 version of the distribution. After a successful install, you can run the following command at the Terminal (Mac/Linux) or Command Prompt (Windows) to start the Jupyter system:

jupyter notebook

You can also launch the Jupyter Notebook system from the Anaconda Navigator.

Check to make sure the environment is installed correctly by doing the following:

  1. In a CMD prompt (Windows) or Terminal prompt (OSX) create a work directory that you will put the Workshop content in.

mkdir workdir

  1. Retrieve the workshop content from the github server using the following command:

git clone https://github.com/joh06288/AMIA2019_W07.git

  1. From the browser window that Jupyter notebook opened for you when it started, navigate to your workdir and go into the AMIA2019_W07 folder.

  2. Select the “Check_Environment” notebook. A new browser window will open up.

  3. Select “Cell -> Run All” from the Jupyter Notebook menu.

  4. The final message should say “Congratulations, your environment is setup correctly!”

  5. If you have problems, there will be time at the Workshop to help you troubleshoot. Please come to the workshop meeting room 30 minutes early on the day of the workshop and we can help resolve the problem.

  6. If you are interested, you can take a look at the “Intro_to_Jupyter” notebook which is a quick overview of some of the libraries that we will be using during the workshop.

amia2019_w07's People

Contributors

joh06288 avatar stevejohnson2001 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.