A multi-container Docker application for running Hadoop commands inside a Jupyter notebook.
It links a jupyter/datascience-notebook container to a cloudera/quickstart container so that you can run Hadoop jobs from a Jupyter notebook. It was originally built for UC Berkeley's MIDS Machine Learning at Scale course (w261).
$ cd /path/to/hadoop-notebook
$ docker-compose build
$ docker-compose up -d
- Get the IP address of the quickstart.cloudera container (look for IPAddress under NetworkSettings in the output).
$ docker inspect hadoopnotebook_quickstart.cloudera_1 # this is the name of the cloudera container
- Add the following line to your /etc/hosts file (substituting the quickstart.cloudera IP address from above).
# /etc/hosts
...
<QUICKSTART.CLOUDERA IP ADDRESS> quickstart.cloudera
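
The two manual steps above can also be scripted. The following is a minimal sketch and not part of the original project. It assumes the container name shown above and uses the `-f` (format) option of `docker inspect` to print only the IP address instead of the full JSON output. The helper names `container_ip` and `hosts_entry` are hypothetical, and the hostname you pass in must match the one used in your /etc/hosts entry.

```shell
#!/bin/sh
# Assumption: container name as shown above; it changes if your
# Compose project directory has a different name.
CONTAINER="hadoopnotebook_quickstart.cloudera_1"

# Print only the container's IP address. If the container is attached
# to more than one network, the addresses are printed back to back,
# so check the output before using it.
container_ip() {
    docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' "$1"
}

# Format a single /etc/hosts line from an IP address and a hostname.
hosts_entry() {
    printf '%s %s\n' "$1" "$2"
}

# Usage (requires the containers to be running; HOST_NAME is the
# hostname you want the IP address mapped to):
#   hosts_entry "$(container_ip "$CONTAINER")" "$HOST_NAME" | sudo tee -a /etc/hosts
```

The container's IP address can change when the containers are recreated, so the /etc/hosts entry may need to be updated after running docker-compose up again.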