These are complete notes on hands-on practice, exercises, and installation of Hadoop-related services, based on the Hands on Hadoop Course. They cover: HDFS, YARN, MapReduce, Pig, Hive, Ambari, Spark, Mesos, Tez, HBase, Storm, Oozie, Flink, Sqoop, Flume, Kafka, MySQL, Cassandra, MongoDB, Drill, Hue, Phoenix, Presto and Zeppelin.
The live notebook can be found at: axsauze.github.io/hadoop-overview
List of contents:
- Section 1
- Section 2
- 2-1 - HDFS: What it is and how it works
- 2-2 - Install MovieLens dataset
- 2-3 - HDFS: Command line interface
- 2-4 - MapReduce Fundamental Concepts
- 2-5 - MapReduce on a Cluster - How MR Scales
- 2-6 - MapReduce: A Real Life Example
- 2-7 - Running MapReduce with MRJob
- 2-8 - Running with MRJob
- 2-9 - Hadoop Your Challenge
- 2-10 - Check your results
- Section 3
- Section 4
- Section 5
- Section 6
- Section 7
- Section 8
- 8-1 - Hadoop under the hood
- 8-2 - Tez Explained
- 8-3 - Configure and run Hive
- 8-4 - Apache Mesos
- 8-5 - Zookeeper Overview
- 8-6 - Simulating Zookeeper Master Failure
- 8-7 - Oozie
- 8-8 - Workflow in Oozie
- 8-9 - Zeppelin Overview
- 8-10 - Playing with Zeppelin
- 8-11 - Zeppelin Advanced
- 8-12 - HUE
- 8-13 - Other Admin Technologies
- Section 9
- Section 10
- Section 11
- Section 12