hadoop-hdfs-mr-multi-node-cluster-aws-ansible's Introduction

Hadoop HDFS & MapReduce Multi Node Cluster Setup on AWS EC2 Instances using Ansible Automation

Problem statement:

  1. Create an Ansible role to launch 9 AWS EC2 instances.
  2. Dynamically fetch the IPs & create the inventory so that the remaining Ansible roles can run on those instances.
  3. Create a role to configure the Hadoop Name Node (Master), Data Node (Worker), Job Tracker Node, Task Tracker Node & Client Node.
  4. Finally, configure the 1st, 2nd & 3rd instances as the Name Node, Job Tracker & Client Node respectively, configure 3 of the remaining instances as Data Nodes & the other 3 as Task Trackers.
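As a rough illustration of step 4, the sketch below splits nine IP addresses into role groups and renders an INI-style Ansible inventory. It is not the repository's code: the group names (namenode, jobtracker, client, datanode, tasktracker), the function names and the 192.0.2.x sample addresses are all assumptions made for this example, and the repository's roles may build the inventory differently.

```python
from typing import Dict, List

# Hypothetical group names and sizes; the repository's roles may use others.
LAYOUT = [
    ("namenode", 1),
    ("jobtracker", 1),
    ("client", 1),
    ("datanode", 3),
    ("tasktracker", 3),
]


def assign_groups(ips: List[str]) -> Dict[str, List[str]]:
    """Split the instance IPs into Hadoop role groups, in launch order."""
    expected = sum(count for _, count in LAYOUT)
    if len(ips) != expected:
        raise ValueError(f"expected {expected} IPs, got {len(ips)}")
    groups: Dict[str, List[str]] = {}
    start = 0
    for name, count in LAYOUT:
        groups[name] = ips[start:start + count]
        start += count
    return groups


def render_inventory(groups: Dict[str, List[str]]) -> str:
    """Render the groups as an INI-style Ansible inventory."""
    lines: List[str] = []
    for name, hosts in groups.items():
        lines.append(f"[{name}]")
        lines.extend(hosts)
        lines.append("")  # blank line between groups
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder addresses from the documentation range, not real instances.
    sample_ips = [f"192.0.2.{i}" for i in range(1, 10)]
    print(render_inventory(assign_groups(sample_ips)))
```

Keeping the layout in one table means the group sizes can be changed in a single place, and the length check fails early if fewer than 9 instances came up.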

Video demonstration: https://bit.ly/3tICiLd

How to run this setup on your own system:

  • Install Ansible 2.10 on your local Linux system.

  • Next, clone this repository & go inside the folder "hadoop-ws". This is our workspace & it contains everything.

  • In this workspace we need to put two files: hadoop_instance.pem & cred.yml.

  • Create the "hadoop_instance.pem" key pair file in your AWS account, then download it into your workspace "hadoop-ws".

  • Next, run chmod 400 hadoop_instance.pem so that other users on your Linux system cannot read your AWS key pair.

  • Next, run ansible-vault create cred.yml. It opens an editor on your Linux system (vi by default), where you put your AWS access key & secret key in YAML format.

The file contents should look like this:

access_key: ABCDEFGHIJK
secret_key: abcdefghijk12345

  • Next, go to the "hadoop-ws/roles/ec2/vars/" folder & edit the "main.yml" file. You only need to set the "subnet_name" variable to a subnet ID from your AWS account.

  • Note: I am using the AWS default VPC, which is why no VPC is mentioned in my "hadoop-ws/roles/ec2/tasks/main.yml" file. If you want to use a VPC you created yourself, you need to add that extra option there.
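Before deploying, it can help to confirm that the workspace is prepared as described above. The following is a minimal sketch, not part of the repository: the preflight function and its messages are hypothetical. It checks that hadoop_instance.pem exists and is not accessible by group or others, and that cred.yml begins with the $ANSIBLE_VAULT; header that ansible-vault writes at the top of encrypted files.

```python
import stat
from pathlib import Path
from typing import List

# ansible-vault encrypted files start with a header such as
# "$ANSIBLE_VAULT;1.1;AES256".
VAULT_HEADER = "$ANSIBLE_VAULT;"


def preflight(workspace: str) -> List[str]:
    """Return a list of problems found in the workspace (empty means OK)."""
    problems: List[str] = []
    ws = Path(workspace)

    key = ws / "hadoop_instance.pem"
    if not key.is_file():
        problems.append("hadoop_instance.pem is missing")
    elif stat.S_IMODE(key.stat().st_mode) & 0o077:
        # Any group/other permission bit means chmod 400 was not applied.
        problems.append("hadoop_instance.pem is too open; run chmod 400")

    cred = ws / "cred.yml"
    if not cred.is_file():
        problems.append("cred.yml is missing")
    else:
        with cred.open("r", errors="replace") as fh:
            first_line = fh.readline()
        if not first_line.startswith(VAULT_HEADER):
            problems.append("cred.yml is not vault-encrypted")

    return problems


if __name__ == "__main__":
    # Assumes the script is run from the directory containing "hadoop-ws".
    for problem in preflight("hadoop-ws"):
        print("WARNING:", problem)
```

The function returns messages instead of raising, so all problems are reported in one pass rather than stopping at the first.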

Finally, it's time to deploy the whole setup. Run ansible-playbook setup.yml --ask-vault-pass, provide the vault password you set for cred.yml & see the magic of Ansible.

Contributors

raktim00
