sebc's People

Contributors

mfernest

sebc's Issues

Kerberize cluster

  • Create the Issue Kerberize cluster
  • Assign the issue to yourself and label it started
  • Install an MIT KDC on the same node as the MySQL server
    • Name your realm after your GitHub handle
    • Use CO.UK as a suffix
    • For example: ERNEST.CO.UK
  • Create Kerberos principals for ernest, siwicki, and cloudera-scm
    • Give cloudera-scm the privileges needed to create principals and keytabs
  • Enable Kerberos for the cluster
  • Run the terasort program as ernest using /user/ernest/tgen512m
    • Copy the command and output to challenges/labs/5_terasort.md
  • Run the Hadoop pi program as the user siwicki
    • Copy the command and output to challenges/labs/5_pi.md
  • Copy only text files in /var/kerberos/krb5kdc/ to your repo as follows:
    • Add the prefix 5_ and the suffix .md
    • Example: 5_kdc.conf.md
  • Push this work to your GitHub repo and label the Issue submitted
  • Assign the issue to both instructors
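
The realm-naming and file-copy rules above can be sketched in Python. This is only an illustration under stated assumptions, not the instructors' method: the helper names `realm_for`, `is_text_file`, and `copy_kdc_text_files` are hypothetical, and treating a file as "text" when it contains no NUL byte and decodes as UTF-8 is an assumption.

```python
import shutil
from pathlib import Path
from typing import List


def realm_for(name: str) -> str:
    """Build a realm name from a handle-derived name plus the CO.UK suffix."""
    return f"{name.upper()}.CO.UK"


def is_text_file(path: Path) -> bool:
    """Assumption: text means no NUL bytes and a clean UTF-8 decode."""
    data = path.read_bytes()
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def copy_kdc_text_files(src_dir: Path, dest_dir: Path) -> List[str]:
    """Copy text files from src_dir to dest_dir as 5_<name>.md.

    Returns the new file names in sorted order.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(src_dir.iterdir()):
        if src.is_file() and is_text_file(src):
            target = dest_dir / f"5_{src.name}.md"
            shutil.copyfile(src, target)
            copied.append(target.name)
    return copied
```

A call such as `copy_kdc_text_files(Path("/var/kerberos/krb5kdc"), Path("challenges/labs"))` would likely need root privileges to read the KDC directory. It is still worth checking the copied files by eye before pushing.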

Welcome to SEBC!

Your repository setup appears complete and correct.

Please add a comment here with your name. Although some GitHub handles are easy to decipher, enough people use avatar names that we want to be sure of the match.

Regards,

Rob & Michael

Install MySQL

  • Create the Issue Install MySQL
  • Assign the Issue to yourself and label it started
  • Install a MySQL 5.5.x server on the node you listed first
    • Use the YUM repository available at dev.mysql.com
    • Copy /etc/yum.repos.d/mysql-community.repo to challenges/labs/1_mysql-community.repo.md
  • On all cluster nodes
    • Install the MySQL client package and JDBC connector jar
  • Start the mysqld service
  • Create the following databases
    • scm
    • rman
    • hive
    • oozie
    • hue
    • sentry
  • Put the following in the file challenges/labs/1_mysql.md
    • The hostname of your MySQL node
    • The command and output for mysql --version
    • The command and output for listing MySQL databases
  • Push this work to your GitHub repo
  • Label the Issue submitted and assign it to both instructors
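
As a rough sketch of the database-creation step, the following Python helper prints one `CREATE DATABASE` statement per database named above. The function name `create_database_sql` is hypothetical, and the `utf8` default character set is an assumption rather than something the lab specifies.

```python
# The six databases the lab asks for.
DATABASES = ["scm", "rman", "hive", "oozie", "hue", "sentry"]


def create_database_sql(names):
    """Return one CREATE DATABASE statement per name.

    Assumption: utf8 as the default character set.
    """
    return [f"CREATE DATABASE {name} DEFAULT CHARACTER SET utf8;" for name in names]


if __name__ == "__main__":
    print("\n".join(create_database_sql(DATABASES)))
```

The printed statements could be pasted into a `mysql` session on the MySQL node, and `SHOW DATABASES;` run afterwards to capture the listing for 1_mysql.md.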

Challenges Setup

  • Create the Issue Challenges Setup
  • Make sure you have both mfernest and rsiwicki as Collaborators
  • Assign the Issue to yourself and label it started
  • In the file challenges/labs/0_setup.md:
    • List the cloud provider you are using (AWS, GCE, Azure, other)
    • List the Linux release you have chosen
    • Show that the disk space on each node is at least 30 GB
    • List the command and output for yum repolist enabled
  • Add the following Linux accounts to all nodes
    • User ernest with a UID of 2000
    • User siwicki with a UID of 3000
    • Create the group usa and add ernest to it
    • Create the group emea and add siwicki to it
  • List the /etc/passwd entries for ernest and siwicki in your setup file
  • List the /etc/group entries for usa and emea in your setup file
  • Push to your GitHub repo
  • Label your Issue submitted
  • Assign the Issue to both instructors
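
One way to double-check the account setup on each node is to parse the passwd and group entries and compare them with the values listed above. The sketch below does that in Python. The helper names are hypothetical, and it assumes ernest and siwicki appear as supplementary members in the fourth field of the group entries (a user whose primary group is usa or emea would not show up there).

```python
EXPECTED_UIDS = {"ernest": 2000, "siwicki": 3000}
EXPECTED_GROUPS = {"usa": "ernest", "emea": "siwicki"}


def passwd_uid(passwd_text, user):
    """Return the UID recorded for user in passwd-format text, or None."""
    for line in passwd_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) >= 3 and fields[0] == user:
            return int(fields[2])
    return None


def group_members(group_text, group):
    """Return the member list for group in group-format text, or None."""
    for line in group_text.splitlines():
        fields = line.strip().split(":")
        if len(fields) >= 4 and fields[0] == group:
            return [m for m in fields[3].split(",") if m]
    return None


def check_accounts(passwd_text, group_text):
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    for user, uid in EXPECTED_UIDS.items():
        found = passwd_uid(passwd_text, user)
        if found != uid:
            problems.append(f"{user}: expected UID {uid}, found {found}")
    for group, member in EXPECTED_GROUPS.items():
        members = group_members(group_text, group)
        if members is None or member not in members:
            problems.append(f"{group}: {member} is not listed as a member")
    return problems
```

On a node, this could be called as `check_accounts(open("/etc/passwd").read(), open("/etc/group").read())`.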

Hue Lab

  • Authenticate using Linux users/groups 0_unix_login.png
    • Issues came up during synchronization; see the comments on the issue for details.
  • Lab: Security or Availability?
    • Integration of Hue and Sentry 1_hue_sentry.png
      • The previous exercise, in which we declared two users with different Hive privileges, was the option chosen to demonstrate the integration of Hue and Sentry
    • Hue Load Balancing 1_hue_lb.md

Configure Sentry

  • Create the Issue Configure Sentry
  • Install and configure Sentry
  • Add ernest as a Sentry administrator
  • Log in to beeline
    • Create an fce1 role that has all rights to the default database
      • Map the usa group to this role
    • Create an fce2 role that has SELECT privileges only to the sample tables in default
      • Map the emea group to this role
  • Log in to beeline with the principal for ernest
    • List the result of SHOW TABLES; in challenges/labs/6_results.md
  • Log in again to beeline as the principal for siwicki
    • List the result of SHOW TABLES; in the same file
  • Push this work to your GitHub repo and label the Issue submitted
  • Assign the issue to both instructors
  • Push all work to your GitHub repo
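
The role and group mappings above could translate into Sentry SQL roughly as follows. This Python sketch only generates the statements to run in beeline. The function name `sentry_statements` is hypothetical, and the sample table names are an assumption about what the Hue sample-data install created, so confirm them with `SHOW TABLES;` first.

```python
# Assumed names of the Hive sample tables installed through Hue.
SAMPLE_TABLES = ["customers", "sample_07", "sample_08", "web_logs"]


def sentry_statements(sample_tables):
    """Return the statements that create fce1/fce2 and map them to usa/emea."""
    stmts = [
        "CREATE ROLE fce1;",
        "GRANT ALL ON DATABASE default TO ROLE fce1;",
        "GRANT ROLE fce1 TO GROUP usa;",
        "CREATE ROLE fce2;",
        "USE default;",
    ]
    # fce2 gets SELECT on each sample table only, not on the whole database.
    stmts += [f"GRANT SELECT ON TABLE {table} TO ROLE fce2;" for table in sample_tables]
    stmts.append("GRANT ROLE fce2 TO GROUP emea;")
    return stmts


if __name__ == "__main__":
    print("\n".join(sentry_statements(SAMPLE_TABLES)))
```

The statements would need to be run from a beeline session opened by a Sentry administrator.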

Test HDFS

  • Create the Issue Test HDFS
  • Assign the issue to yourself and label it started
  • As user ernest, use teragen to generate a 51,200,000-record dataset into six files
    • Set the block size to 32 MB
    • Name the target directory tgen512m
    • Use the time command to capture job duration
  • Put the following in the file challenges/labs/4_teragen.md
    • The full teragen command
    • The output of the time command
    • The command and output of hdfs dfs -ls /user/ernest/tgen512m
    • Show how many blocks are associated with this directory
  • Push this work to your GitHub repo and label the Issue submitted
  • Assign the issue to both instructors
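
The teragen parameters above imply a particular command line and block count, sketched here in Python. Several things are assumptions: the examples jar path (a typical parcel location), the use of `mapreduce.job.maps` to get six output files, and an even split of rows across files with the remainder going to the last one. teragen rows are 100 bytes each. The block count reported by `hdfs fsck /user/ernest/tgen512m -files -blocks` is what counts for the lab.

```python
import math

ROW_BYTES = 100                 # teragen writes fixed 100-byte rows
ROWS = 51_200_000
FILES = 6
BLOCK_BYTES = 32 * 1024 * 1024  # 32 MB block size

# Assumed parcel location of the MapReduce examples jar.
EXAMPLES_JAR = (
    "/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/"
    "hadoop-mapreduce-examples.jar"
)


def teragen_command(rows, files, block_bytes, target):
    """Build a timed teragen command line as a single string."""
    parts = [
        "time", "hadoop", "jar", EXAMPLES_JAR, "teragen",
        f"-Ddfs.blocksize={block_bytes}",
        f"-Dmapreduce.job.maps={files}",
        str(rows), target,
    ]
    return " ".join(parts)


def expected_blocks(rows, files, block_bytes):
    """Estimate the block count, assuming an even row split across files."""
    base, extra = divmod(rows, files)
    sizes = [base * ROW_BYTES] * (files - 1) + [(base + extra) * ROW_BYTES]
    return sum(math.ceil(size / block_bytes) for size in sizes)


if __name__ == "__main__":
    print(teragen_command(ROWS, FILES, BLOCK_BYTES, "tgen512m"))
    print(expected_blocks(ROWS, FILES, BLOCK_BYTES))
```

Under these assumptions each of the six files is about 853 MB and spans 26 blocks, which gives an estimate of 156 data blocks.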

Install CM

  • Create the Issue Install CM
  • Install a supported version of Java 8 on all machines
  • Assign yourself to the Issue and label it started
  • Install Cloudera Manager on a different node from MySQL
  • Configure the CM repo to install the 5.9 release
    • List the command and output of ls /etc/yum.repos.d in challenges/labs/2_cm.md
    • Copy the cloudera-manager.repo file to challenges/labs/2_cloudera-manager.repo.md
  • Configure Cloudera Manager
    • Use the scm_prepare_database.sh script to write your db.properties file
      • List the full command line in 2_cm.md
  • Start the Cloudera Manager server. Then in challenges/labs/2_db.properties.md:
    • Add the first line from your server log
    • Add the log line that contains the phrase "Started Jetty server"
    • Copy your db.properties file to challenges/labs/2_db.properties.md
  • Push to your GitHub repo and label the Issue submitted
  • Assign the issue to both instructors
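
Pulling the two requested log lines out of the server log can be sketched as below. The function name `log_lines_for_report` is hypothetical, and the log path in the usage note is the usual default location, which is an assumption about your install.

```python
def log_lines_for_report(log_text, phrase="Started Jetty server"):
    """Return (first line, first line containing phrase) from log text.

    Either element is None if the log is empty or the phrase is absent.
    """
    lines = log_text.splitlines()
    first = lines[0] if lines else None
    match = next((line for line in lines if phrase in line), None)
    return first, match
```

For example, `log_lines_for_report(open("/var/log/cloudera-scm-server/cloudera-scm-server.log").read())` would return both lines for pasting into challenges/labs/2_db.properties.md.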

Security Lab

  • Implement TLS Level 1 Security
  • Security Labs Preparation
  • Integrating Kerberos with Cloudera Manager
  • Create a file kinit.md that includes the following for the test user:
    • kinit command
    • klist command
  • Sentry Lab
    • Install Sentry as a Service
    • Follow Sentry tutorial sentry-testmd

SEBC Evaluation: Pass

Thank you for your attendance and effort at the SEBC in Frankfurt last month. We are pleased to say you have passed the course based on your lab and challenge work. Please review the Issues comments for more details.

Once our Education department has verified your certification prerequisites are in order, you'll be issued a certificate for this work.

Good luck in your future engagements!

Install CDH

  • Create the Issue Install CDH
  • Assign the issue to yourself and label it started
  • Install the CDH 5.9 release; deploy coreset services only
    • Rename your cluster after your GitHub handle
  • Create user directories in HDFS for ernest and siwicki
  • Add the following to 3_cm.md:
    • Command and output for hdfs dfs -ls /user
    • The output from the CM API call ../api/v14/hosts
  • Log in to Hue and install the Hive sample data
    • Capture the Hue home page to challenges/labs/3_hue_installed.png
  • Push this work to your GitHub repo and label the Issue submitted
  • Assign the issue to both instructors
  • Capture output of java -version and add it to 3_jv.md
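
The CM API call listed above can be made with Python's standard library, as in this sketch. The helper names, the host name in the usage note, and the credentials are hypothetical placeholders. Port 7180 and plain HTTP are assumptions that fit a default Cloudera Manager install without TLS.

```python
import base64
import json
import urllib.request


def hosts_request(cm_host, user, password, port=7180, api_version="v14"):
    """Build an authenticated GET request for the CM hosts endpoint."""
    url = f"http://{cm_host}:{port}/api/{api_version}/hosts"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def fetch_hosts(request):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `fetch_hosts(hosts_request("cm.example.com", "admin", "admin"))` would return the hosts document as a dictionary, which can then be dumped with `json.dumps(..., indent=2)` into 3_cm.md.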
