Self-Paced Big Data Hadoop Administrator Training

Duration: 36 Hours

Overview

Become an expert Hadoop Administrator by getting hands-on experience with Hadoop clusters, including planning and deployment and monitoring the Hadoop Distributed File System. The course also takes a hands-on approach to the Hadoop ecosystem: YARN, MapReduce, HDFS, Cloudera Manager, Hive, HBase, Pig, Flume, and RDBMS integration using Sqoop.

Become a Hadoop Administrator by mastering Hadoop clusters! Cognixia’s Big Data Hadoop Administrator course is specifically designed to provide hands-on experience installing, configuring, and managing the Apache Hadoop platform.

Curriculum

By the end of the module, the student will understand the basics of big data and will have a foundation in Hadoop daemons and Hadoop architecture.
  • Understanding Big Data Basics
  • Big Data Use Cases
  • Introduction to Hadoop
  • Understanding Hadoop Ecosystem
  • Introduction to HDFS
  • Introduction to Namenode
  • Introduction to Datanode
  • Introduction to Secondary Namenode
  • Introduction to MapReduce
  • Introduction to JobTracker
  • Introduction to TaskTracker
  • Summarizing Hadoop Architecture
  • Roles and Responsibilities of a Hadoop Administrator

By the end of the module, the student will be able to create a multi-node Hadoop cluster. To prepare students for this, the module gives a deep understanding of how Linux works, how to set up virtual machines, and how to set up password-less SSH.
  • Linux internals
      • Commands that are required
      • Linux basics
  • Hadoop Cluster Installation Pre-requisites
      • Software Downloads
      • Preparing your VM
      • Enabling VM with VMware
      • Understanding mandatory changes in the operating system
  • Installation and Configuration
      • Understanding Hadoop cluster installation modes
      • Understanding Hadoop Version 1 installation and configuration
      • Password-less SSH setup
  • Hands-On Practice for creating a Hadoop cluster
      • Individual help with practicing Hadoop cluster installation

By the end of the module, the student will understand how to plan a production Hadoop cluster. Students will understand the hardware and software requirements of a Hadoop cluster, performance tuning after cluster creation, and benchmarking.
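
As a rough illustration of the kind of arithmetic involved in planning hardware for a cluster, the sketch below estimates how many DataNodes a given volume of data might need. It is a minimal sketch, not a sizing method taken from the course: the function name, the 25% headroom for intermediate and non-HDFS data, and the 24 TB of usable disk per node are all assumed values for illustration. Only the replication factor of 3 reflects the HDFS default.

```python
import math


def estimate_datanodes(raw_data_tb, replication=3, temp_overhead=0.25,
                       usable_tb_per_node=24.0):
    """Return (required_storage_tb, node_count) for a hypothetical cluster.

    raw_data_tb        -- amount of data to store, before replication
    replication        -- HDFS replication factor (3 is the HDFS default)
    temp_overhead      -- assumed fraction of disk kept free for intermediate
                          job output and non-HDFS use (illustrative value)
    usable_tb_per_node -- assumed usable disk per DataNode (illustrative value)
    """
    if not 0 <= temp_overhead < 1:
        raise ValueError("temp_overhead must be in the range [0, 1)")
    if usable_tb_per_node <= 0:
        raise ValueError("usable_tb_per_node must be positive")

    # Every block is stored `replication` times.
    replicated_tb = raw_data_tb * replication
    # Scale up so that the replicated data fits in the non-reserved share.
    required_tb = replicated_tb / (1 - temp_overhead)
    # A partially filled node is still a whole node.
    node_count = math.ceil(required_tb / usable_tb_per_node)
    return required_tb, node_count


required_tb, nodes = estimate_datanodes(100)
print(f"Required raw storage: {required_tb:.1f} TB across {nodes} DataNodes")
```

With these assumed inputs, 100 TB of data works out to 400 TB of raw disk and 17 DataNodes. A real plan would also account for data growth, compression, memory, CPU, and network capacity.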

By the end of the module, the student will be able to administer a Hadoop cluster. Students will understand how to copy data from one Hadoop cluster to another, how to use different Hadoop schedulers to run jobs, how to perform backup and recovery of metadata, data, configurations, and application data, and how to recover cluster data.
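
Copying data between clusters is commonly done with Hadoop's DistCp tool. The sketch below only assembles a `hadoop distcp` command line so that its shape is visible; it does not run anything. The helper name, the host names, the path, and the NameNode RPC port 8020 are assumptions for illustration and would need to match a real environment.

```python
import shlex


def build_distcp_command(src_namenode, dst_namenode, path,
                         update=True, port=8020):
    """Build an argv list for copying `path` between two clusters.

    Host names and the port are placeholders (assumptions), not values
    from the course material.
    """
    src = f"hdfs://{src_namenode}:{port}{path}"
    dst = f"hdfs://{dst_namenode}:{port}{path}"
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ at the target.
        cmd.append("-update")
    cmd += [src, dst]
    return cmd


cmd = build_distcp_command("nn1.example.com", "nn2.example.com", "/data/logs")
print(shlex.join(cmd))
```

On a machine with a configured Hadoop client, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Keeping the command as a list rather than a single string avoids shell-quoting problems with unusual paths.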

By the end of the module, the student will understand how Hadoop Version 2 and YARN work. The module covers the new features of Hadoop Version 2 and the YARN framework, and gives students the knowledge to deploy a Hadoop 2 cluster in pseudo-distributed and multi-node distributed modes.
  • Hadoop 2.0 new features
  • YARN
      • Understanding Resource Manager
      • Understanding Application Master
      • Understanding Node Manager
      • Understanding Hadoop 2 Job Execution Framework
  • Hadoop 2 Multi-node cluster creation
      • Pre-requisites of Hadoop Installation
      • Software Downloads
      • Preparing your VM
      • Enabling VM with VMware
      • Understanding mandatory changes in the operating system
      • Installation and Configuration
      • Understanding Hadoop version 2 installation and configuration
      • Password-less SSH setup

By the end of the module, the student will know how to achieve high availability, how to enable Namenode Federation, and what the various improvements in Hadoop 2 are.
  • Practice Hadoop 2 Multi-node Cluster Creation
      • Individual help with practicing Hadoop 2 cluster installation
  • Sample YARN Job execution
  • Understanding Issues of Hadoop 1
  • Understanding improvements in Hadoop 2
  • Namenode Federation
      • Enable segregation of HDFS using multiple Namenodes
  • Namenode High Availability
      • Achieving Namenode High Availability using Quorum Journal Manager
      • Achieving Namenode High Availability using Network File System
  • Implementation of Namenode High Availability
      • Individual help with achieving Namenode High Availability
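
To give a feel for what Namenode High Availability with the Quorum Journal Manager involves, the sketch below generates a few of the core `hdfs-site.xml` properties. It is a partial, illustrative sketch and not the course's lab configuration: the nameservice name `mycluster`, the Namenode IDs, and all host names are assumptions, and a working setup needs further settings (for example fencing and, optionally, automatic failover) that are omitted here.

```python
import xml.etree.ElementTree as ET


def ha_properties(nameservice, namenodes, journalnodes):
    """Return a dict of core HA properties for hdfs-site.xml.

    nameservice  -- logical name for the HA pair (placeholder value)
    namenodes    -- dict mapping Namenode ID to its RPC "host:port"
    journalnodes -- list of JournalNode "host:port" strings
    """
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
        # Both Namenodes share edits through the JournalNode quorum.
        "dfs.namenode.shared.edits.dir":
            "qjournal://" + ";".join(journalnodes) + "/" + nameservice,
        # Lets HDFS clients find whichever Namenode is currently active.
        f"dfs.client.failover.proxy.provider.{nameservice}":
            "org.apache.hadoop.hdfs.server.namenode.ha."
            "ConfiguredFailoverProxyProvider",
    }
    for nn_id, address in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = address
    return props


def to_hdfs_site_xml(props):
    """Render the properties in Hadoop's <configuration> XML layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# All host names below are hypothetical.
props = ha_properties(
    "mycluster",
    {"nn1": "master1.example.com:8020", "nn2": "master2.example.com:8020"},
    ["jn1.example.com:8485", "jn2.example.com:8485", "jn3.example.com:8485"],
)
xml_text = to_hdfs_site_xml(props)
print(xml_text)
```

An odd number of JournalNodes (three here) is the usual choice, because an edit must be acknowledged by a majority of them before it is considered written.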

By the end of the module, the student will be able to administer the basics of Hadoop ecosystem components like Hive, HBase, Sqoop, Flume, and Pig.
  • Hadoop Ecosystem Introduction
      • Understanding the integration of the Hadoop ecosystem
  • Hive
      • What is Hive?
      • Architecture of Hive
      • Understanding Hive metastore concepts
  • HBase
      • Understanding HBase Basics
      • Understanding HBase Storage Model
      • Understanding HBase Architecture
      • Cluster Installation and Configuration
  • Pig
      • What is Pig?
      • How does Pig integrate with a Hadoop cluster?
      • Demo of Pig jobs using MapReduce
  • Sqoop
      • What is Sqoop?
      • How to import and export data between Hadoop and an RDBMS using Sqoop
      • Example of Sqoop jobs using MySQL
  • Flume
      • What is Flume?
      • Sample Flume jobs

By the end of the module, the student will be able to build a multi-node Cloudera cluster, achieve high availability, and add a new node to the cluster, all using Cloudera Manager.
  • Understanding the internals of Cloudera Manager
      • Understanding the automation of Hadoop installation using Cloudera Manager
      • Understanding Cloudera Hadoop Distribution and Cloudera Manager
      • Understanding the underlying directory structure of Cloudera Hadoop
      • Cloudera Hadoop Cluster Installation (CDH)

FAQs

Yes, the course completion certificate is provided once you successfully complete the training program. You will be evaluated on parameters such as attendance in sessions, an objective examination, and other factors. Based on your overall performance, you will be certified by Cognixia.