Training Cloudera

Training goals

code: CL-ATAH

This four-day administrator training course for Apache Hadoop provides participants with a comprehensive understanding of all the steps necessary to operate and maintain a Hadoop cluster. From installation and configuration through load balancing and tuning, Cloudera’s training course is the best preparation for the real-world challenges faced by Hadoop administrators.

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning topics such as:

  • The internals of YARN, MapReduce, and HDFS
  • Determining the correct hardware and infrastructure for your cluster
  • Proper cluster configuration and deployment to integrate with the data center
  • How to load data into the cluster from dynamically-generated files using Flume and from RDBMS using Sqoop
  • Configuring the FairScheduler to provide service-level agreements for multiple users of a cluster
  • Best practices for preparing and maintaining Apache Hadoop in production
  • Troubleshooting, diagnosing, tuning, and solving Hadoop issues

Course outline

  1. Introduction
  2. The Case for Apache Hadoop
    • Why Hadoop?
    • Core Hadoop Components
    • Fundamental Concepts
  3. HDFS
    • HDFS Features
    • Writing and Reading Files
    • NameNode Memory Considerations
    • Overview of HDFS Security
    • Using the NameNode Web UI
    • Using the Hadoop File Shell
  4. Getting Data into HDFS
    • Ingesting Data from External Sources with Flume
    • Ingesting Data from Relational Databases with Sqoop
    • REST Interfaces
    • Best Practices for Importing Data
  5. YARN and MapReduce
    • What Is MapReduce?
    • Basic MapReduce Concepts
    • YARN Cluster Architecture
    • Resource Allocation
    • Failure Recovery
    • Using the YARN Web UI
    • MapReduce Version 1
  6. Planning Your Hadoop Cluster
    • General Planning Considerations
    • Choosing the Right Hardware
    • Network Considerations
    • Configuring Nodes
    • Planning for Cluster Management
  7. Hadoop Installation and Initial Configuration
    • Deployment Types
    • Installing Hadoop
    • Specifying the Hadoop Configuration
    • Performing Initial HDFS Configuration
    • Performing Initial YARN and MapReduce Configuration
    • Hadoop Logging
  8. Installing and Configuring Hive, Impala and Pig
    • Hive
    • Impala
    • Pig
  9. Hadoop Clients
    • What is a Hadoop Client?
    • Installing and Configuring Hadoop Clients
    • Installing and Configuring Hue
    • Hue Authentication and Authorization
  10. Cloudera Manager
    • The Motivation for Cloudera Manager
    • Cloudera Manager Features
    • Express and Enterprise Versions
    • Cloudera Manager Topology
    • Installing Cloudera Manager
    • Installing Hadoop Using Cloudera Manager
    • Performing Basic Administration Tasks Using Cloudera Manager
  11. Advanced Cluster Configuration
    • Advanced Configuration Parameters
    • Configuring Hadoop Ports
    • Explicitly Including and Excluding Hosts
    • Configuring HDFS for Rack Awareness
    • Configuring HDFS High Availability
  12. Hadoop Security
    • Why Hadoop Security Is Important
    • Hadoop’s Security System Concepts
    • What Kerberos Is and How it Works
    • Securing a Hadoop Cluster with Kerberos
  13. Managing and Scheduling Jobs
    • Managing Running Jobs
    • Scheduling Hadoop Jobs
    • Configuring the FairScheduler
    • Impala Query Scheduling
  14. Cluster Maintenance
    • Checking HDFS Status
    • Copying Data Between Clusters
    • Adding and Removing Cluster Nodes
    • Rebalancing the Cluster
    • Cluster Upgrading
  15. Cluster Monitoring and Troubleshooting
    • General System Monitoring
    • Monitoring Hadoop Clusters
    • Troubleshooting Hadoop Clusters
    • Common Misconfigurations
  16. Conclusion

Additional information

  • This course is best suited to systems administrators and IT managers who have basic Linux experience.
  • Prior knowledge of Apache Hadoop is not required.
Duration 4 days

Participants will receive certificates signed by Cloudera. This course helps prepare for the Cloudera Certified Administrator for Apache Hadoop (CCAH) certification exam. Certification is a great differentiator; it helps establish you as a leader in the field, providing employers and customers with tangible evidence of your skills and expertise.


Certified Cloudera Instructor.


2180 EUR





Traditional training

Sessions organised by Compendium CE are usually held at our locations in Kraków and Warsaw, or at venues designated by the client. The group meets at a set place and time with an instructor and takes an active part in laboratory sessions.

Distance learning training

You may participate from anywhere in the world. All you need is a computer (or even a tablet or smartphone) connected to the Internet. Compendium CE provides each Distance Learning participant with the software needed to connect to the Data Center.



Electronic materials

Electronic training materials are made available through a dedicated application (Skillpipe, eVantage, etc.) or as PDF documents.

Ctab materials

The price includes a ctab tablet and electronic training materials, or traditional training materials and supplies provided electronically according to the manufacturer's specifications (in PDF or EPUB format). The materials are adapted for display on ctab tablets. For more information, check out the ctab website.



No sessions are currently scheduled for this training.
