Introduction to Hadoop

Learn how to solve big data problems with Hadoop. We’ll start from the ground up and cover the Hadoop architecture, the Hadoop Distributed File System (HDFS), MapReduce, writing and running Hadoop jobs, operations, and the Hadoop ecosystem.

Students will learn how to create and run Hadoop jobs, what types of problems are good (and bad) candidates for Hadoop-based solutions, and alternatives to writing Hadoop code.

Who Should Attend?

This course is for Java developers who want to know how and when to use Hadoop to solve big data problems.

Prerequisites

This course assumes no prior knowledge of Hadoop, though participants should be comfortable reading and writing Java code; familiarity with Bash will help.

Outline

  • Overview – The How & Why of Hadoop
  • HDFS – Distributed Storage at Scale
  • MapReduce – Divide & Conquer
  • Writing Hadoop Jobs
  • Running Hadoop Jobs
  • Hands-on Lab – Your First Hadoop Job
  • Hadoop Operations – Configuring & Monitoring a Cluster
  • Hadoop Ecosystem – Hive, HBase, Pig, Cascading and more
  • Summary