Big Data Tutorial

This module is for anyone who needs to know about big data – what it means and how to work with it.

Find answers to questions about big data in general, and about the specific technologies used to solve big data problems, including map-reduce, distributed file systems, data processing workflows, scalable storage, and real-time data processing.

This module draws on Ken Krugler’s extensive real-world experience with Apple, Groupon, and many Bay Area startups to explain not only what these technologies do but, more importantly, how and when it makes sense to use them.

Who Should Attend?

Developers, managers, architects, or anyone who wants or needs to learn more about the complex and rapidly changing world of big data solutions.

Prerequisites

No prior knowledge of Hadoop, Cassandra, or other big data technologies is required.

Outline

  • A Real-World Problem – an analytics web site
  • A Real-World Solution – Hadoop & Solr
  • Critical Big Data Skills
  • Leveraging Amazon’s Cloud Infrastructure
  • Hadoop in a Nutshell
  • Big Data Workflows with Cascading
  • NoSQL storage with Cassandra and HBase
  • Real-time processing with Storm
  • Summary