Mastering Big Data with Hadoop and IBM BigInsights Training Course

Duration: 3 days
Course code: SS-BDA-006

  • Application developers
  • Architects
  • Consultants
  • Technical managers

Experience with Java is strongly recommended.


This course provides a rapid immersion into Big Data with Hadoop using IBM BigInsights, IBM's distribution of Hadoop. It starts with a clear explanation of the key concepts of the MapReduce algorithm behind Hadoop and immediately becomes hands-on: participants start using the tools right away. We begin with an introduction to the Hadoop cluster and show how to interact with the Hadoop file system and the cluster. Next, we examine how to write Java programs that perform processing on the Hadoop cluster, covering mappers and reducers, common algorithms, best practices, and testing approaches. Finally, we introduce Hive and Pig, two popular higher-level interfaces to the Hadoop system.


Upon completion, attendees will be able to:

  • Understand the Hadoop technology and IBM BigInsights
  • Work with the Hadoop file system
  • Write MapReduce programs using Hadoop APIs
  • Use Hive and Pig for productive development
  • Make the best use of IBM BigInsights tooling for Hadoop

Outline for Mastering Big Data with Hadoop and IBM BigInsights Training Course

Big Data and Hadoop: A quick dive

  • Big Data
  • Problems with conventional systems
  • MapReduce algorithm
  • Traditional database applications
  • Hadoop
  • IBM BigInsights


  • What is MapReduce?
  • Relevance of MapReduce to Big Data
  • Map operation
  • Reduce operation
  • Survey of real-world MapReduce problems
  • Execution strategies for MapReduce


  • What is Hadoop?
  • The Hadoop architecture
  • Hadoop Distributed File System

Hadoop Distributed File System (HDFS)

  • HDFS Architecture
  • Web interface
  • Command shell
  • Scalability
  • Data replication

Working with the Hadoop API

  • Hadoop API
  • Mapper
  • Reducer
  • Combiner
  • JobConf
  • JobClient

Designing Hadoop Applications

  • Typical Hadoop algorithms
  • Best practices for Hadoop
  • Testing Hadoop programs

Working with Hive

  • What is Hive?
  • Hive architecture
  • Data warehouse using Hive
  • HiveQL
  • Plugging in custom mappers and reducers

Working with Pig

  • What is Pig?
  • Pig architecture
  • Analyzing data using Pig
  • Using Pig Latin to build data analysis programs
