The world of Hadoop and "Big Data" can be intimidating: hundreds of different technologies with cryptic names form the Hadoop ecosystem. With this Hadoop tutorial, you'll not only understand what those systems are and how they fit together, but you'll also go hands-on and learn how to use them to solve real business problems!
Design distributed systems that manage "big data" using Hadoop and related technologies.
Choose an appropriate data storage technology for your application.
Use HDFS and MapReduce for storing and analyzing data at scale.
Understand how Hadoop clusters are managed by YARN, Tez, Mesos, ZooKeeper, and Hue.
- Traditional way vs MapReduce way
- Why MapReduce
- YARN Components
- YARN Architecture
- YARN MapReduce Application Execution Flow
- YARN Workflow
- Anatomy of a MapReduce Program
- Input Splits, Relation between Input Splits and HDFS Blocks
- MapReduce: Combiner & Partitioner
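To make the MapReduce topics above concrete, here is a minimal sketch of a word count that walks through input splits, a mapper, a combiner, a partitioner, and a reducer. It is a plain-Python simulation running in a single process, not Hadoop's Java API; the function names (`map_phase`, `combine`, `partition`, `reduce_phase`, `run_job`) and the sample data are assumptions made up for illustration.

```python
from collections import defaultdict


def map_phase(split):
    """Mapper: emit (word, 1) for every word in one input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output locally, before the shuffle."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())


def partition(key, num_reducers):
    """Partitioner: choose a reducer for a key.

    Hypothetical stand-in for a hash partitioner; a byte sum keeps the
    result deterministic across runs.
    """
    return sum(key.encode("utf-8")) % num_reducers


def reduce_phase(key, values):
    """Reducer: sum all partial counts received for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # One bucket per reducer; each bucket maps key -> list of partial counts.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split is handled by its own (simulated) mapper
        for key, value in combine(map_phase(split)):
            buckets[partition(key, num_reducers)][key].append(value)

    result = {}
    for bucket in buckets:  # each bucket is handled by its own (simulated) reducer
        for key in sorted(bucket):
            word, total = reduce_phase(key, bucket[key])
            result[word] = total
    return result


# Two input splits, standing in for two HDFS blocks of the same file.
splits = [
    ["big data is big", "hadoop stores data"],
    ["hadoop processes big data"],
]
counts = run_job(splits)
print(counts)
```

The combiner shrinks each mapper's output before it crosses the network, and the partitioner guarantees that every partial count for a given word reaches the same reducer. In a real cluster these stages run on different nodes, with YARN scheduling the tasks.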
Intro to Big Data and Hadoop
Hadoop 1.0
Hadoop 2.0
YARN
HDFS
Pig
HBase
Hive
Hive vs Pig
Sqoop
Streaming Project Using Twitter