About the Course
We take this opportunity to introduce ourselves as ‘GITS’, a leading name in Big Data analytics. We
provide training on Apache Hadoop, Cassandra, and NoSQL.
We offer Hadoop training and workshops led by a highly experienced industry innovator with 15 years
of experience in Big Data analytics.
The training content is built on real-world scenarios, is interactive and hands-on, and
is scheduled to meet the demands of working professionals.
We present real-world, scenario-based training developed by the software architects and builders of
highly scalable solutions based on Apache Hadoop, so you can be assured you are learning from
experts with deep, practical expertise. We offer the following courses, designed for
software developers, architects, and cluster administrators.
Topics Covered
Course Outline:
What is Big Data & Why Hadoop?
• What is Big data?
• Shift in Data and Analysis.
• Challenges for Organizations.
• Understanding the Data Characteristics.
• Comparing Information Architecture Operational Paradigms.
• Why Hadoop?
• How is Hadoop helping industry?
• Introduction to the two major components of Hadoop: HDFS and MapReduce
• List of tools and software in the Hadoop Ecosystem
• Environment setup and introduction to learning resources available on the web through different vendors.
• Distributed File System and Hadoop
• What does the Hadoop Distributed File System do?
• Assumptions and Goals
• Key Features supported by HDFS
• HDFS Architecture: Master/Slave Architecture
• NameNode and DataNodes
• The File System Namespace
• The Persistence of File System Metadata
• The Communication Protocols
• Data Organization in HDFS
• Motivation for MapReduce (why)
• What is MapReduce?
• Functional Abstractions Hide Parallelism
• How MapReduce Works: Master/Slave Architecture
• JobTracker and TaskTracker (in detail)
• MapReduce: The Map Step
• MapReduce: The Reduce Step
• How Map and Reduce work together, along with other utilities provided by Apache Hadoop
• Elementary Example Hands On: Counting Words
• Data Distribution
• Fault Tolerance
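For orientation, the word-counting exercise listed above can be sketched in plain Python. This is only an illustration of the map, shuffle and reduce steps on a single machine, not the Hadoop API; the function names (map_step, shuffle, reduce_step, word_count) are hypothetical and chosen for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_step(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key (the word)."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return dict(groups)


def reduce_step(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce: sum the grouped counts for a single word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence on one machine."""
    mapped = (pair for line in lines for pair in map_step(line))
    grouped = shuffle(mapped)
    return dict(reduce_step(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big hadoop"]))
```

In a real cluster the map and reduce calls would run in parallel on different nodes, with the framework performing the shuffle between them; here all three steps run sequentially to keep the idea visible.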
NoSQL: NOT ONLY SQL
• Why NoSQL?
• What is NoSQL?
• From Relational DB to NoSQL
• Relational DB v/s NoSQL
• What makes NoSQL a ‘big hit’?
• Four Major Variants of NoSQL
• Key-Value Pair Databases
o Mechanism behind Key-Value Pair
o Consistency and Performance of Key-Value Pair Databases
• Column-Oriented DBs
o How does a Column-Oriented DB work?
o Architecture and mechanism
• Document Based DBs
o Characteristics and Mechanism of Document Based DBs
• Graph-Based DBs
o Characteristics and Mechanism of Graph-Based DBs
• When to choose what among these four types?
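As a rough illustration of the four variants listed above, the same made-up user record can be laid out in each style using plain Python structures. This is a sketch only: the record, the key formats and the helper function names are all hypothetical, and no real database client is used.

```python
import json

# Key-value: one opaque value, looked up only by its key.
key_value_store = {"user:42": json.dumps({"name": "Asha", "city": "Pune"})}

# Column-oriented: row key -> column family -> column -> value.
column_store = {"42": {"profile": {"name": "Asha", "city": "Pune"}}}

# Document: self-describing, nested records grouped in a collection.
document_store = {
    "users": [{"_id": 42, "name": "Asha", "address": {"city": "Pune"}}]
}

# Graph: nodes plus labeled edges between them.
graph_nodes = {
    42: {"label": "User", "name": "Asha"},
    "pune": {"label": "City", "name": "Pune"},
}
graph_edges = [(42, "LIVES_IN", "pune")]


def city_from_key_value(store, user_id):
    """Fetch the whole value by key, then decode it in the application."""
    return json.loads(store[f"user:{user_id}"])["city"]


def city_from_columns(store, user_id):
    """Read a single column from one column family of a row."""
    return store[str(user_id)]["profile"]["city"]


def city_from_documents(store, user_id):
    """Find the matching document and read a nested field."""
    for doc in store["users"]:
        if doc["_id"] == user_id:
            return doc["address"]["city"]
    raise KeyError(user_id)


def city_from_graph(nodes, edges, user_id):
    """Follow the LIVES_IN edge from the user node to a city node."""
    for source, relation, target in edges:
        if source == user_id and relation == "LIVES_IN":
            return nodes[target]["name"]
    raise KeyError(user_id)
```

Each helper answers the same question ("which city does user 42 live in?"), which makes the difference in access pattern between the four models easier to compare.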
Hadoop Ecosystem
The following are covered in detail (Architecture, Functionality and Hands On):
• HBase:
o What is HBase?
o Why HBase? Benefits
o Relational DB v/s HBase
o HBase: Data Model and Architecture
o File System for HBase and HDFS
o Syntax supported by HBase: Hands On
• Hive:
o Hive Components
o Data Model
o Hive: Example and Hands On
• Integration (Hive + HBase):
o How does it work?
o Hive Components
o Data Model
o Pig: Example and Hands On
Who should attend
Any graduate or postgraduate with knowledge of Java, Linux, Unix, or any programming language.
What you need to bring
Notebook, PC
Key Takeaways
• Learning Pig and Pig Latin
• Learning Hive, a distributed data warehousing system
• Learning HBase, a distributed column-based database for Hadoop
• Building applications based on real-world examples
• Case studies of real-world uses of Big Data
• Big Data technologies
• Panel discussion