UrbanPro

Apache Spark Development with Kafka, Spark Streaming, SparkML and Real-Time Applications

LIVE
5 reviews
60 Hours

Course offered by Navaneetha Babu Chellathurai

Hadoop Developer - Spark - 50 Hours

 

  • Course Introduction

 

  • Why Apache Hadoop?
    • Problem in Data Driven Businesses
    • How Hadoop Solves it and why Big Data Solutions
      • Hadoop Fundamentals
    • What Hadoop Comprises: Subprojects and Ecosystem
      • Core Hadoop Components
      • Apache Subprojects
      • Hadoop Ecosystem

 

  • HDFS
    • HDFS Features
    • HDFS Architecture - Non-HA
    • HDFS Architecture - HA
    • Writing and Reading Files in HDFS
    • NameNode Memory and Load Handling
    • Basic HDFS Security
    • HDFS CLI
    • HDFS UIs
    • Other Storage Technologies
    • Hands-on: writing and reading files with HDFS, Permissions, Viewing Blocks and other basic HDFS Operations

 

  • Introduction to Python
    • Introduction to Functional Programming
    • Features of Functional Paradigm
    • Variables, Control structures, Functions and Objects
    • Mutable and Immutable Data
    • First Class Functions
    • Strings, Tuples and Named Tuples
    • Lists, Dicts and Sets
    • Lambda Functions
    • Hands-on for all of the above

 

  • MapReduce and YARN - Basics
    • Why a Computational Framework
    • YARN Architecture
    • MapReduce Architecture and Hands-on
    • Spark Architecture
    • How YARN executes MR and Spark jobs
    • How to see YARN Applications in Web UIs and Shell
    • YARN Application Logs
    • Hands-on for all of the above

 

  • Importing RDBMS Data to Hadoop
    • Introduction to Apache Sqoop
    • Sqoop2 Architecture
    • Using Sqoop to import RDBMS Tables to HDFS
    • Change the Delimiter and File Format of imported Tables
    • Control which columns are imported
    • Sqoop Performance Improvement
    • Sqoop Hands-on

 

  • Capturing Data with Flume
    • Introduction to Apache Flume
    • Flume Architecture
    • Flume Source, Sink, Channel
    • Flume Configurations
    • Ingesting weblogs using Flume
    • Hands-on - Flume Data Ingestion scenarios

 

  • Data Model and DWH using Hive and Impala/Tez
    • Introduction to Hive
    • Introduction to Impala/Tez
    • How to query Hive and Impala/Tez
    • How Hive and Impala/Tez differ from an RDBMS
    • Usage of Hive Metastore by Hive and Impala
    • HiveQL and Impala SQL for query operations
    • Managed and External Tables
    • Introduction to Hue
    • Create Tables using Hue
    • Load Data using Hive, Impala and Sqoop import to Hive tables
    • Hive, Impala/Tez Hands-on

 

  • Hadoop Data Formats
    • Introduction to Data Formats
    • Various Data Formats
    • Introduction to Avro
    • Parquet
    • Evolution of Avro Schema - Compatibility
    • Extracting Metadata and Data from an Avro data file
    • Using Avro with Hive, Sqoop
    • Using Parquet with Hive, Sqoop
    • Compression
    • Hands-on Avro

 

  • Data Partitioning Concepts
    • Overview of Partitions
    • Partitions in Hive and Impala
    • Dealing with Hive Partition Tables
    • Hands-on - Hive partition tables

 

  • Spark - High Level Entry to Development
    • Directed Acyclic Graph
    • Types of Spark CLI - spark-shell and pyspark
    • Functional Programming in Spark
    • Introduction to Spark RDD
    • Hands-on - Running Spark applications using the Spark CLI

 

  • Spark RDDs
    • How RDDs are created from files or data in memory
    • Handling File Formats
    • Additional Operations on RDD
    • Hands-on - Processing Data Files using Spark RDDs

 

  • Pair RDDs and Aggregations
    • Key Value Pair RDD
    • Other Pair RDD Concepts
    • Pair RDD to join Datasets
    • Hands-on - Using Pair RDDs to join Datasets in the Spark CLI

 

  • Writing and Deploying Spark Applications
    • How to write a Spark application - Scala and PySpark
    • Run Spark Applications in YARN
    • Access Spark Application Web UI and controlling the applications
    • Configuring application properties and Logging
    • Hands-on - Writing Spark applications - PySpark and Scala
    • Hands-on - Configuring Spark applications

 

  • Parallel Processing in Spark
    • RDD partitions
    • Partitions of File-Based RDDs
    • HDFS and Data Locality
    • Executing parallel operations
    • Stages and Tasks
    • Hands-on - Viewing stages and jobs in the Spark Application UI

 

  • RDD Persistence
    • RDD Lineage
    • RDD Persistence
    • Distributed persistence
    • Hands-on - How to Persist an RDD

 

  • Spark Data Processing patterns
    • Iterative Algorithms in Spark
    • Graph Processing and Analysis
    • Hands-on - Implementation of an Iterative Algorithm with Spark

 

  • Spark Dataframes
    • Introduction to Spark Data Frames
    • DataFrame API
    • Load Data to DataFrames
    • Converting DataFrames to pair RDD
    • Hands-on - Working with DataFrames

 

  • Spark SQLContext
    • Spark SQL Basics
    • Creating SparkSQL
    • Querying SparkSQL
    • Hands-on - Working with SparkSQL
  • Datasets
    • Typed API
    • Untyped API
    • Hands-on - Working with Datasets

 

  • Spark ML Libraries
    • Introduction to Machine Learning
    • Machine Learning with Spark
    • K-Means
    • Hands-on - Implementation of K-Means and one other ML use case
  • Spark Streaming
    • Spark Streaming overview
    • DStreams
    • Developing stream Applications
    • Multi Batch Operations
    • Time slicing and state operations
    • Sliding window Operations
    • Hands-on
  • Application Hands-on Related to Hadoop and Spark
  • Application Hands-on Related to Kafka and Spark Streaming
  • Conclusion

About the Trainer

5 Avg Rating

5 Reviews

6 Students

2 Courses

Navaneetha Babu Chellathurai

Official Cloudera Trainer

9 Years of Experience

Digital Big Data Transformation Expert | Technical Speaker

Contact Me for Big Data Consulting and Training Assignments - Not Full Time.

Cloudera Certified Apache Hadoop Instructor with 7 years of overall experience and 6 years of Big Data experience, with a demonstrated history of working in the financial services industry.
Expertise in Big Data transformation, including creating a Big Data Lake using Hadoop for cold data storage and a Cassandra Data Lake for Customer360 hot storage.
Expertise in real-time data ingestion and data processing using Kafka-Streaming, Spark Streaming and Lightbend.
Evangelist in Big Data engineering across the whole stack, from installation to high-performance tuning.
Skilled in Hadoop, Hive, Spark, Cassandra, NiFi, MongoDB, Talend Big Data, R, YAML, Python, DataStage, statistical tools and IoT, with a strong background in data warehousing, Spark-based analytics model development and data analytics.
- Cloudera Certified Apache Hadoop Instructor
- Cloudera Trained and Certified Apache Hadoop Administrator
- Cloudera Trained and Certified Apache Hadoop Developer
- Cloudera Certified Apache Hadoop Data Analyst
- Mongo University Certified MongoDB DBA

Conducted 50-plus corporate Cloudera University batches (Admin, Data Analyst) and other Cloudera-based Developer batches.
Conducted 15 Hortonworks Data Platform based Hadoop Developer and Admin batches.
Conducted 5 Hortonworks Data Flow based batches on real-time data ingestion with Apache Kafka and NiFi.
Demonstrated a Cassandra Administrator and Developer batch.
Expert in Talend Big Data and its integration with Big Data components.
Working on course structuring and book preparation for a few university undergraduate and post-diploma Big Data Analytics courses, with the eventual aim of university-based Big Data tutoring and research for the student community.

Contact me for Hadoop (Cloudera and Hortonworks) training and consulting requirements. Let's connect, and happy Hadooping!

Reviews (5)

5 out of 5 (5 reviews)

Navaneetha Babu Chellathurai, Adyar
Navaneetha Babu Chellathurai
D

Big Data

"Very interactive and presentations were interesting, with good slides and videos that kept us all engaged. A real eye-opener for us all. I feel better equipped to manage after completing the course. He is enthusiastic and really aware of what he is explaining."

Navaneetha Babu Chellathurai
V

Big Data

"Navaneetha Babu is a Big Data and Hadoop legend. He is a real-time expert trainer."

Navaneetha Babu Chellathurai
V

Big Data

"Excellent teaching. The learning materials provided were simply amazing. The hands-on tasks given were typical. Worth every hour."

Navaneetha Babu Chellathurai
A

Big Data

"I was looking for a career change into Big Data technologies, and one of my friends referred me to Navaneeth's in-class course on Big Data. The course was practical and interactive. After successfully completing the course I gained more knowledge of Big Data, and it helped me to crack a couple of interviews."


Navaneetha Babu Chellathurai
T

Big Data

"Great explanation of all Big Data concepts. Explained all core concepts with real-time examples. Interesting hands-on problems. Very useful for taking my career to the next level."
