PySpark - Spark with Python

New Town, Kolkata

Course ID: 369768

Students Interested 0 (Seats Left 0)

₹ 19,999

About the Course

This course will teach you how to implement distributed data management and machine learning in Spark using the PySpark package.

This course is for software developers, software architects, Hadoop developers, cloud computing developers, data analysts, business analysts, and project managers who want to learn Spark in a Python environment to analyze distributed data across a cluster.

The prerequisite for this course is a working knowledge of Python. Students who do not yet know Python can take a Python course alongside the PySpark course.

What you will learn in this PySpark course:

Introduction to Spark

  • What is Apache Spark?
  • Spark Jobs and APIs
  • Execution process
  • Resilient Distributed Dataset
  • DataFrames
  • Datasets
  • Catalyst Optimizer
  • Spark 2.0 architecture
  • Unifying Datasets and DataFrames
  • Introducing SparkSession
  • Structured streaming
  • Continuous applications



Step-by-Step installation of PySpark


Resilient Distributed Datasets

  • Internal workings of an RDD
  • Creating RDDs
  • Schema
  • Reading from files
  • Lambda expressions
  • Global versus local scope
  • Transformations
  • The .map(...) transformation
  • The .filter(...) transformation
  • The .flatMap(...) transformation
  • The .distinct(...) transformation
  • The .sample(...) transformation
  • The .leftOuterJoin(...) transformation
  • The .repartition(...) transformation
  • Actions
  • The .take(...) method
  • The .collect(...) method
  • The .reduce(...) method
  • The .count(...) method
  • The .saveAsTextFile(...) method
  • The .foreach(...) method
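
To give a feel for the transformations and actions listed above, here is a minimal sketch in plain Python, with no Spark installation assumed. It only mirrors on a local list what the corresponding RDD methods compute across a cluster; it is not PySpark code, and the sample lines are hypothetical.

```python
from functools import reduce

# Hypothetical sample data: a few lines of text standing in for an RDD's records.
lines = ["spark with python", "python on spark", "", "spark with python"]

# Like .filter(...): keep only the records for which the lambda returns True.
non_empty = list(filter(lambda line: line != "", lines))

# Like .flatMap(...): turn each record into several outputs, then flatten them.
words = [word for line in non_empty for word in line.split()]

# Like .map(...): apply a lambda to every record, one output per input.
lengths = list(map(lambda word: len(word), words))

# Like .distinct(...): drop duplicates (sorted here only for a stable order).
unique_words = sorted(set(words))

# Like .reduce(...): fold all records into one value with a two-argument lambda.
total_chars = reduce(lambda a, b: a + b, lengths)

# Like .count(...) and .take(3): the number of records and the first three.
word_count = len(words)
first_three = words[:3]

print(unique_words, word_count, total_chars, first_three)
```

In Spark, the transformations are lazy and nothing is computed until an action such as .count(...) or .take(...) is called; the plain-Python version above runs each step eagerly.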



DataFrames

  • Introduction to DataFrames
  • Speeding up PySpark with DataFrames
  • Spark DataFrame Basics
  • Spark DataFrame Basic Operations
  • DataFrame API Query
  • SQL: Query
  • Group By and Aggregate Operations
  • Missing Data
  • Dates and Timestamps
  • Interoperating with RDDs
  • Querying with SQL
  • DataFrame Project


Spark Machine Learning

Introduction to Machine Learning

  • Machine Learning with Spark and Python with MLlib
  • Linear Regression
  • Regression Evaluation
  • Linear Regression Project
  • Logistic Regression – Introduction
  • Logistic Regression Example with code
  • Logistic Regression Project
  • Decision Trees and Random Forest – Introduction
  • Tree methods Theory
  • Example with code – Decision Tree and Random Forest
  • Random Forest Classification Project
  • K-Means Clustering – Introduction
  • K-Means Clustering Theory
  • Example with code – Clustering
  • Clustering Project
  • Introduction to Recommender System
  • Recommender System – Example with code


Spark Streaming

Introduction to Spark Streaming

  • Spark Streaming Examples
  • Analyzing web logs published with Flume using Spark Streaming
  • Monitor Flume-published logs for errors in real-time
  • Aggregating HTTP access codes with Spark streaming
  • Spark Streaming Project
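
As a rough illustration of the access-code aggregation exercise above, the sketch below counts HTTP status codes over two small batches of log lines and keeps a running total. It is plain Python rather than Spark Streaming, and it assumes the Apache common log format; the log lines, the `STATUS` pattern, and the `status_codes` helper are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical log lines, grouped into two micro-batches as a stream would deliver them.
batches = [
    [
        '10.0.0.1 - - [01/Jan/2024:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
        '10.0.0.2 - - [01/Jan/2024:10:00:01 +0000] "GET /missing HTTP/1.1" 404 128',
    ],
    [
        '10.0.0.3 - - [01/Jan/2024:10:00:05 +0000] "POST /login HTTP/1.1" 500 64',
        '10.0.0.1 - - [01/Jan/2024:10:00:06 +0000] "GET /index.html HTTP/1.1" 200 512',
        'not a log line',
    ],
]

# The status code is the three-digit field right after the quoted request.
STATUS = re.compile(r'" (\d{3}) ')


def status_codes(batch):
    """Yield the status code of every line that parses; skip malformed lines."""
    for line in batch:
        match = STATUS.search(line)
        if match:
            yield match.group(1)


# Running totals across batches, updated once per micro-batch.
running = Counter()
for batch in batches:
    running.update(status_codes(batch))

print(dict(running))
```

A streaming job would do the same work continuously, typically over a time window, and could raise an alert when the share of error codes in a window grows too large.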


Spark SQL

Exploring data using Spark SQL

  • Understanding Spark dataframes
  • Understanding the Spark SQL query optimizer
  • Loading and processing CSV files with Spark SQL
  • Querying MongoDB from Spark SQL


Project Hands On

At the end of the course, you will complete a real-world PySpark project on Azure Cloud.

Date and Time

Not decided yet.

About the Trainer

0 Reviews

0 Students

3 Courses



- 18 years of total industry experience
- Has architected big data solutions for major Fortune 500 companies around the globe
- Thorough understanding of the Hadoop components and the design pipeline
- Has trained more than 500 students
- Provides training to major corporates
- Consultant on big data to companies and customers based in the UK and UAE
- Practical, hands-on teaching methodology
- Prepares students for job interviews through mock interview sessions
- Projects are a major part of the course curriculum, to orient students to the actual working environment
- 96% success rate: most of the students taught by this trainer are placed in IT MNCs


Connect With Sanhita Big Data Classes
