Hadoop

Trending Questions and Lessons


Answered on 03 Jun IT Courses/Hadoop

Career Bridge IT Services

Yes. Career Bridge IT Services is one of the best training institutes in Hyderabad. We provide a lady trainer for offline/online batches. Please call @970-532-3377 to get all the details about trainings and career guidance.

Answered on 12 Apr IT Courses/Hadoop

Biswanath Banerjee

Tutor

Hi Suresh, I provide Hadoop administration training that will prepare you to clear the Cloudera Administrator Certification exam (CCA131). You can contact me for course details. Regards, Biswanath

Lesson Posted on 09 Apr IT Courses/Hadoop

How can you recover from a NameNode failure in Hadoop cluster?

Biswanath Banerjee

We have trained more than 1000 students on Big Data Technologies - Hadoop ecosystem, Apache Spark, Tableau,...


How can you recover from a Namenode failure in Hadoop?

Why is the Namenode so important?

The Namenode is the most important Hadoop service. It holds the location of every block in the cluster and maintains the state of the distributed file system. There is also a Secondary Namenode. The Secondary Namenode is not a backup for the Namenode: it performs a periodic checkpoint process. When the Namenode fails, it is possible to recover from a previous checkpoint generated by the Secondary Namenode.

How to recover a failed Namenode?

Suppose the node hosting the Namenode service has failed, while the Secondary Namenode is running on a separate machine. The fs.checkpoint.dir property has been set previously in core-site.xml. This property tells the Secondary Namenode where to save its checkpoints on the local file system.

Carry out the following steps to recover from a NameNode failure:

    1. Stop the Secondary NameNode:

$ cd /path/to/Hadoop

$ bin/hadoop-daemon.sh stop secondarynamenode

    2. Bring up a new machine to act as the new NameNode. This machine should have Hadoop installed, be configured like the previous NameNode, and have password-less ssh login configured. It should also have the same IP address and hostname as the previous NameNode.
   

    3. Copy the contents of fs.checkpoint.dir on the Secondary NameNode to the dfs.name.dir folder on the new NameNode machine.
   

    4. Start the new NameNode on the new machine:

$ bin/hadoop-daemon.sh start namenode

    5. Start the Secondary NameNode on the Secondary NameNode machine:

$ bin/hadoop-daemon.sh start secondarynamenode

    6. Verify that the NameNode started successfully by looking at the NameNode status page at http://localhost:50070/.

How it works:

We first log in to the Secondary Namenode machine to stop its service. Next, we set up the Namenode on a new machine.

Then we copy all the checkpoint and edits files from the Secondary Namenode to the new Namenode. In this way, we recover the filesystem state and metadata as of the last checkpoint. Finally, we restart the new Namenode and the Secondary Namenode.
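The six steps above can be collected into a single dry-run script. This is only a sketch: HADOOP_HOME, the checkpoint and name directories, and the newnamenode hostname are assumptions standing in for your cluster's actual values, and the run helper only echoes each command, so nothing is executed until you take the echo out.

```shell
#!/bin/sh
# Dry-run sketch of the NameNode recovery steps above.
# All paths and the host name "newnamenode" are placeholder assumptions.
HADOOP_HOME=/path/to/hadoop                  # assumed install directory
CHECKPOINT_DIR=/data/secondary/checkpoint    # assumed value of fs.checkpoint.dir
NAME_DIR=/data/namenode/name                 # assumed value of dfs.name.dir

run() { echo "would run: $*"; }              # prints instead of executing

# 1. Stop the Secondary NameNode on its host
run "$HADOOP_HOME/bin/hadoop-daemon.sh" stop secondarynamenode
# 2-3. Copy the latest checkpoint onto the replacement NameNode host
run scp -r "$CHECKPOINT_DIR/." "newnamenode:$NAME_DIR/"
# 4. Start the new NameNode
run "$HADOOP_HOME/bin/hadoop-daemon.sh" start namenode
# 5. Restart the Secondary NameNode
run "$HADOOP_HOME/bin/hadoop-daemon.sh" start secondarynamenode
```

Once the printed commands look right for your cluster, change the run helper to actually execute its arguments.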



Lesson Posted on 04 Apr IT Courses/Hadoop

Solving the issue of Namenode not starting during Single Node Hadoop installation

Biswanath Banerjee



If, on firing the jps command, you see that the Namenode is not running during a single-node Hadoop installation, here are the steps to get the Namenode running.

Problem: Namenode not getting started.

Solution: follow the steps below. Note that formatting the Namenode (step 4) resets HDFS, so any data already stored in HDFS will be lost.

STEP 1: Stop Hadoop

hduser@skillmentorz-virtualbox:$ /usr/local/hadoop/sbin/stop-dfs.sh

STEP 2: Remove the tmp folder

hduser@skillmentorz-virtualbox:$ sudo rm -rf /app/hadoop/tmp/

STEP 3: Recreate /app/hadoop/tmp/

hduser@skillmentorz-virtualbox:$ sudo mkdir -p /app/hadoop/tmp

hduser@skillmentorz-virtualbox:$ sudo chown hduser:hadoop /app/hadoop/tmp

hduser@skillmentorz-virtualbox:$ sudo chmod 750 /app/hadoop/tmp

STEP 4: Format the Namenode

hduser@skillmentorz-virtualbox:$ hdfs namenode -format

STEP 5: Start dfs

hduser@skillmentorz-virtualbox:$ /usr/local/hadoop/sbin/start-dfs.sh

STEP 6: Check jps

hduser@skillmentorz-virtualbox:$ jps

You will find all the daemons, including the Namenode, running properly:

Namenode

Secondary Namenode

ResourceManager

Datanode

NodeManager
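For convenience, the same six steps can be sketched as one dry-run script. The paths match the lesson's single-node setup and are assumptions for any other install; the run helper echoes each command instead of executing it, so remove the echo only after noting that hdfs namenode -format erases any existing HDFS data.

```shell
#!/bin/sh
# Dry-run sketch of the six steps above in one script.
# Paths assume the lesson's setup (/app/hadoop/tmp, /usr/local/hadoop).
HADOOP_TMP=/app/hadoop/tmp
HADOOP_SBIN=/usr/local/hadoop/sbin

run() { echo "would run: $*"; }            # prints instead of executing

run "$HADOOP_SBIN/stop-dfs.sh"             # 1. stop hadoop
run sudo rm -rf "$HADOOP_TMP"              # 2. remove tmp folder
run sudo mkdir -p "$HADOOP_TMP"            # 3. recreate it with correct ownership
run sudo chown hduser:hadoop "$HADOOP_TMP"
run sudo chmod 750 "$HADOOP_TMP"
run hdfs namenode -format                  # 4. format namenode (destroys HDFS data!)
run "$HADOOP_SBIN/start-dfs.sh"            # 5. start dfs
run jps                                    # 6. verify the daemons are up
```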

 

Thank you for your time.


Lesson Posted on 27 Feb IT Courses/Hadoop

Hadoop Development Syllabus

Mind Scripts Technologies

MindScripts is a leading IT training institute in Pune which has been successful in providing IT training...

Hadoop 2 Development with Spark
Big Data Introduction:
  • What is Big Data
  • Evolution of Big Data
  • Benefits of Big Data
  • Operational vs Analytical Big Data
  • Need for Big Data Analytics
  • Big Data Challenges
Hadoop cluster:
  • Master Nodes
    • Name Node
    • Secondary Name Node
    • Job Tracker
  • Client Nodes
  • Slaves
  • Hadoop configuration
  • Setting up a Hadoop cluster
HDFS:
  • Introduction to HDFS
  • HDFS Features
  • HDFS Architecture
  • Blocks
  • Goals of HDFS
  • The Name node & Data Node
  • Secondary Name node
  • The Job Tracker
  • The Process of a File Read
  • How does a File Write work
  • Data Replication
  • Rack Awareness
  • HDFS Federation
  • Configuring HDFS
  • HDFS Web Interface
  • Fault tolerance
  • Name node failure management
  • Access HDFS from Java
Yarn
  • Introduction to Yarn
  • Why Yarn
  • Classic MapReduce v/s Yarn
  • Advantages of Yarn
  • Yarn Architecture
    • Resource Manager
    • Node Manager
    • Application Master
  • Application submission in YARN
  • Node Manager containers
  • Resource Manager components
  • Yarn applications
  • Scheduling in Yarn
    • Fair Scheduler
    • Capacity Scheduler
  • Fault tolerance
MapReduce:
  • What is MapReduce
  • Why MapReduce
  • How MapReduce works
  • Difference between Hadoop 1 & Hadoop 2
  • Identity mapper & reducer
  • Data flow in MapReduce
  • Input Splits
  • Relation Between Input Splits and HDFS Blocks
  • Flow of Job Submission in MapReduce
  • Job submission & Monitoring
  • MapReduce algorithms
    • Sorting
    • Searching
    • Indexing
    • TF-IDF
Hadoop Fundamentals:
  • What is Hadoop
  • History of Hadoop
  • Hadoop Architecture
  • Hadoop Ecosystem Components
  • How does Hadoop work
  • Why Hadoop & Big Data
  • Hadoop Cluster introduction
  • Cluster Modes
    • Standalone
    • Pseudo-distributed
    • Fully-distributed
  • HDFS Overview
  • Introduction to MapReduce
  • Hadoop in demand
HDFS Operations:
  • Starting HDFS
  • Listing files in HDFS
  • Writing a file into HDFS
  • Reading data from HDFS
  • Shutting down HDFS
HDFS Command Reference:
  • Listing contents of directory
  • Displaying and printing disk usage
  • Moving files & directories
  • Copying files and directories
  • Displaying file contents
Java Overview For Hadoop:
  • Object oriented concepts
  • Variables and Data types
  • Static data type
  • Primitive data types
  • Objects & Classes
  • Java Operators
  • Method and its types
  • Constructors
  • Conditional statements
  • Looping in Java
  • Access Modifiers
  • Inheritance
  • Polymorphism
  • Method overloading & overriding
  • Interfaces
MapReduce Programming:
  • Hadoop data types
  • The Mapper Class
    • Map method
  • The Reducer Class
    • Shuffle Phase
    • Sort Phase
    • Secondary Sort
    •  Reduce Phase
  • The Job class
    • Job class constructor
  • JobContext interface
  • Combiner Class
    • How Combiner works
    • Record Reader
    • Map Phase
    • Combiner Phase
    • Reducer Phase
    • Record Writer
  • Partitioners
    • Input Data
    • Map Tasks
    • Partitioner Task
    • Reduce Task
    • Compilation & Execution

 
Hadoop Ecosystems
Pig:
  • What is Apache Pig?
  • Why Apache Pig?
  • Pig features
  • Where should Pig be used
  • Where not to use Pig
  • The Pig Architecture
  • Pig components
  • Pig v/s MapReduce
  • Pig v/s SQL
  • Pig v/s Hive
  • Pig Installation
  • Pig Execution Modes & Mechanisms
  • Grunt Shell Commands
  • Pig Latin - Data Model
  • Pig Latin Statements
  • Pig data types
  • Pig Latin operators
  • Case Sensitivity
  • Grouping & Co Grouping in Pig Latin
  • Sorting & Filtering
  • Joins in Pig latin
  • Built-in Function
  • Writing UDFs
  • Macros in Pig
HBase:
  • What is HBase
  • History Of HBase
  • The NoSQL Scenario
  • HBase & HDFS
  • Physical Storage
  • HBase v/s RDBMS
  • Features of HBase
  • HBase Data model
  • Master server
  • Region servers & Regions
  • HBase Shell
  • Create table and column family
  • The HBase Client API
Spark:
  • Introduction to Apache Spark
  • Features of Spark
  • Spark built on Hadoop
  • Components of Spark
  • Resilient Distributed Datasets
  • Data Sharing using Spark RDD
  • Iterative Operations on Spark RDD
  • Interactive Operations on Spark RDD
  • Spark shell
  • RDD transformations
  • Actions
  • Programming with RDD
    • Start Shell
    • Create RDD
    • Execute Transformations
    • Caching Transformations
    • Applying Action
    • Checking output
  • GraphX overview
Impala:
  • Introducing Cloudera Impala
  • Impala Benefits
  • Features of Impala
  • Relational databases vs Impala
  • How Impala works
  • Architecture of Impala
  • Components of the Impala
    • The Impala Daemon
    • The Impala Statestore
    • The Impala Catalog Service
  • Query Processing Interfaces
  • Impala Shell Command Reference
  • Impala Data Types
  • Creating & deleting databases and tables
  • Inserting & overwriting table data
  • Record Fetching and ordering
  • Grouping records
  • Using the Union clause
  • Working of Impala with Hive
  • Impala v/s Hive v/s HBase
MongoDB Overview:
  • Introduction to MongoDB
  • MongoDB v/s RDBMS
  • Why & Where to use MongoDB
  • Databases & Collections
  • Inserting & querying documents
  • Schema Design
  • CRUD Operations
Oozie & Hue Overview:
  • Introduction to Apache Oozie
  • Oozie Workflow
  • Oozie Coordinators
  • Property File
  • Oozie Bundle system
  • CLI and extensions
  • Overview of Hue
Hive:
  • What is Hive?
  • Features of Hive
  • The Hive Architecture
  • Components of Hive
  • Installation & configuration
  • Primitive types
  • Complex types
  • Built in functions
  • Hive UDFs
  • Views & Indexes
  • Hive Data Models
  • Hive vs Pig
  • Co-groups
  • Importing data
  • Hive DDL statements
  • Hive Query Language
  • Data types & Operators
  • Type conversions
  • Joins
  • Sorting & controlling data flow
  • local vs mapreduce mode
  • Partitions
  • Buckets
Sqoop:
  • Introducing Sqoop
  • Sqoop installation
  • Working of Sqoop
  • Understanding connectors
  • Importing data from MySQL to Hadoop HDFS
  • Selective imports
  • Importing data to Hive
  • Importing to Hbase
  • Exporting data to MySQL from Hadoop
  • Controlling import process
Flume:
  • What is Flume?
  • Applications of Flume
  • Advantages of Flume
  • Flume architecture
  • Data flow in Flume
  • Flume features
  • Flume Event
  • Flume Agent
    •  Sources
    •  Channels
    •  Sinks
  • Log Data in Flume
Zookeeper Overview:
  • Zookeeper Introduction
  • Distributed Application
  • Benefits of Distributed Applications
  • Why use Zookeeper
  • Zookeeper Architecture
  • Hierarchical Namespace
  • Znodes
  • Stat structure of a Znode
  • Electing a leader
Kafka Basics:
  • Messaging Systems
    • Point-to-Point
    • Publish - Subscribe
  • What is Kafka
  • Kafka Benefits
  • Kafka Topics & Logs
  • Partitions in Kafka
  • Brokers
  • Producers & Consumers
  • What are Followers
  • Kafka Cluster Architecture
  • Kafka as a Pub-Sub Messaging
  • Kafka as a Queue Messaging
  • Role of Zookeeper
  • Basic Kafka Operations
    • Creating a Kafka Topic
    • Listing out topics
    • Starting Producer
    • Starting Consumer
    • Modifying a Topic
    • Deleting a Topic
  • Integration With Spark
Scala Basics:
  • Introduction to Scala
  • Spark & Scala interdependence
  • Objects & Classes
  • Class definition in Scala
  • Creating Objects
  • Scala Traits
  • Basic Data Types
  • Operators in Scala
  • Control structures
  • Fields in Scala
  • Functions in Scala
  • Collections in Scala
    • Mutable collection
    • Immutable collection
Project
Project name: Analysis Report on Product Sales Data using Hadoop
Project description: Product analysis using Hadoop provides an efficient way of analyzing data using HDFS and MapReduce fundamentals. The data can be used in several analyses, and Hadoop allows the user to process large amounts of it. Several complex use cases can easily be answered with Pig, Hive and the other ecosystem tools.
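The map → shuffle/sort → reduce flow that such a sales-analysis project relies on can be simulated locally with ordinary shell tools. The sample records below are made-up "product,quantity" pairs for illustration: the first awk acts as the mapper, sort plays the shuffle, and the second awk is the reducer that sums quantities per product.

```shell
#!/bin/sh
# Simulate the MapReduce flow locally: map -> shuffle/sort -> reduce.
# Input records are hypothetical "product,quantity" sales lines.
sales_count() {
  printf '%s\n' 'phone,2' 'laptop,1' 'phone,3' 'laptop,4' |
    awk -F, '{print $1 "\t" $2}' |    # map: emit (product, qty) pairs
    sort |                            # shuffle/sort: group identical keys
    awk -F'\t' '{sum[$1] += $2} END {for (k in sum) print k, sum[k]}' |
    sort                              # deterministic output order
}
sales_count   # prints: laptop 5 / phone 5
```

On a real cluster, the two awk programs would become the Mapper and Reducer classes (or a Hadoop Streaming job), with HDFS supplying the input splits.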


Answered on 20 Feb IT Courses/Hadoop

Prasuna

Hadoop Trainer


It is always better to take the help of an instructor. Go to an institute or take an online course.

When you decide to learn Hadoop, start with understanding Hadoop's architecture and its capabilities, practical use cases, the HDFS architecture and the MapReduce programming model. Then you need at least a basic understanding of each component in Hadoop's ecosystem, like Hive, Pig, HBase, Sqoop, Flume, Kafka, Spark, Scala, Python etc. After that you can concentrate more on the trending Spark and Scala and continue with those. You can always reach me for more information.


Answered on 20 Feb IT Courses/Hadoop

Prasuna

Hadoop Trainer

You can also try some projects uploaded on GitHub. Once you are confident, you can work on the ones posted on Kaggle.

Answered on 01 Feb IT Courses/Hadoop

Which are the best institutions in Pune to get Hadoop and big data training and placement? I have 3 years of experience as a Java developer.

Suraz

You can now get 50 hours of training absolutely free: 20 hours of recorded sessions followed by 30 hours of live sessions.


Answered on 21/12/2017 IT Courses/Hadoop

What is the right way to approach a job as a data scientist fresher? I am a PhD, currently training myself in R, Python, Hadoop, Spark and ML.

Vijay

Best Hadoop And Datascience Training

Hi Kumar, the best way is to take training in Machine Learning and Deep Learning, also learn from live scenarios, and attend interviews in parallel.

About UrbanPro

UrbanPro.com helps you to connect with the best Hadoop Classes in India. Post Your Requirement today and get connected.

Overview

Questions 544

Lessons 24



UrbanPro.com is India's largest network of most trusted tutors and institutes. Over 25 lakh students rely on UrbanPro.com to fulfill their learning requirements across 1,000+ categories. Using UrbanPro.com, parents and students can compare multiple Tutors and Institutes and choose the one that best suits their requirements. More than 6.5 lakh verified Tutors and Institutes are helping millions of students every day and growing their tutoring business on UrbanPro.com. Whether you are looking for a tutor to learn mathematics, a German language trainer to brush up your German language skills or an institute to upgrade your IT skills, we have got the best selection of Tutors and Training Institutes for you.