Find the best tutors and institutes for Big Data


Big Data Updates


Lesson Posted on 14 Sep IT Courses/Infor ERP IT Courses/Big Data

training #bigdatalab #online

JOSHUA CHARLES

I have 6+ years of IT experience and involved responsibilities such as Production/Application Support...

# Fully equipped Big Data lab for training and practice. Users can practice big data, data science and machine learning technologies. The lab can be accessed over the internet, so you can learn from anywhere. Kindly contact me for activation and subscription.


Answered on 06 Jul IT Courses/Big Data

Rajasekhar Lekkala

Tutor

Basically, Spark is a framework. To work with Spark, you need intermediate-level familiarity with Java, Scala or Python.

Looking for Big Data Training

Find best Big Data Training in your locality on UrbanPro.

FIND NOW

Lesson Posted on 29 Mar IT Courses/Big Data IT Courses/Hadoop IT Courses/Big Data/Big Data Testing Tuition/BTech Tuition/Big Data Analytics IT Courses/Hadoop/Hadoop Testing

CheckPointing Process - Hadoop

Silvia Priya

Experienced datawarehouse professional for 7.5 years. Certified Big data-Hadoop and Python Trainer. I...


Checkpointing is one of the important concepts in Hadoop. The NameNode stores the file system metadata on its hard disk.

We all know that metadata is the heart of the distributed file system; if it is lost, we cannot access any files inside the file system.

 

The metadata is physically stored on the machine in the form of two files:

1. FSIMAGE - A snapshot of the file system at a point in time.

2. EDITS FILE - Contains every transaction (creation, deletion, moving, renaming, copying, etc. of files) in the file system.
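
To make the relationship between the two files concrete, here is a minimal sketch in Python. It is not how Hadoop stores metadata internally; the set of paths, the tuple format of the transactions and the function name `replay` are all assumptions made for illustration. The idea it shows is that the current state of the file system equals the snapshot plus every logged transaction applied in order.

```python
def replay(fsimage, edits):
    """Apply each logged transaction, in order, to a copy of the snapshot."""
    namespace = set(fsimage)  # copy, so the snapshot itself is not modified
    for op, *paths in edits:
        if op == "create":
            namespace.add(paths[0])
        elif op == "delete":
            namespace.discard(paths[0])
        elif op == "rename":
            src, dst = paths
            if src in namespace:
                namespace.remove(src)
                namespace.add(dst)
        else:
            raise ValueError(f"unknown operation: {op}")
    return namespace


# Hypothetical snapshot and edit log, invented for this example.
fsimage = {"/data/a.txt", "/data/b.txt"}
edits = [
    ("create", "/data/c.txt"),
    ("delete", "/data/a.txt"),
    ("rename", "/data/b.txt", "/archive/b.txt"),
]

current = replay(fsimage, edits)
print(sorted(current))
```

The longer the edit log grows, the longer such a replay takes, which is one reason the two files are merged periodically.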

With HA (High Availability) in Hadoop V2, a backup of the NameNode's metadata is kept on another machine, the Standby NameNode. Since the metadata is accessed very frequently by different clients reading different files, it is better to keep it in RAM than to read it from the hard disk each time, so that it can be accessed faster.

But stop... what happens if the machine goes down? :( We will lose everything in the RAM. Hence taking a backup of the data held in RAM is necessary.

 

 

FSIMAGE0 -- Represents the fsimage file at a particular time.

FSIMAGE1 -- Represents a copy of the FSIMAGE0 file, taken as a backup.

Let's imagine the backup is taken every 6 hours. If something goes wrong in the cluster and the machine goes down before the next backup, i.e. within those 6 hours, we lose every change made since the last backup.

 

To overcome this problem, a separate system is added to the cluster whose only job is to safeguard the metadata in an efficient way, and that process is called the Checkpointing process.

 

 

Have a look at the picture and let's understand the process step by step.

STEP 1

A copy of the metadata (the Fsimage and Edits files) is taken from the NameNode and placed on the Secondary NameNode (SNN).

STEP 2

Once the copy is on the SNN, the Edits file, which captures every single transaction in the file system, is merged into the Fsimage file (the snapshot of the file system). The merged result represents the latest state of the file system.

STEP 3

The merged Fsimage is copied back to the NameNode's metadata location.

STEP 4

While the merge is running, files may still be created, deleted or renamed. Those transactions are recorded in a new file called Edits.new, because the original Edits file is in use for the copy to the SNN and must not change during the merge.

STEP 5

The Edits.new file now becomes the current Edits file, and the merged Fsimage becomes the current Fsimage file. This process is repeated at a fixed interval.
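
The five steps can be walked through with a small simulation. This is only a sketch under simplifying assumptions: the class `NameNode`, its methods, the helper `apply_edits` and the two-operation transaction format are hypothetical names invented here, not Hadoop APIs. It shows the copy, the merge, and how writes that arrive during the merge land in Edits.new.

```python
from dataclasses import dataclass, field
from typing import Optional


def apply_edits(fsimage, edits):
    """Step 2: merge the edit log into a copy of the snapshot."""
    namespace = set(fsimage)
    for op, path in edits:
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
    return namespace


@dataclass
class NameNode:
    fsimage: set
    edits: list = field(default_factory=list)
    edits_new: Optional[list] = None  # exists only while a checkpoint runs

    def log(self, op, path):
        # Step 4: during a checkpoint, new transactions go to Edits.new.
        target = self.edits if self.edits_new is None else self.edits_new
        target.append((op, path))

    def start_checkpoint(self):
        # Step 1: open Edits.new and hand copies of the metadata to the SNN.
        self.edits_new = []
        return set(self.fsimage), list(self.edits)

    def finish_checkpoint(self, merged_fsimage):
        # Steps 3 and 5: install the merged image; Edits.new becomes Edits.
        self.fsimage = merged_fsimage
        self.edits = self.edits_new
        self.edits_new = None


nn = NameNode(fsimage={"/a"})
nn.log("create", "/b")                           # logged before the checkpoint
image_copy, edits_copy = nn.start_checkpoint()   # step 1
nn.log("delete", "/a")                           # step 4: arrives mid-merge
merged = apply_edits(image_copy, edits_copy)     # step 2 (on the SNN)
nn.finish_checkpoint(merged)                     # steps 3 and 5

print(sorted(nn.fsimage), nn.edits)
```

After the cycle, the delete that arrived during the merge is not lost: it sits in the new Edits file and will be folded into the Fsimage at the next checkpoint.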

So now no separate backups are needed to save the NameNode's metadata in case of failover scenarios.

We will see more details and programs in the upcoming lessons.

Thank you!!


Lesson Posted on 29 Mar IT Courses/Big Data IT Courses/Big Data/Big Data Testing Tuition/BTech Tuition/Big Data Analytics

Big Data for Beginners

Silvia Priya

Experienced datawarehouse professional for 7.5 years. Certified Big data-Hadoop and Python Trainer. I...

Hello Big Data Enthusiast,

Many of you will have heard the term "Big Data" buzzing around everywhere and wondered what it could be.

OK, let's sort things out with an example.

Imagine you have a machine with 8 GB of storage, and you want to store 12 GB of data from a client and perform some analytics on it. So, think of the possible ways in which you can store the data.

1. Extend your hard disk capacity to around 15 GB or beyond so that the data can be stored successfully.

2. Hire a cloud service and upload the data to the cloud for analysis. But if the client doesn't want the data uploaded to the cloud, for whatever reason, this option is ruled out.

3. Upload the data into a distributed file system, after analysing its pros and cons.

True, you can follow any one of the above options.

This 12 GB of data, which is beyond the machine's storage capacity, is what is called BIG DATA.

This BIG DATA can be in any format: structured, unstructured or semi-structured.

Structured - RDBMS data (table data with proper rows, columns, keys, etc.)

Unstructured - Images, videos, etc.

Semi-structured - Files in formats such as HTML, XML, etc.
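
As a small illustration of the three kinds of data, here is a sketch using only the Python standard library. The sample table, XML document and bytes are made up for this example. Structured data has fixed columns, semi-structured data carries its own tags but elements may be missing, and unstructured data is just raw bytes with no fields to query.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: every row has the same, known columns.
table = io.StringIO("id,name\n1,Asha\n2,Ravi\n")
rows = list(csv.DictReader(table))

# Semi-structured: tags describe the data, but a record may omit a field.
doc = ET.fromstring(
    "<users><user id='1'><name>Asha</name></user><user id='2'/></users>"
)
names = [user.findtext("name") for user in doc.findall("user")]

# Unstructured: an opaque sequence of bytes (e.g. part of an image file).
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47])

print(rows)
print(names)          # the second user has no <name>, so its entry is None
print(len(image_bytes))
```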

 

A data set can be big data for you and not necessarily big data for another person.

Seems confusing? Going back to the previous example, if you have a machine with 20 GB of storage capacity, you can conveniently store our 12 GB, and it is not big data for you in the first place.
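
The same idea in a few lines of Python, as a sketch of the lesson's simplified definition only. The function name `is_big_data_for` is an assumption made for illustration, and the sizes are the ones from the example above.

```python
def is_big_data_for(data_gb, capacity_gb):
    """Under the lesson's definition, data is 'big' if it exceeds the machine's storage."""
    return data_gb > capacity_gb


print(is_big_data_for(12, 8))    # the 8 GB machine: True
print(is_big_data_for(12, 20))   # the 20 GB machine: False
```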

 

Hope you have now climbed a little way up the mountain of Big Data!

Big data can also be defined in another way: by whether it satisfies the criteria below.

 

 

Volume  

If the size of the data you are planning to analyse is much bigger than the capacity of your machine, then call it big data.

Velocity

If the rate at which data enters your machine increases exponentially over time, then call it big data.

Variety

The data could be in any format: structured, unstructured or semi-structured, as we have seen previously.

Veracity

The data that we are going to process can contain uncertain or incorrect information.

Value

The data should make sense to the business; that is, we should be able to draw some useful analysis out of it.

 

Now, putting all of the above information together, you can take your first step into learning Big Data.

We will see more insights about this in our next lesson.

Thank you!!


Answered on 21/12/2018 IT Courses/Hadoop IT Courses/Hadoop/Hadoop Testing IT Courses/Big Data

Is Big Data Hadoop good to go for freshers?

LEARN THE NEW

Yes! Big Data is a trending course now and a good choice for a career.

You can learn Big Data on our platform. We provide online training for Big Data Hadoop.

Please click the below link for more details

https://www.learnthenew.com/course/hadoop-online-training/

Thank you. 


Answered on 10/12/2018 IT Courses/Hadoop IT Courses/Hadoop/Hadoop Testing IT Courses/Big Data

What are the best solutions for dependency handling in different processes of ETL in a data lake when getting data from different sources?

Browning B Boniface

Accounting Expert

It can sometimes be difficult to order transformation steps properly and ensure that each step is working properly. Aim for a solution that helps you build out the transformation piece easily (ideally without code) to reduce the time spent building pipelines and to more easily root out any potential hiccups.
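
If you do end up handling the ordering in code, one common approach is to declare which step depends on which and let a topological sort work out a valid order. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+); the step names and their dependencies are hypothetical, invented for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
dependencies = {
    "extract_orders": set(),
    "extract_customers": set(),
    "clean_orders": {"extract_orders"},
    "join_orders_customers": {"clean_orders", "extract_customers"},
    "load_report": {"join_orders_customers"},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())

for step in order:
    print("run", step)
```

If two steps depend on each other, `static_order()` raises `graphlib.CycleError`, which surfaces a broken dependency definition before any step runs.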

 


Answered on 23/12/2018 IT Courses/Hadoop IT Courses/Hadoop/Hadoop Testing IT Courses/Big Data Tuition/BTech Tuition/Big Data Analytics

What is Hive on Spark?

Hemanth Reddy

Hi Priyanka,

Spark has many tools, such as Spark SQL and MLlib.

Spark SQL is the tool that works like Hive within Spark. You just need to open the Spark SQL console with the 'spark-sql' command on a Linux box, where you can run the same SQL. Spark SQL is generally much faster than Hive running on MapReduce, so we usually go with Spark SQL rather than Hive. (Strictly speaking, "Hive on Spark" means running Hive itself with Spark as its execution engine instead of MapReduce.) I think I have answered your question. If you still have any queries, you can reach out to me directly.

 

 

 


Answered on 23/12/2018 IT Courses/Hadoop IT Courses/Hadoop/Hadoop Testing IT Courses/Big Data Tuition/BTech Tuition/Big Data Analytics

How difficult is it to get a job on Big Data/Hadoop?

Sivaram

Nothing is difficult in Hadoop. It's very easy to learn.


Answered on 23/12/2018 IT Courses/Hadoop IT Courses/Hadoop/Hadoop Testing IT Courses/Big Data Tuition/BTech Tuition/Big Data Analytics

Will data flair be a good site to learn big data and Hadoop?

Hemanth Reddy

DataFlair and Acadgild are both good sources for studying Hadoop.

About UrbanPro

UrbanPro.com helps you to connect with the best Big Data Training in India. Post Your Requirement today and get connected.

Overview

Questions 529

Lessons 60


UrbanPro.com is India's largest network of most trusted tutors and institutes. Over 25 lakh students rely on UrbanPro.com to fulfill their learning requirements across 1,000+ categories. Using UrbanPro.com, parents and students can compare multiple Tutors and Institutes and choose the one that best suits their requirements. More than 6.5 lakh verified Tutors and Institutes are helping millions of students every day and growing their tutoring business on UrbanPro.com. Whether you are looking for a tutor to learn mathematics, a German language trainer to brush up your German language skills or an institute to upgrade your IT skills, we have got the best selection of Tutors and Training Institutes for you.