Find the best tutors and institutes for Data Science


Lesson Posted on 02 Jul IT Courses/Data Science

Types of Data

PROTON INSTITUTE OF DATA SCIENCE

We are the pioneers in Online training and real-time project assistance services. We provide project...


The data under our primary consideration consist of a series of observations and measurements made on various subjects, patients, objects or other entities of interest. They might comprise the results of applying a battery of cognitive tests to a sample of patients with Alzheimer's disease, the taxonomic characteristics of bacteria or the relative proportions of several constituents of different types of rock (or food), for example. One particular type of multivariate data set involves the collection of repeated measures of the same characteristics over time. And in a situation that might be termed doubly multivariate, we might indeed have a multidimensional set of features that are assessed at each of several time points.

A typical multivariate data matrix, X, will have the form

X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1p} \\
x_{21} & x_{22} & \cdots & x_{2p} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{np}
\end{pmatrix}
where the typical element, x_{ij}, is the value of the jth variable for the ith individual. If there are several distinct groups of individuals, one of the x_{ij}s might be a categorical variable with values of 1, 2, etc. to distinguish these groups. The number of individuals under investigation is n, and the number of observations taken on each of these n individuals is p. Table 1.1 gives a hypothetical example of such a multivariate data matrix. Here n = 10, p = 7 and, for example, x_{34} = 135.
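To make the notation concrete, here is a minimal R sketch (R being the language used in the lessons further down this page) of an n x p data matrix and the indexing convention just described; the only value filled in is the x_{34} = 135 cited from Table 1.1, everything else is left missing:

n <- 10; p <- 7 # as in Table 1.1: 10 individuals, 7 variables
X <- matrix(NA, nrow = n, ncol = p) # an empty n x p data matrix
X[3, 4] <- 135 # x_34: the value of the 4th variable (IQ) for the 3rd individual
X[3, 4] # extract a single element
dim(X) # returns n and p, i.e. 10 and 7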
In many cases, as in Table 1.1, the variables measured on each of the n individuals will be of different types, depending on whether they are conveying quantitative or merely qualitative information.

Table 1.1 Data matrix for a hypothetical example of 10 individuals

Individual  Gender  Age (yrs)  IQ   Depression  Health     Weight (lbs)
1           Male    21         120  Yes         Very good  150
2           Male    43         NK   No          Very good  160
3           Male    22         135  No          Average    135
4           Male    86         150  No          Very poor  140
5           Male    60         92   Yes         Good       110
6           Female  16         130  Yes         Good       110
7           Female  NK         150  Yes         Very good  120
8           Female  43         NK   Yes         Average    120
9           Female  22         84   No          Average    105
10          Female  80         70   No          Good       100

Note: NK = not known

The most common way of distinguishing these types is the following:
• Nominal - unordered categorical variables. Examples include treatment allocation, the gender of the respondent, hair colour, presence or absence of depression, and so on.

• Ordinal - where there is an ordering but no implication of distance between the different points of the scale. Examples include social class and self-perception of health (each coded from I to V, say), and educational level (no schooling, primary, secondary or tertiary education).

• Interval - where there are equal differences between successive points on the scale, but the position of zero is arbitrary. The classic example is the measurement of temperature using the Celsius or Fahrenheit scales. In some cases a variable such as a measure of depression, anxiety or intelligence might be treated as if it were interval-scaled when this, in fact, might be difficult to justify. We take a practical approach to such problems and frequently treat these variables as interval-scaled measures, but the reader should always question whether this is a sensible thing to do and what implications a wrong decision might have.

• Ratio - the highest level of measurement, where one can investigate the relative magnitude of scores as well as their differences, and where the position of zero is fixed. The classic example is the absolute measurement of temperature (in Kelvin, for example), but other common examples include age (or any other time from a fixed event), weight and length.

The qualitative information in Table 1.1 could have been presented in terms of numerical codes (as often would be the case in a multivariate data set) such that Gender = 1 for males and Gender = 2 for females, for example, or Health = 5 for very good and Health = 1 for very poor, and so on. But it is vital that both the user and the consumer of these data appreciate that the same numerical code (1, say) will convey utterly different information, depending on the scale of measurement.
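To make that warning concrete, here is a minimal R sketch (R is the language used in the lessons further down this page) of one way the Table 1.1 codings might be declared so that the software also knows the scale of measurement. The construction is illustrative only; the five health labels assume the 1-to-5 coding mentioned in the text:

# Gender is nominal: an unordered factor, so the codes 1 and 2 are mere labels
gender <- factor(c(1, 1, 2, 2), levels = c(1, 2), labels = c("Male", "Female"))

# Health is ordinal: an ordered factor, so 1 < 5 carries ranking but not distance
health <- factor(c(5, 3, 1, 4), levels = 1:5, ordered = TRUE,
                 labels = c("Very poor", "Poor", "Average", "Good", "Very good"))

# Age is ratio-scaled: plain numeric, so arithmetic is meaningful
age <- c(21, 43, 22, 86)

mean(age) # sensible for a ratio-scaled variable
summary(gender) # counts per category: the only sensible summary for nominal data
health[1] > health[2] # ordering comparisons are allowed for ordered factors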

A further feature of Table 1.1 is that it contains missing values (NK). Age has not been recorded for individual number 7, and no IQ value is available for individuals 2 and 8. Missing observations arise for a variety of reasons, and it is essential to put some effort into discovering why an observation is missing. One explanation is that the observation might not apply to that individual. In a taxonomic study, for example, in which the investigator might wish to classify dinosaur fossils, 'wing length' might be an essential variable. Dinosaurs without wings will have missing values for this variable! In other cases the measurement might be missing by accident, or because the respondent either forgot or refused to provide the information. Occasionally, one might be able to obtain the information from elsewhere, or to repeat the measurement, and then replace the missing value with useful information.

Missing values can cause problems for many of the methods of analysis described in this text, particularly if there are a lot of them. Although there are many ways of dealing with missing-data problems (both valid and invalid!), these are, in general, beyond the scope of this text. One method with universal applicability, however, is to impute ('estimate') the missing values from a knowledge of the data that are not missing. Such imputation methods range from the very simple (replace the missing value with the mean of the values from subjects with non-missing data, for example) to the technically complex (multiple imputation acknowledging the stochastic nature of the data), and they are briefly described in Appendix B. However, one should always keep in mind that the imputed values are virtual measurements. We do not get something for nothing! And if a substantial proportion of the individuals have large amounts of missing data, one should seriously question whether any form of statistical analysis is worth the bother.
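As a small illustration of the simplest imputation just described, here is an R sketch using the IQ column of Table 1.1, with the NK entries entered as NA. This naive mean replacement is shown only to fix ideas; it is exactly the kind of method whose limitations the paragraph above warns about:

iq <- c(120, NA, 135, 150, 92, 130, 150, NA, 84, 70) # IQ from Table 1.1; NK coded as NA
mean(iq) # returns NA: the missing values propagate
mean(iq, na.rm = TRUE) # mean of the subjects with non-missing data
iq[is.na(iq)] <- mean(iq, na.rm = TRUE) # replace each missing value by that mean
iq # the 'completed' variable now contains two virtual measurements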


Lesson Posted on 02 Jul IT Courses/Data Science

Discrimination, classification and pattern recognition

PROTON INSTITUTE OF DATA SCIENCE

We are the pioneers in Online training and real-time project assistance services. We provide project...


The importance of classification in science has already been remarked upon in Chapter 6, where techniques were described for examining multivariate data for the presence of relatively distinct groups or clusters of observations. In this chapter a further aspect of the classification problem will be discussed, in which the groups are known a priori, and the aim is to devise rules which can allocate previously unclassified objects or individuals into these groups in an optimal fashion. In this situation the investigator has one set of multivariate observations, the training sample, for which group membership is known with certainty a priori, and a second set, the test sample, consisting of observations for which group membership is unknown and which have to be assigned to one of the known groups as accurately as possible. The information used in deriving a suitable allocation rule is the variable values of the training sample. Areas where this type of classification problem is of importance are numerous, and include the following.
• Medical diagnosis. Here the variables describing each patient might be the results of various clinical tests, and the groups could be collections of patients known to have different diseases.
• Archaeology. Here the aim might be to decide from which ancient civilisation a pottery fragment originates, with the variables being particular measurements on the artefacts.
• Speech recognition. Here the objects to be classified are usually waveforms, and the variables are a set of acoustical parameters extracted from the utterance of a specific word by an individual.
An initial question that might be asked is, 'since the members of the training sample can be classified with certainty, why not apply the procedure used for their classification to the test sample?'. Reasons are not difficult to identify. In medicine, for example, it might be possible to diagnose a particular condition with certainty only as a result of a post-mortem examination. Clearly, for patients still alive and in need of treatment, a different diagnostic rule is required.

In statistics, the type of classification question described in this preamble is usually referred to as a discrimination or assignment problem.
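A minimal R sketch of this training-sample/test-sample setup, using linear discriminant analysis from the MASS package (which the R lesson further down this page also loads). The iris data and the even/odd split are purely illustrative stand-ins for the groups discussed above:

library(MASS) # for lda()

train_rows <- seq(1, nrow(iris), by = 2) # training sample: group membership (Species) known
training <- iris[train_rows, ]
test <- iris[-train_rows, ] # test sample: treat group membership as unknown

rule <- lda(Species ~ ., data = training) # derive the allocation rule from the training sample
assigned <- predict(rule, newdata = test)$class # allocate the test observations to groups

table(assigned, test$Species) # cross-tabulate assigned vs. true groups to gauge accuracy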



Answered on 03 Apr IT Courses/Data Science

Sniffer Search


Yes, we have started such a program.

The course is designed to make you an expert in 4 months' time (60 hours of course work + 60 hours of project work):

1) Machine Learning

2) Deep Learning, NLP and speech-to-text, with expert knowledge building in six types of neural network

3) Product Development (building a real-time AI-powered chatbot to automate the recruitment process)


Answered on 28 Feb IT Courses/Data Science

Sujan Edudemy Gowda

Chief Trainer


The easiest way to get started is with simple tools like Excel and regression. This doesn't require a programming language; basic maths and statistics suffice to get a grasp at beginner level. Next, more advanced tools like SPSS and Tableau, linear algebra techniques, and languages like R and Python can be learnt at a more advanced stage. To summarise: start with an analytics-with-Excel course, then move on to data science and engineering, as stated above (a simple regression sketch is given below). Hope this helps.

Feel free to contact for more clarity.
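As a concrete version of the first exercise suggested above, here is a minimal simple-regression sketch in R (R is one of the languages the answer mentions; the built-in cars data set is used purely for illustration):

model <- lm(dist ~ speed, data = cars) # simple linear regression of stopping distance on speed
summary(model) # coefficients and R-squared: the basic output a beginner should learn to read
plot(cars$speed, cars$dist) # bivariate plot of the raw relationship
abline(model) # overlay the fitted regression line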


Lesson Posted on 24 Feb IT Courses/Data Science

Data Science: Case Studies

Reachout Analytics Pvt Ltd

I have 20 years strong experience & Alumni of MBA-IIMC SQC-ISI Building Statistical Predictive models...


Modules, with training and practice hours and the case studies used:

Module 2: Data Visualization and Summarization (Training: 10 hours, Practice: 15 hours)
Case studies: 1. Crime Data; 2. Depression & Anxiety; 3. Customer Demographic Data

Module 3: Data Preparation and Quality Check (Training: 5 hours, Practice: 10 hours)
Case studies: 3. Sales Target Fixing; 4. Hyper Market Data; 5. Customer Attitude; 6. Continuation of Depression & Anxiety

Module 4: Predictive & Estimation Models, Supervised Learning (Training: 15 hours, Practice: 20 hours)
Case studies: 7. Samsung Dubai Sales Modeling; 8. Credit Card Fraud; 9. Cancer Prediction

Module 5: Advanced Big Data Analytics (Training: 10 hours, Practice: 10 hours)
Case studies: 10. Hadoop Data Clustering; 11. Speed Dating Modeling; 12. E-commerce Customer Sentiment Analysis

Module 6: Data Mining, Machine Learning (Training: 5 hours, Practice: 15 hours)
Case studies 6 to 12 above repeat in machine learning tools

Capstone Project (Training: 0 hours, Practice: 30 hours)
Case study: Data to Decision Making

Total: 45 hours training, 100 hours practice



Lesson Posted on 22/12/2017 IT Courses/Data Science IT Courses/Machine Learning IT Courses/R Programming

Decision Tree or Linear Model For Solving A Business Problem

Ashish R.

SAS certified analytics professionals, more than 11 years of industrial and 12 years of teaching experience....


When do we use linear models, and when do we use tree-based classification models? This is a common question in data science job interviews. Here are some points to remember:

We can use either algorithm. It purely depends on the type of business problem we are solving, who the end user of the model is, and how they will consume the model's output. Let's look at some key factors which will help you decide which model to use (a small illustration in R follows the list):

  1. If the relationship between the dependent and independent variables is well approximated by a linear model, linear regression will outperform a tree-based model; no doubt in this aspect. If the relationship is not linear, a tree model is the better choice, as a lot of complicated transformations might otherwise be required on the independent variables to make the relationship linear.
  2. If there is a higher degree of non-linearity between the dependent and independent variables, a tree model will perform better than a linear regression model. How do you check the linearity? Simply create a bivariate plot of the dependent variable against each independent variable and study the plots to determine what kind of relationship exists between Y and the chosen X variable.
  3. Decision tree models do not require much data cleaning (missing-value and outlier treatment), and hence are easy and fast to develop, and easy to explain to our customers as well.
  4. If your business problem demands the possible cause or path to reach the target variable, a tree is easy to explain; whereas for finding the nature of the relationship of the predictor variables with the target variable, linear regression is the better choice.
  5. Decision tree models are even easier to interpret from a layman's point of view.
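As mentioned above, a small illustration in R: the data are simulated with a deliberately non-linear relationship, and the rpart package (which ships with standard R distributions) supplies the tree model. This is a sketch to fix ideas, not a benchmark:

set.seed(1)
x <- runif(200, 0, 10)
y <- sin(x) + rnorm(200, sd = 0.2) # a deliberately non-linear relationship
d <- data.frame(x, y)

plot(d$x, d$y) # point 2: the bivariate plot reveals the non-linearity at once

library(rpart) # recursive-partitioning trees
linear_fit <- lm(y ~ x, data = d)
tree_fit <- rpart(y ~ x, data = d)

mean((d$y - predict(linear_fit))^2) # in-sample MSE of the straight line (poor fit here)
mean((d$y - predict(tree_fit))^2) # in-sample MSE of the tree (captures the curve)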

Lesson Posted on 22/12/2017 IT Courses/Data Science IT Courses/Machine Learning IT Courses/R Programming

Basics Of R Programming 1

Ashish R.

SAS certified analytics professionals, more than 11 years of industrial and 12 years of teaching experience....


# To know the working directory which is assigned by default
getwd()
# set the working directory from where you would like to take the files
setwd("C:/Mywork/MyLearning/MyStuddocs_UrbanPro/Data") # Assign the path as per the location where you want to allocate

getwd()

# to see the list of files in your working directory- just assigned above
dir() ## Lists files in the working directory

# Creating a folder in C drive
dir.create("C:/Mywork/MyLearning/MyStuddocs_UrbanPro/Data/Nov26")


#install.packages("car")
#install.packages("Hmisc")
#install.packages("reshape")
#install.packages('pastecs')
#install.packages('gtools')
#install.packages('gmodels')
#install.packages('caret')
#install.packages('MASS')


##-----------------------------------------------------------
## Load required libraries
##-----------------------------------------------------------
# loading the libraries in each active session is required
# if we want to use the functions in those libraries

library(foreign)
library(MASS) # for stepAIC()
library(Hmisc) # for describe()
library(boot)
library(pastecs) # for stat.desc()
library(gmodels)
library(gtools)
library(lattice)
library(ggplot2)
library(caret)
library(car)
library(reshape)

version # to check what version you are using

# import the world data set (the read.csv call was lost in scraping;
# the file name "World.csv" below is an assumption)
world <- read.csv("World.csv")

dim(world) # check how many rows and columns

View(world) # to View the data frame

trans<-read.csv("TransactionMaster.csv")

View(trans)

cust<-read.csv("CustomerMaster.csv")

View(cust)

dim(cust)

str(cust) # to check the structure/meta data of the data frame

# carbon copy of the file

cust_copy<-cust[,]

#save as a R file

saveRDS(cust_copy,"C:/Mywork/MyLearning/MyStuddocs_UrbanPro/Data/customerdata")

# take a sample of 100 rows and all the columns and create a sample file
# 1:100 stands for 100 rows and after comma blank means all columns to pick up
cust_sample<-cust[1:100,]

dim(cust_sample)


# take all the rows and specific columns from the source file "cust"
# (assignment lost in scraping; the column names below are an assumption)
samplefile <- cust[, c("City", "State")]

# take all rows and specific column numbers 1,8,9
samplefile <- cust[, c(1, 8, 9)]

# do the frequency distribution of the City variable
table(cust$City)

# do a cross-table frequency distribution of the City and State variables
table(cust$State, cust$City)

table(world$deathCat, world$birthCat)


# calculate average value of energy_use_percapita variable from the world
mean(world$energy_use_percapita, na.rm=T)

#calculate median value of gni_per_capita
median(world$gni_per_capita) # 50th percentile


# to check the type of the R objects
class(world)
class(cust)
class(trans)

is.vector(world)
is.factor(world)
is.data.frame(world)
is.matrix(cust)

length(world) # display the number of columns; particularly useful for vectors

head(trans) # display first 6 rows in console

head(trans, n = 2) # Display top 2 rows

tail(trans) # display last 6 rows of a data frame

tail(trans,n=1)

firstfewrows <- head(trans, n = 10) # assignment lost in scraping; a first-rows subset is assumed

View(firstfewrows)


# to store the country names in lower case letters

world$country_name<-tolower(world$country_name)

# dropping the first column from a data frame and create a new one

world_1<-world[,-c(1)]

# filter out the atlanta customers

atlantaCustomers <- cust[which(cust$City == "ATLANTA"), ] # reconstructed from the OR-filter pattern below


# filter out atlanta or hollywood customers : | OR operator & AND operator

atlantaHollyCustomers <-cust[which(cust$City == "ATLANTA" | cust$City == "HOLLYWOOD" ) , ]

## Selecting specific columns
atlantaCustomers1 <- cust[which(cust$City == "ATLANTA"), c(1, 8, 9)] # column numbers assumed; assignment lost in scraping


# filtering out data with multiple conditions

highSales_mod<-trans[which(trans$Sales_Amount >= 100 & trans$Sales_Amount <= 150 ),]


max(highSales_mod$Sales_Amount)

min(highSales_mod$Sales_Amount)

###------------------------------------------------------------
### Basic Date functions in R
###------------------------------------------------------------

Sys.Date() # Current date

today <- Sys.Date() # store the current date (assignment lost in scraping)

class(today)

Sys.time() # Current date and time with time zone
time<-Sys.time()

class(time)

 


Answered on 11/12/2017 IT Courses IT Courses/Data Science

Sniffer Search

Python is leading in popularity and in the number of data science jobs in 2017. BI people have a statistics background, which helps in building a fundamental understanding. The next step is to understand some Python packages such as NumPy, SciPy, pandas and Matplotlib, and to start implementing with Python the machine learning algorithms that are already provided in those packages. Do analysis, data wrangling and exploration; fit a suitable algorithm on the training data and test it for accuracy.


Answered on 30 Jan Tuition/Class IX-X Tuition Tuition/Class IX-X Tuition/Science IT Courses/Data Science

Sujoy D.

Tutor


1. Newton's first law states that an object will remain at rest or in uniform motion in a straight line unless acted upon by an external force. It may be seen as a statement about inertia: objects will remain in their state of motion unless a force acts to change that motion.

2. Newton's second law can be formally stated as follows: the acceleration of an object as produced by a net force is directly proportional to the magnitude of the net force, in the same direction as the net force, and inversely proportional to the mass of the object.

3. Newton's third law states that for every action there is an equal and opposite reaction: in every interaction there is a pair of forces acting on the two interacting objects, and the size of the force on the first object equals the size of the force on the second object.

Newton's laws of motion are three physical laws that, together, laid the foundation for classical mechanics. They describe the relationship between a body and the forces acting upon it, and its motion in response to those forces. They have been expressed in several different ways over nearly three centuries, and can be summarised as follows. First law: in an inertial reference frame, an object either remains at rest or continues to move at a constant velocity, unless acted upon by a net force. Second law: in an inertial reference frame, the vector sum of the forces F on an object is equal to the mass m of that object multiplied by its acceleration a: F = ma. Third law: when one body exerts a force on a second body, the second body simultaneously exerts a force equal in magnitude and opposite in direction on the first body.


About UrbanPro

UrbanPro.com helps you to connect with the best Data Science Classes in India. Post Your Requirement today and get connected.


UrbanPro.com is India's largest network of trusted tutors and institutes. Over 25 lakh students rely on UrbanPro.com to fulfil their learning requirements across 1,000+ categories. Using UrbanPro.com, parents and students can compare multiple tutors and institutes and choose the one that best suits their requirements. More than 6.5 lakh verified tutors and institutes are helping millions of students every day and growing their tutoring business on UrbanPro.com. Whether you are looking for a tutor to learn mathematics, a German language trainer to brush up your language skills, or an institute to upgrade your IT skills, we have the best selection of tutors and training institutes for you.