A 48-hour course in Data Science & Gen AI for engineering and MBA students.
The course is delivered by an industry expert.
Course Curriculum
Module1: Python Basics: Introduces Python, the tool used throughout the course for working with data-2h
- Introduction to Python
- OOP: Object & Class
- Serialization: Pickle Library
- Variables
- Lists
- Tuples
- Dictionary
- Sets
- List and Dictionary Comprehensions
- Conditional Statements (if, if-else, elif)
- Loops (For, While)
- Functions
- Lambda Function
- Apply Function
Class Exercises
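As an illustration of several Module1 topics (comprehensions, lambda functions, and serialization with pickle), here is a minimal sketch. The student names and scores are hypothetical sample data, not course material.

```python
import pickle

# Hypothetical sample data: exam scores for a few students.
scores = {"Asha": 82, "Ravi": 47, "Meera": 91}

# List comprehension: names of students who scored 50 or more.
passed = [name for name, mark in scores.items() if mark >= 50]

# Dictionary comprehension: map each student to a pass/fail label.
labels = {name: ("pass" if mark >= 50 else "fail") for name, mark in scores.items()}

# Lambda function used as a sort key: highest score first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Serialization: convert the dictionary to bytes with pickle and load it back.
restored = pickle.loads(pickle.dumps(scores))

print(passed)
print(labels)
print(ranked)
print(restored == scores)
```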
Module2: Python NumPy Library: Used to perform a wide variety of mathematical operations on arrays-1h
- Array Characteristics
- Array Creation (arange, linspace, flatten)
- Array Indexing (Slicing)
- Array Manipulation
- Reshape
- Concatenate
- Append
- Insert
- Delete
- Transpose
Class Exercises
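The array operations listed in Module2 can be sketched as follows. This is a small illustrative example that assumes NumPy is installed; the values are arbitrary.

```python
import numpy as np

# Array creation
a = np.arange(6)               # integers 0 through 5
b = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced values from 0 to 1 inclusive

# Reshape, transpose, flatten
m = a.reshape(2, 3)            # 2 rows, 3 columns
t = m.T                        # transpose: 3 rows, 2 columns
flat = t.flatten()             # back to one dimension, row by row

# Indexing and slicing: every row, columns 1 onward
sliced = m[:, 1:]

# Manipulation: each call returns a new array and leaves `a` unchanged
joined = np.concatenate([a, np.array([6, 7])])
appended = np.append(a, 99)
inserted = np.insert(a, 1, 42)  # put 42 before index 1
deleted = np.delete(a, 0)       # drop the element at index 0

print(m.shape, t.shape)
print(flat, sliced, inserted, deleted, sep="\n")
```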
Module3: Python pandas Library: Used for data manipulation, data cleaning, and data analysis-2h
- Series
- Data Frames
- Reading CSV files
- Subsetting / Filtering / Slicing Data
- Dropping rows & columns
- Adding/Deleting columns
- Binning
- Renaming columns or rows
- Sorting
- Data type conversions
- Handling duplicates / missing values
- Broadcasting
- Group by Function
- Map Function
- Visualization (bar graph, histogram, box plot)
- Merging (Inner, Left, Right, Outer)
- Exploratory Data Analysis (EDA)
Class Exercises
Module4: Python Matplotlib Library: Data Visualization Part 1-1h
- Bar Plot
- Stacked Bar Plot
- Histogram
- Line Chart
- Box plot
- Pie Chart
Class Exercises
Module5: Python Seaborn Library: Data Visualization Part 2-1h
- Bar Plot
- Histogram
- Pairwise Plots: Joint Plot, Pair Plot
- Categorical Scatter Plot: Strip Plot, Swarm Plot
- Box Plot
- Violin Plot
- Cat Plot
- Facet Grid
- Pair Grid
- Line Plot
Class Exercises
Module6: Basic Statistics: For business analysis-1h
- Types of Data
- Statistics
- Types of Statistics
- Descriptive Statistics
- Mean, Median, Mode (Measures of Central Tendency)
- Standard Deviation, Variance (Measures of Dispersion)
- Normal Distribution
- Standard Normal Distribution
- Standard Error
- Sampling
- Probability
Class Exercises
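The descriptive measures in Module6 can be computed with Python's standard library. The sketch below uses a hypothetical sample of daily sales figures and the sample (n - 1) versions of variance and standard deviation.

```python
import math
import statistics

# Hypothetical sample: units sold per day over eight days.
sales = [12, 15, 15, 18, 20, 22, 15, 27]

# Measures of central tendency
mean = statistics.mean(sales)
median = statistics.median(sales)
mode = statistics.mode(sales)

# Measures of dispersion (sample versions, dividing by n - 1)
variance = statistics.variance(sales)
std_dev = statistics.stdev(sales)

# Standard error of the mean: sample standard deviation / sqrt(n)
std_error = std_dev / math.sqrt(len(sales))

# Standardizing: z-score of each observation relative to the sample
z_scores = [(x - mean) / std_dev for x in sales]

print(mean, median, mode)
print(round(variance, 3), round(std_dev, 3), round(std_error, 3))
```

Standardized values (z-scores) are the link between a normal distribution and the standard normal distribution covered in this module.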
Module7: Advanced Statistics: For business analysis-1h
- Confidence Interval
- T-Test & Z-Test
- P-value
- Hypothesis Testing
- Type I Error & Type II Error
- Chi-Square Test
- ANOVA
- Covariance
- Correlation
Class Exercises
Module8: Supervised Machine Learning: Linear Regression (Used for business problems where a numeric value must be predicted)-2h
- Introduction
- Assumptions (Linearity, Homoscedasticity, Multivariate Normality, etc.)
- Data Preparation (Outlier Treatment, Missing Value Imputation)
- Building Linear Regression Model
- Understanding model metrics (p-value, R-squared / Adjusted R-squared, etc.)
- Multicollinearity (VIF)
- Model Validation (MAPE, RMSE)
- Case study
Module9: Supervised Machine Learning: Logistic Regression (Used for binary classification business problems)-2h
- Introduction
- Linear Regression Vs. Logistic Regression
- Data Preparation (Outlier Treatment, Missing Value Imputation, Dummy Variable Creation)
- Building Logistic Regression Model
- Understanding model metrics (p-value)
- Multicollinearity (VIF)
- Model Validation (Confusion Matrix, ROC curve, AUC, etc.)
- Case study
Module10: Supervised Machine Learning: Decision Trees (Used for multi-class classification business problems & regression business problems)-2h
- Introduction
- Types
- Entropy, Gini Index, Chi-Square
- Overfitting
- Pruning
- Cross-Validation
- Case study
Module11: Supervised Machine Learning: Ensemble (Used for multi-class classification business problems & regression business problems)-2h
- Introduction
- Bagging
- Random Forest
- Boosting
- Gradient Boosting Machines (GBM)
- Case study
Module12: Supervised Machine Learning: KNN (Used for multi-class classification business problems & regression business problems)-2h
- Introduction
- Working of KNN
- Optimal value of K
- Case study
Module13: Unsupervised Machine Learning: Clustering (Used for segmenting data points into different groups)-1h
- Introduction
- K-Means Clustering
- Cluster Evaluation and Profiling
- Case study
Module14: Unsupervised Machine Learning: PCA (Used for dimensionality reduction, i.e. reducing the number of features)-1h
- Introduction
- Curse of dimensionality
- Process of working
- Case study
Module15: Unsupervised Machine Learning: Isolation Forest (Used for Anomaly detection/ Fraud detection)-1h
- Introduction
- Contamination Factor
- Case study
Module16: Time Series Forecasting: Used for inventory planning or forecasting future value-3h
- Introduction
- Time Series Components: Trend, Seasonality, Cyclicity
- Smoothing Techniques: Moving Averages, Exponential Smoothing
- ARIMA
- Accuracy
- NeuralProphet
- Case study
Module17: Text Analytics: Used for text-mining business problems involving unstructured data-3h
- Introduction
- Text Pre-processing
- Noise Removal
- Lemmatization
- Stemming
- Feature Engineering on Text Data
- Bag of words
- TF-IDF
- Case study
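To make the Bag of words and TF-IDF topics concrete, here is a minimal pure-Python sketch. The three short documents are hypothetical, and the TF-IDF formula used (term frequency times log(N / document frequency)) is one common variant; libraries such as scikit-learn apply slightly different smoothing and normalization.

```python
import math
from collections import Counter

# Hypothetical corpus of three short customer comments.
docs = [
    "the product is good",
    "the delivery was late",
    "good product good price",
]

# Minimal pre-processing: lowercase and split on whitespace.
tokenized = [d.lower().split() for d in docs]

# Vocabulary: every distinct word in the corpus, sorted alphabetically.
vocab = sorted({word for doc in tokenized for word in doc})

# Bag of words: one row per document, one count per vocabulary word.
bow = [[Counter(doc)[word] for word in vocab] for doc in tokenized]


def tf_idf(word, doc, corpus):
    """TF-IDF of `word` in `doc`; assumes `word` occurs somewhere in `corpus`."""
    tf = doc.count(word) / len(doc)            # share of the document's tokens
    df = sum(1 for d in corpus if word in d)   # documents containing the word
    return tf * math.log(len(corpus) / df)


print(vocab)
print(bow)
print(round(tf_idf("good", tokenized[2], tokenized), 4))
```

A word that appears in every document gets an IDF of log(1) = 0 under this variant, which is how TF-IDF down-weights terms that do not help distinguish documents.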
Module18: AI: Deep Learning, Keras-4h
- Introduction: Deep Learning
- Deep Learning vs Machine learning
- Neural Networks
- Activation Functions, hidden layers, hidden units
- Backpropagation
- Vanishing Gradient Problem
- Exploding Gradient Problem
- Perceptron & Multi-layer Perceptron
- CNN
- RNN
- Case study
Module19: Model Deployment: Using a trained model to predict outputs for new input values-2h
- Flask
- Case study
Module20: Power BI: Data Visualization-7h
- Introduction
- Connection to Data Sources
- Power Query Editor
- Views: Report, Data, and Relationships
- Data Modelling
- Data Relationship
- DAX
- DAX queries: Calculated Columns & Calculated Measures
- Data Visualization Charts
- Slicers
- Dashboard
- Case study
Module21: Generative AI-6h
- Introduction
- Large Language Models (LLM)-GPT
- Transformer Architecture
- Prompt Engineering
- Configuration
- LangChain Framework
- Use Cases
Module22: Project-1h
- Capstone Project
Course Duration: 48 hours