Talend Data Quality

No Reviews Yet

Course type: Online Instructor led Course

Platform: GoToMeeting

Course ID: 28743

Date and Time

Not decided yet.

Raj

Informatica & Oracle Certified Professional

About Raj

Expert in ETL, Data Integration, Data Quality & Metadata Management

About the Course

This course enables users to evaluate the quality of data in an information system against a set of metrics and thresholds, based on indicators, models and rules defined for each data item to be analysed or monitored. It also covers the tools used to isolate, correct and monitor non-compliant values in a dataset. The course combines the Talend Data Quality Basics training content with material on collaborative work and the correction of data issues, and it explains how to set up automated tasks that fix the most common data errors while routing other errors for manual assessment and correction.
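To make the idea of indicators, thresholds and routing concrete, here is a minimal sketch in plain Python. It is not Talend code and does not use any Talend API; the five-digit pattern, the function names, the sample values and the 0.25 threshold are all assumptions chosen for illustration only.

```python
import re

# Hypothetical pattern rule (an assumption): a value is valid if it is five digits.
ZIP_PATTERN = re.compile(r"\d{5}")


def profile(values):
    """Compute two simple indicators for a column: blank ratio and pattern-match ratio."""
    total = len(values)
    if total == 0:
        return {"blank_ratio": 0.0, "match_ratio": 0.0}
    blanks = sum(1 for v in values if v is None or not v.strip())
    matches = sum(
        1 for v in values if v is not None and ZIP_PATTERN.fullmatch(v.strip())
    )
    return {"blank_ratio": blanks / total, "match_ratio": matches / total}


def exceeds_threshold(indicators, max_blank_ratio=0.25):
    """Return True when the blank ratio is above the (illustrative) threshold."""
    return indicators["blank_ratio"] > max_blank_ratio


def triage(values):
    """Auto-fix values that only need trimming; route everything else for manual review."""
    fixed, manual = [], []
    for v in values:
        cleaned = (v or "").strip()
        if ZIP_PATTERN.fullmatch(cleaned):
            fixed.append(cleaned)   # common, safely automatable correction
        else:
            manual.append(v)        # left for a person to assess and correct
    return fixed, manual


sample = ["75001", " 69002 ", "ABCDE", None, "1234"]
indicators = profile(sample)
fixed, manual = triage(sample)
print(indicators, exceeds_threshold(indicators))
print(fixed, manual)
```

The split between `fixed` and `manual` mirrors the description above: trivial errors are corrected automatically, and everything else is set aside for manual assessment.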

Topics Covered

Module #1 Data Quality (DQ) Basics
Lesson 1 - Creating Connections
Creating a database connection
Creating a delimited file connection

Lesson 2 - Connection and Catalog Analysis
Connection Analysis (database structure overview)
Catalog Analysis (catalog structure overview)

Lesson 3 - Column Analysis
Running a Column Analysis
Adding pattern analysis (regular expressions)
Analyzing other tables
Adding indicator thresholds
Column Set Analysis

Lesson 4 - Table Analysis
Match Analysis
Business Rule Analysis
Adding indicator thresholds
Functional Dependency Analysis

Lesson 5 - Redundancy Analysis
Column Content Comparison (foreign key/primary key)

Lesson 6 - Column Correlation Analysis
Numerical Correlation Analysis
Time Correlation Analysis
Nominal Correlation Analysis

Lesson 7 - Task Management
Create, view, complete and delete tasks

Module #2 Data Quality (DQ) Advanced
Lesson 1 - Starting Talend & Retrieving Schemas
Starting Talend Studio
Retrieve database schemas
DQ Components

Lesson 2 - Identify Invalid Data
Run a Column Analysis to extract invalid data
Generate an ELT Job from analysis results

Lesson 3 - Data Parsing
Parsing data (split single column into component elements)

Lesson 4 - Create a Lookup Table
Creating a lookup table from a Column Analysis

Lesson 5 - Standardize Data
Creating a Job to standardize data
Use two lookup tables and key components (tFuzzyMatch and tMap)

Lesson 6 - Identify Duplicate Records
Match Analysis to identify duplicate records
Export Match Rule
Use Match Rule in a DI Job to identify duplicates

Lesson 7 - Resolve Conflicts
Build a Job that writes duplicates to the Data Stewardship Console (DSC) database as Tasks
Use the DSC to resolve duplicates
Build a Job to update the database (with golden records)

Lesson 8 - Manage your Workspace
Exporting Jobs
Exporting analyses and rules
Importing into a new project

Lesson 9 - Reports
Configure the Data Quality database
Single and multiple reports
Basic troubleshooting and configuration options
Evolution reports

Lesson 10 - Monitoring Data Quality
Generate sample data and reports
Run reports from the Data Quality Portal

Who should attend

Any technology aspirant

Pre-requisites

Basic knowledge of computing, including familiarity with Java or another programming language, and with SQL or general database concepts.

What you need to bring

A good internet connection (2 Mbps) and a machine with at least 4 GB of RAM are required for installation and configuration.

Key Takeaways

Data Quality Expertise

Reviews

No reviews currently

Fee: ₹ 12,000