Talend Training: Managing Data Quality (2022 Edition)

Presentation

Talend Open Studio for Data Quality is one of the leading open source data profiling tools on the market. You will learn how to use this Talend tool effectively to assess the level of data quality in your information system. You will implement analyses, measure data compliance with internal or external standards, and define strategies to clean up erroneous data.

Learning objectives

Upon completion of the training, the participant will be able to:
  • Connect to data sources, produce statistics, identify data to be profiled
  • Choose the different types of indicators and analyses adapted to the data to be monitored
  • Implement complex analyses to verify business rules
  • Define strategies for correcting data errors via Talend Data Integration jobs

Target audience

Business analysts, data integrators, data managers.

Prerequisites

Good knowledge of relational databases and SQL. Basic knowledge of Talend Open Studio for Data Integration.

Price

  • €2,000 excl. tax (HT) per person.

Training program

  • Assessing the quality of data in an information system.
  • Core criteria: completeness, accuracy and integrity of data.
  • Positioning of the Talend Open Studio for Data Quality product in the Talend suite.
Practical work
Install the product and configure its preferences.
 

Fundamental concepts of Talend Open Studio (TOS) for Data Quality

  • Metadata: connections to databases, delimited files and Excel files.
  • Presentation of the different types of analysis.
  • Tools and indicators to assist in conducting analyses.
  • The data explorer.
Practical work
Perform a first column analysis on data from a CSV file and interpret the results.
  • Duplicate detection; compliance with range constraints, date formats, email formats, etc.
  • Metrics of a table, functional dependencies between columns.
  • Identification of value redundancies.
  • Consistency checks between foreign and primary keys.
  • Use indicators, patterns, rules and source files.
Practical work
Run an analysis of each type on a data set that contains some erroneous records.
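
As an illustration only, and outside the Talend tool itself, the kinds of column checks listed above (duplicates, range constraints, date format, email format) can be sketched in a few lines of Python. The column names, the sample rows, the age bounds and the deliberately simplified email pattern are all hypothetical; Talend Open Studio for Data Quality provides such checks through its own indicators and patterns.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical sample rows; a real analysis would read them from a database or a CSV file.
ROWS = [
    {"id": "1", "email": "alice@example.com", "age": "34", "signup": "2022-01-15"},
    {"id": "2", "email": "bob@example", "age": "-5", "signup": "15/01/2022"},
    {"id": "2", "email": "carol@example.com", "age": "51", "signup": "2022-02-30"},
    {"id": "4", "email": "", "age": "29", "signup": "2022-03-01"},
]

# Deliberately simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_iso_date(value):
    """Return True if value is a valid YYYY-MM-DD calendar date."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def profile(rows, min_age=0, max_age=120):
    """Count simple data quality violations, one entry per check."""
    id_counts = Counter(row["id"] for row in rows)
    return {
        "row_count": len(rows),
        # Values of the "id" column that appear more than once.
        "duplicate_ids": sorted(k for k, n in id_counts.items() if n > 1),
        "blank_emails": sum(1 for r in rows if not r["email"].strip()),
        # Non-blank emails that do not match the simplified pattern.
        "invalid_emails": sum(
            1 for r in rows if r["email"].strip() and not EMAIL_RE.match(r["email"])
        ),
        # Ages outside the assumed [min_age, max_age] interval.
        "ages_out_of_range": sum(
            1 for r in rows if not min_age <= int(r["age"]) <= max_age
        ),
        "invalid_dates": sum(1 for r in rows if not is_iso_date(r["signup"])),
    }


if __name__ == "__main__":
    for check, result in profile(ROWS).items():
        print(f"{check}: {result}")
```

Each entry of the returned dictionary plays the role of one simple indicator: a count of rows (or a list of values) that violate a given constraint.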
 

Advanced analysis

  • Analysis of schema and table structure via the data explorer.
  • Multi-table and multi-column analysis, compliance with business rules.
  • Finding and visualizing correlations between columns.
  • Create your own indicators and source files.
  • Manage analyses.
Practical work
Create a complex business rule involving several tables and associate it with a task. Publish the rule on the Talend Forge.
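
To give an idea of what a rule spanning several tables can look like, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Talend. The `customers` and `orders` tables, their contents, and the "amount must be positive" rule are hypothetical examples, not part of the course material.

```python
import sqlite3

# Hypothetical schema and data, used only to illustrate multi-table rules.
SETUP_SQL = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'FR'), (2, 'DE');
INSERT INTO orders VALUES (10, 1, 120.0), (11, 2, -15.0), (12, 3, 40.0);
"""

# Rule 1: every order must reference an existing customer
# (consistency between foreign and primary keys).
ORPHAN_ORDERS_SQL = """
SELECT o.id FROM orders AS o
LEFT JOIN customers AS c ON c.id = o.customer_id
WHERE c.id IS NULL
ORDER BY o.id
"""

# Rule 2 (hypothetical business rule): orders placed by known customers
# must have a strictly positive amount.
NON_POSITIVE_AMOUNTS_SQL = """
SELECT o.id FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE o.amount <= 0
ORDER BY o.id
"""


def find_violations(conn, query):
    """Return the ids of the rows that violate the rule expressed by query."""
    return [row[0] for row in conn.execute(query)]


def run_checks():
    """Build the in-memory sample database and evaluate both rules."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP_SQL)
        return {
            "orphan_orders": find_violations(conn, ORPHAN_ORDERS_SQL),
            "non_positive_amounts": find_violations(conn, NON_POSITIVE_AMOUNTS_SQL),
        }
    finally:
        conn.close()


if __name__ == "__main__":
    print(run_checks())
```

Writing each rule as a query that returns the violating rows makes it easy to count the violations and to export the offending records for correction.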
  • Use context variables.
  • Create patterns based on regular expressions.
  • Export/import analyses and analyzed data.
  • Correct erroneous data with Talend Data Integration.
Practical work
Set up metadata and an analysis using context variables, then export the analyzed data for correction in Talend Data Integration.

Teaching methods

Hands-on training: 70% practice, 30% theory.
Training material is distributed in digital format to all participants.
PCs and access to servers and databases are provided for the exercises.

Evaluation method

Achievement of the objectives is assessed throughout the session through multiple exercises (70% of the time).

Instructor

Our training is delivered by Mohand LARABI, who holds a PhD in computer science and is a Talend expert.

Organization

Classes run from 9:00 am to 12:30 pm and from 2:00 pm to 5:30 pm, for a total of 7 hours per day.

Location and dates of the sessions

26 avenue Perrichont 75016 Paris

May 16 to 18, 2022 (inclusive)

26 avenue Perrichont 75016 Paris

June 20 to 22, 2022 (inclusive)

CUSTOMER REVIEWS

Satisfaction

100%

Attendance

95%

Contact us

125 rue Michel Ange 75016 Paris

+33 (0)1 42 30 77 82