Course intended for:

The training is aimed at developers, architects, and application administrators who want to create or maintain ETL (extract, transform, and load) processes using Pentaho Data Integration (PDI). It is also suitable for people who want to supplement their knowledge of data warehouse (DWH) concepts and their implementation using the Pentaho Business Intelligence Suite.

Course objective:

Course participants will learn how to design, implement, run, monitor, and tune ETL processes. After the training, participants will be able to choose the right set of tools and techniques for their projects. In addition to a general introduction to DWH concepts, the training focuses on the Pentaho Business Intelligence Suite and Pentaho Data Integration (PDI).

Course strengths:

The training program includes both a general introduction to ETL and DWH and an overview of the Pentaho Data Integration product stack. The training is unique because its subject is not fully covered in the literature, and knowledge of ETL and PDI is highly fragmented. The training program is updated continuously to keep pace with the rapid development of ETL solutions.


Participants are required to have basic knowledge of databases and basic programming skills in Java.

Course parameters:

3 x 7 hours of lectures and workshops at a ratio of 1:3. During the workshops, in addition to simple exercises, participants will solve problems by implementing their own ETL processes, modeling DWH data structures, and performing basic administrative tasks. Group size: max. 8-10 people.

Course curriculum

  1. Introduction

    1. Basic data warehouse concepts:

      1. OLTP, OLAP, database, data mart, data warehouse


      3. Normalization, aggregation, facts, dimensions

      4. SQL, MDX, XML/A

      5. ETL

      6. Big Data, Bigtable, NoSQL, non-relational databases and data warehouses

      7. Others

    2. Pentaho BI Suite

  2. ETL

    1. Extraction of data

    2. Transformation, cleaning, and enrichment of data

    3. Loading

    4. Data quality

    5. Staging

    6. Real-time DWH

    7. ETL performance problems

    8. ETL tools

  3. Pentaho Data Integration

    1. Architecture

      1. Kettle

      2. Spoon

      3. Pan

      4. Kitchen

      5. Carte

  4. Working with Spoon

    1. Installation, starting up, look & feel

    2. Variables

    3. Hops

    4. Working with XML files and repositories

    5. Data sharing

  5. Transformations

    1. Working with data sources

      1. Inputs and Outputs

      2. Table input/output

      3. Text file input/output

      4. XML file input/output

      5. Deserialize from/Serialize to

      6. Others

    2. Validation

      1. Data Validator

      2. XSD Validator

      3. Others

    3. Enrichment

      1. Database/Web service/Stream lookup

      2. HTTP/REST client

      3. Combination lookup/update

      4. Dimension lookup/update

      5. Others

    4. Transformation

      1. Transform

      2. Joins

      3. Mapping

      4. Flow

      5. Filter

    5. Optimization

      1. Bulk loading

      2. Statistics

      3. Parallel processing

      4. Partitioning

      5. Clustering

    6. Custom code

      1. Java Expression, Java Class

      2. JavaScript

      3. SQL Script

      4. Regex

    7. Utilities

      1. Syslog

      2. Mail

      3. SSH

      4. Others

    8. Monitoring

    9. Versioning

  6. Jobs

    1. Jobs (kjb) and transformations (ktr)

    2. Complex jobs

    3. Custom code

      1. JavaScript

      2. SQL Script

      3. Shell

    4. Workflows

      1. Conditions

    5. Files

      1. XML

      2. File transfer

      3. File encryption

      4. File management

    6. Monitoring

    7. Versioning

  7. Kitchen and Pan

    1. Running jobs and transformations

    2. Scheduling

    3. Error handling

    4. IO redirection

  8. Carte

    1. Running jobs and transformations remotely

  9. Pentaho Data Integration Marketplace and Pentaho Data Integration Plug-Ins

Any questions?

Phone +48 22 2035600
Fax +48 22 2035601