IBM InfoSphere DataStage Essentials v11.7

This course enables project administrators and ETL developers to acquire the skills necessary to develop parallel jobs in DataStage v11.7. The emphasis is on developers: only administrative functions relevant to DataStage developers are fully discussed. Students will learn to create parallel jobs that access sequential and relational data, and that combine and transform the data using functions and other job components.

Retail Price: $3,260.00

Course Days: 4



Objectives

  • Describe the uses of DataStage, DataStage clients, and the DataStage workflow
  • Describe the two types of parallelism exhibited by DataStage parallel jobs
  • Describe what a deployment domain consists of, the different domain deployment options, and the installation process
  • Create new users and groups
  • Assign Suite roles and Component roles to users and groups
  • Give users DataStage credentials
  • Add a DataStage user on the Permissions tab and specify their role
  • Specify DataStage global and project defaults
  • List and describe important environment variables
  • Navigate the DataStage Designer
  • Import and export DataStage objects
  • Design a parallel job in DataStage Designer
  • Use the Row Generator, Peek, and Annotation stages in the job
  • Compile, run, and monitor a job
  • Create a parameter set and use it in a job
  • Read from and write to sequential files using the Sequential File stage
  • Work with nulls in sequential files
  • Read from multiple sequential files using file patterns
  • Describe parallel processing architecture, pipeline parallelism, and partition parallelism
  • Describe partitioning and collecting algorithms
  • Describe the parallel job compilation process and how to use OSH (Orchestrate Shell Script)
  • Explain the Score
  • Combine data using the Lookup stage
  • Combine data using the Merge, Join, and Funnel stages
  • Sort data using in-stage sorts and the Sort stage
  • Combine data using the Aggregator stage and the Remove Duplicates stage
  • Use the Transformer stage in parallel jobs
  • Define constraints and derivations
  • Create a parameter set and use its parameters in constraints and derivations
  • Perform a simple Find, Advanced Find, and an impact analysis
  • Compare the differences between two table definitions and two jobs
  • Import table definitions for relational tables
  • Use ODBC and Db2 Connector stages in a job
  • Use SQL Builder to define SQL SELECT and INSERT statements
  • Use multiple input links into Connector stages to update multiple tables within a single transaction
  • Use the DataStage job sequencer to build a job that controls a sequence of jobs
  • Use Sequencer links and stages to control the order in which a set of jobs runs
  • Pass information in job parameters from the master controlling job to the controlled jobs
  • Handle errors and exceptions

Audience

This is a basic course for project administrators and ETL developers responsible for data extraction and transformation using DataStage.

Prerequisites

You should have basic knowledge of the Windows operating system and some familiarity with database access techniques.


Course Outline

1. Data Quality Issues
• Listing the common data quality contaminants
• Describing data quality processes

2. QualityStage Overview
• Describing QualityStage architecture
• Describing QualityStage clients and their functions

3. Developing with QualityStage
• Importing metadata
• Building DataStage/QualityStage Jobs
• Running jobs
• Reviewing results

4. Investigate
• Building Investigate jobs
• Using Character Discrete, Concatenate, and Word Investigations to analyze data fields
• Reviewing results

5. Standardize
• Describing the Standardize stage
• Identifying Rule Sets
• Building jobs using the Standardize stage
• Interpreting standardize results
• Investigating unhandled data and patterns

6. Match
• Building a QualityStage job to identify matching records
• Applying multiple Match passes to increase efficiency
• Interpreting and improving Match results

7. Survive
• Building a QualityStage survive job that will consolidate matched records into a single master record

8. Two-Source Match
• Building a QualityStage job to match data using a reference match


