Pre-requisites
There are no prerequisites for this course. Familiarity with database and SQL concepts will be beneficial.
Why learn Big Data For ETL and Data Warehouse?
Data today is growing rapidly and arriving in heterogeneous forms, so there is a growing need for a flexible, adaptable platform. Talend fits well in this space and has a proven track record, which opens up many opportunities. If you understand how to manage, transform, and store your organization's data (retail, banking, airlines, research, insurance, cards, etc.) and present it effectively, then you are the kind of resource organizations are looking for.
1. Role of Open Source ETL Technologies in Big Data
Learning Objectives - In this module, you will get an overview of the products offered by Talend to date and see how they relate to Data Integration and Big Data. You will also cover basic ETL and DWH (data warehouse) concepts, where Talend fits in, and how open source technologies are taking Big Data to the next level. Talend aims to get you productive with Big Data quickly.
Topics - About Talend corporation and their journey, Overviews on: TOS (Talend Open Studio) for Data Integration, TOS for Data Quality, TOS for Master Data Management, TOS for Big Data, ETL concepts, Data warehousing concepts, Quiz session.
2. Talend: A Revolution in Big Data
Learning Objectives - In this module, you will get familiar with the TOS for DI tool and its GUI: where things are and what they do. You will also learn how to install and set up Talend, the most frequently encountered errors and how to fix them, the Talend architecture, and why Hadoop is not a threat to ETL but works hand in hand with it.
Topics - Why Talend, Features, Advantages, Talend Installation/System Requirements, GUI layout (designer), Understanding its Basic Features, Comparison with other market-leading tools in the ETL domain, Important areas in Talend Architecture: Project, Workspace, Job, Metadata, Propagation, Linking components, Hands-on: Creating a simple job and discussing it, Quiz session.
3. Talend: Read & Write Various Types of Source/Target Systems
Learning Objectives - In this module, you will get acquainted with the various types of source and target systems supported by Talend. It includes a demo of the popular CSV/delimited and fixed-width file formats, how to read and write them, how to connect to a database and read, write, and update data, and how to read complex sources such as Excel and XML.
Topics - Data Source Connection, File as Source, Create metadata, Database as Source, Create metadata, Using MySQL database (create tables, insert, update data from Talend), Read and write Excel files, including multiple tabs, View data, How to capture logs and navigate around basic errors, Role of tLogRow and how it makes a developer's life easy, Quiz session, Hands-on assignments.
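To make the difference between the delimited and fixed-width formats mentioned in this module concrete, here is a minimal sketch in plain Python. It is not Talend code and does not reflect how Talend reads files internally; the sample data, column names, and field widths are all hypothetical and chosen only for illustration.

```python
import csv
import io

# Hypothetical sample data: the same two records in both formats.
DELIMITED = "id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"
FIXED_WIDTH = "{:<3}{:<10}{:<5}\n{:<3}{:<10}{:<5}\n".format(
    "1", "Asha", "Pune", "2", "Ravi", "Delhi"
)
# Assumed layout of the fixed-width file: (column name, width in characters).
FIELD_WIDTHS = [("id", 3), ("name", 10), ("city", 5)]


def read_delimited(text):
    """Parse comma-delimited text, taking column names from the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def read_fixed_width(text, widths):
    """Slice each line at fixed offsets and strip the padding spaces."""
    rows = []
    for line in text.splitlines():
        row, start = {}, 0
        for name, width in widths:
            row[name] = line[start:start + width].strip()
            start += width
        rows.append(row)
    return rows


delimited_rows = read_delimited(DELIMITED)
fixed_rows = read_fixed_width(FIXED_WIDTH, FIELD_WIDTHS)
```

A delimited file carries its structure in separators and, usually, a header row, while a fixed-width file needs the column layout supplied separately. That is broadly the kind of information a schema or metadata definition has to capture for each source.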
4. Talend: How to Transform your Business: Basic
Learning Objectives - In this module, you will understand basic to advanced transformation components offered under TOS for DI.
You will also learn:
- How homogeneous/heterogeneous data sources talk with each other
- How to transform data patterns depending on business requirements
Topics - Using advanced components such as tMap, tJoin, tFilter, tSortRow, tAggregateRow, tReplicate, tSplit, Lookup, tRowGenerator, Quiz session, Scenarios and assignments: How to join two sources and get the matching rows from the second source, Rows-to-columns and columns-to-rows transformations, Remove duplicates, Filter based on business requirements.
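The "join two sources" and "remove duplicates" scenarios above can be sketched in plain Python. This is only a conceptual illustration of the logic, assuming hypothetical order and customer records; in Talend these steps are configured graphically with components rather than written by hand.

```python
# Hypothetical main source (orders) and lookup source (customers).
orders = [
    {"order_id": 1, "customer_id": "C1", "amount": 120.0},
    {"order_id": 2, "customer_id": "C2", "amount": 75.5},
    {"order_id": 2, "customer_id": "C2", "amount": 75.5},  # duplicate row
    {"order_id": 3, "customer_id": "C9", "amount": 40.0},  # no matching customer
]
customers = [
    {"customer_id": "C1", "name": "Asha"},
    {"customer_id": "C2", "name": "Ravi"},
]


def remove_duplicates(rows, key):
    """Keep the first row seen for each value of key."""
    seen, unique = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def inner_join(main_rows, lookup_rows, key):
    """Return (matched, rejected): main rows enriched from the lookup, and main rows with no match."""
    lookup = {row[key]: row for row in lookup_rows}
    matched, rejected = [], []
    for row in main_rows:
        match = lookup.get(row[key])
        if match is None:
            rejected.append(row)
        else:
            matched.append({**row, **match})
    return matched, rejected


unique_orders = remove_duplicates(orders, "order_id")
matched, rejected = inner_join(unique_orders, customers, "customer_id")
```

Keeping the unmatched rows in a separate list, rather than dropping them silently, makes it easy to inspect which records failed the lookup.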
5. Talend: How to Transform your Business: Advanced 1
Learning Objectives - In this module, you will learn how to set dependencies between Jobs, set up parameters in a Job, use functions, deploy Jobs from a development to a production environment in real time, and share across platforms with Talend (how to import and export information).
Topics - Trigger (types) and Row Types, Context Variables (parameterization), Functions (basic to advanced functions to implement business rules, such as string, date, and mathematical functions), Accessing job-level and component-level information within the job, Quiz session, Scenarios and assignments: How to search and replace errors in source data (Data Quality and cleansing), Job Trigger or Action (a possible scenario is “kick off a job as soon as a file arrives”).
6. Talend: How to Transform your Business: Advanced 2
Learning Objectives - In this module, you will understand transformation and its various steps: how to program loops in Talend, how to search for files in a directory and process them one by one, and the centralized error handling and debugging mechanism in Talend.
Topics - Type Casting (convert data types between source and target platforms), Looping components (such as tLoop, tForeach), tFileList, tRunJob, How to schedule and run Talend DI jobs externally (outside the GUI), Quiz session, Scenarios and assignments: How to redirect errors in a job to a central error log that can be analysed later, How to create output files dynamically based on a field value in the source, How to read files in a directory (in a loop) and process them one by one.
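The three assignment scenarios above (looping over files in a directory, creating output files based on a field value, and collecting errors centrally) can be sketched together in plain Python. This is a conceptual illustration only, not Talend code; the function name `process_directory`, the file names, and the `region` field are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def process_directory(input_dir, output_dir, split_field):
    """Process every CSV in input_dir, writing one output file per split_field value."""
    errors = []  # central error log: (file name, line number, message)
    for path in sorted(Path(input_dir).glob("*.csv")):
        with path.open(newline="") as handle:
            # The header is line 1, so data rows start at line 2.
            for line_no, row in enumerate(csv.DictReader(handle), start=2):
                value = (row.get(split_field) or "").strip()
                if not value:
                    errors.append((path.name, line_no, f"missing {split_field}"))
                    continue
                # The output file name is derived from the field value.
                target = Path(output_dir) / f"{value}.csv"
                is_new = not target.exists()
                with target.open("a", newline="") as out:
                    writer = csv.DictWriter(out, fieldnames=list(row))
                    if is_new:
                        writer.writeheader()
                    writer.writerow(row)
    return errors


# Small demonstration with hypothetical files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    inbox, outbox = Path(tmp, "in"), Path(tmp, "out")
    inbox.mkdir()
    outbox.mkdir()
    (inbox / "a.csv").write_text("id,region\n1,north\n2,south\n")
    (inbox / "b.csv").write_text("id,region\n3,north\n4,\n")
    errors = process_directory(inbox, outbox, "region")
    created = sorted(p.name for p in outbox.iterdir())
    north_rows = (outbox / "north.csv").read_text().splitlines()
```

Bad rows are recorded with their file name and line number instead of stopping the run, so the remaining files still get processed and the error list can be reviewed afterwards.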
7. Big Data Concepts: Required for Talend for Big Data
Learning Objectives - In this module, you will cover the Hadoop background needed to be comfortable while learning Talend for Big Data: Hadoop basics, HDFS (Hadoop Distributed File System) architecture overview, MapReduce concept overview, Industry standards.
Topics - How modules 1 to 6 help in understanding and performing hands-on work with Talend for Big Data, How Talend makes Big Data easier to learn and use, Quiz session.
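As a rough illustration of the MapReduce concept covered in this module, the classic word-count example can be sketched in plain Python. This runs in a single process and only mimics the map, shuffle, and reduce steps; it does not use Hadoop, and the input lines are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


# Hypothetical input; in Hadoop each line could come from a different block of a file in HDFS.
lines = ["big data with Talend", "Talend and Hadoop", "big data"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

Because each line is mapped independently and each key is reduced independently, both phases can in principle be spread across many machines, which is the property the MapReduce model is built around.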
8. Introduction to Talend for Big Data
Learning Objectives - In this module, you will learn: What TOS for BD (Talend Open Studio for Big Data) is, How to set up a Big Data environment on your machine, Big Data connectors in TOS for BD (Talend offers some 800+ connectors for Big Data environments), How to access HDFS from Talend.
Topics - Big Data setup using Hortonworks Sandbox on your personal computer, Explaining the TOS for Big Data Environment, Quiz session, Scenarios and assignments: Basic HDFS commands and exploring the Sandbox, How to check connectivity to HDFS from Talend, How to read from HDFS in a Talend job, How to write into HDFS from a Talend job.
9. Hive in Talend for Big Data
Learning Objectives - In this module, you will learn: What Hive is and its concepts, How to set up a Hive environment in Talend, Hive Big Data connectors in TOS for BD, and Use cases using Hive in Talend.
Topics - How to create and access Hive tables in Talend, Process and transform data from Hive, Access data from Hive, transform it and interact with MySQL tables, Quiz session, Scenarios and assignments: Hive connectors, Use cases using Hive in Talend.
10. Pig in Talend for Big Data and Project
Learning Objectives - In this module, you will learn: What Pig is and its concepts, How to set up a Pig environment in Talend, Pig Big Data connectors in TOS for BD, Use cases using Pig in Talend, Project Implementation, Conclusion.
Topics - Quiz session, Scenarios and assignments: Using Pig connectors, Setup, Use case using Pig scripting via Talend. Business requirements: Source/Target/Mapping will be provided and explained, Quiz session and Discussion.