Data integration is the combination of technical and business processes used to combine data from disparate sources into meaningful and valuable information. Some systems store data in a denormalized form, and data integration tools are able to handle it. This blog post uses Talend to showcase handling a simple denormalized data file.

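For example, a system stores state data with the following schema: [field1];[[field2.1],[field2.2]]. Here the schema maps to [StateID];[[StateName],[PostCode]]: a semicolon separates the first field from a nested group, and a comma separates the fields inside that group. Here is the sample file ‘states.csv’.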

StateID;StateName,PostCode
1;Alabama,009234
2;Alaska,009235
3;Arizona,009236
4;Arkansas,009237
5;California,009244
6;Colorado,009245
7;Connecticut,009214
8;Delaware,009278
9;Florida,0092897
10;Georgia,009247
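
To make the two-level layout concrete, here is a minimal Java sketch of how one line of this file breaks down into its three fields. It is only an illustration outside Talend, not the code Talend generates; the class name `StateLineSplitter` and the method `splitLine` are hypothetical names chosen for this example.

```java
import java.util.Arrays;

public class StateLineSplitter {

    // Splits one line of the form StateID;StateName,PostCode into three fields.
    // (Hypothetical helper for illustration; not part of Talend.)
    public static String[] splitLine(String line) {
        // First level: ';' separates StateID from the nested group.
        String[] outer = line.split(";", 2);
        if (outer.length != 2) {
            throw new IllegalArgumentException("Missing ';' in line: " + line);
        }
        // Second level: ',' separates StateName from PostCode.
        String[] inner = outer[1].split(",", 2);
        if (inner.length != 2) {
            throw new IllegalArgumentException("Missing ',' in line: " + line);
        }
        // PostCode stays a String so that leading zeros are preserved.
        return new String[] { outer[0].trim(), inner[0].trim(), inner[1].trim() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLine("1;Alabama,009234")));
        // prints [1, Alabama, 009234]
    }
}
```

Keeping PostCode as text matters here: parsing it as a number would turn 009234 into 9234.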

Start Development in Talend Studio

  1. Drop the following components from the Palette onto the design workspace: tFileInputFullRow, tExtractDelimitedFields, and tLogRow.
  2. Connect them using the Row Main links.

image


Configuring the components

1. Double-click the tFileInputFullRow component to open its Basic settings view. Add the file path and skip the header line.

image

Update the schema as below.

image

2. Double-click the tExtractDelimitedFields component to open its Basic settings view. Edit the schema.

image

3. Double-click the tLogRow component to open its Basic settings view. Edit the schema.

image
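
For readers who want to see the logic of the three components in one place, the following is a rough Java sketch of what the job does conceptually: take full rows, skip the header, extract the delimited fields, and log each row. It is an assumption-laden illustration, not Talend's generated code; the class `StatesJobSketch`, the method `extract`, and the pipe-separated output format are all choices made for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StatesJobSketch {

    // Skips the header lines and splits every remaining row into
    // StateID, StateName and PostCode. (Hypothetical helper for illustration.)
    public static List<String[]> extract(List<String> lines, int headerLines) {
        List<String[]> rows = new ArrayList<>();
        for (int i = headerLines; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            // Split on either delimiter (';' or ','), at most three fields.
            String[] fields = line.split("[;,]", 3);
            if (fields.length != 3) {
                throw new IllegalArgumentException("Malformed row: " + line);
            }
            rows.add(fields);
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Read whole rows from the file given as the first argument;
        // fall back to two in-memory sample rows when no path is given.
        List<String> lines = args.length > 0
                ? Files.readAllLines(Paths.get(args[0]))
                : Arrays.asList("StateID;StateName,PostCode",
                                "1;Alabama,009234",
                                "2;Alaska,009235");

        // Log each extracted row as a pipe-separated line.
        for (String[] row : extract(lines, 1)) {
            System.out.println(String.join("|", row));
        }
    }
}
```

Run without arguments, this prints `1|Alabama|009234` and `2|Alaska|009235`; passing the path to states.csv as the first argument processes the whole file instead.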


Running

1. Save the job and press ‘F6’ to run it.

image
