Merge Two Data in ETL project of SSIS

Before reading this tutorial, please go through the first article on creating an ETL project in the Business Intelligence (BI) tools of Visual Studio for SSIS.

For this tutorial we need two data sources. I am using one flat file and one Excel file so that new users can see that extraction can be done from different types of data sources.

The text file contains the columns “CustomerId, Subscription Start Date, Subscription End Date” and the Excel file contains the columns “CustomerId, Segment, CustomerIdSpace”.

Observe that both data sources contain one common column named “CustomerId”. For now, ignore the column “CustomerIdSpace” in the Excel data source.
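
To make the two inputs concrete, here is a small Python sketch of what the rows might look like. The sample values are made up for illustration; only the column names come from the files described above.

```python
# Hypothetical sample rows; only the column names come from the tutorial.
flat_file_rows = [
    {"CustomerId": "C003", "Subscription Start Date": "2009-03-01",
     "Subscription End Date": "2010-03-01"},
    {"CustomerId": "C001", "Subscription Start Date": "2009-01-15",
     "Subscription End Date": "2010-01-15"},
]

excel_rows = [
    {"CustomerId": "C001", "Segment": "Retail", "CustomerIdSpace": " C001 "},
    {"CustomerId": "C003", "Segment": "Corporate", "CustomerIdSpace": "C003  "},
]

# The columns that appear in both sources are the candidates for the join key.
common_columns = set(flat_file_rows[0]) & set(excel_rows[0])
print(common_columns)
```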

Create a new Business Intelligence project in Visual Studio 2005 and drag a “Data Flow Task” from the toolbox onto the package.

Double click on the “Data Flow Task” and a new tab will open. Drag a “Flat File Source” and an “Excel Source” from the toolbox.

Double click on the “Flat File Source” and the “Excel Source” and create a new connection for the text file and the Excel file respectively.

Drag two “Sort” controls from the “Data Flow Transformations” section of the toolbox and place one below each source.

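Double click on each “Sort” control and select the column “CustomerId” as the sort column. The “Merge Join” used in the next step requires both of its inputs to be sorted on the join column.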
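
As a rough analogy in Python (not SSIS code), sorting each input on “CustomerId” looks like the sketch below. The sample IDs are hypothetical.

```python
from operator import itemgetter

# Hypothetical unsorted rows from one of the sources.
rows = [
    {"CustomerId": "C003"},
    {"CustomerId": "C001"},
    {"CustomerId": "C002"},
]

# Sort ascending on the join column, as the Sort control does for each input.
sorted_rows = sorted(rows, key=itemgetter("CustomerId"))
sorted_ids = [row["CustomerId"] for row in sorted_rows]
print(sorted_ids)
```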

Now drag the “Merge Join” control from the toolbox and connect the green arrow from each “Sort” control to the “Merge Join” control. Select the type of join (in this case we have selected “Inner join”) and also select the columns that should be included in the output.
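
The idea behind an inner join over two sorted inputs can be sketched in Python as follows. This is only an illustration of the general sort-merge technique, not the actual SSIS implementation; the function name merge_join_inner and the sample rows are assumptions made for this example.

```python
def merge_join_inner(left, right, key):
    """Inner-join two lists of dicts that are already sorted by `key`."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][key], right[j][key]
        if left_key < right_key:
            i += 1          # left row has no partner yet, advance left
        elif left_key > right_key:
            j += 1          # right row has no partner yet, advance right
        else:
            # Find the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == left_key:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][key] == left_key:
                for right_row in right[j:j_end]:
                    result.append({**left[i], **right_row})
                i += 1
            j = j_end
    return result


# Hypothetical rows, already sorted on CustomerId.
subscriptions = [
    {"CustomerId": "C001", "Subscription Start Date": "2009-01-15"},
    {"CustomerId": "C002", "Subscription Start Date": "2009-02-01"},
    {"CustomerId": "C003", "Subscription Start Date": "2009-03-01"},
]
segments = [
    {"CustomerId": "C002", "Segment": "Retail"},
    {"CustomerId": "C003", "Segment": "Corporate"},
    {"CustomerId": "C004", "Segment": "Retail"},
]

joined = merge_join_inner(subscriptions, segments, "CustomerId")
for row in joined:
    print(row)
```

Only customers present in both inputs survive, which is the behaviour expected from an inner join. Because both inputs are sorted, each list is scanned once from top to bottom.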

Here we are not going to write the result to a file. Instead, we will add a “Derived Column” control after the Merge Join and attach a “Data Viewer”, as discussed in the previous article, to view the output.

The final snapshot of the ETL package looks like this:

Merge Two Data in ETL project of SSIS

Merge data from different sources in which the common column is not well formatted:

In the above example, we assumed that “CustomerId” has the same values in both sources. But what happens if the values do not match exactly and need some modification? For example, one source may have extra spaces in the values of the column “CustomerId”.
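
The problem is easy to reproduce in Python, where string comparison is also sensitive to leading and trailing spaces. The IDs below are hypothetical.

```python
# Hypothetical keys: the second list carries stray spaces.
flat_ids = ["C001", "C002", "C003"]
excel_ids_with_spaces = ["C002 ", " C003", "C004 "]

# An exact comparison finds no matching keys at all.
matches = [cid for cid in flat_ids if cid in excel_ids_with_spaces]
print(matches)
```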

The Excel file has a column named “CustomerIdSpace”, which I asked you to ignore in the previous section.

To handle this situation, we need to clean up the inconsistent column before sorting the data. For this we need the “Derived Column” control from the toolbox.

Derived Column Transformation - Trim Column Values

As you can see in the formula editor of the Derived Column, a new column named “Removed Id” is added, and its value is calculated by the expression TRIM(CustomerIdSpace).
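
A rough Python equivalent of this step is sketched below: add a trimmed copy of the column, then sort on the new column so it can serve as the join key. The sample rows are hypothetical, and strip(" ") is used here as a stand-in for the TRIM expression since it removes leading and trailing spaces.

```python
# Hypothetical Excel rows whose IDs carry stray spaces.
excel_rows = [
    {"Segment": "Retail", "CustomerIdSpace": " C003 "},
    {"Segment": "Corporate", "CustomerIdSpace": "C002  "},
]

# Derived column: keep the original value and add a trimmed copy.
for row in excel_rows:
    row["Removed Id"] = row["CustomerIdSpace"].strip(" ")

# Sort on the cleaned column before joining.
excel_sorted = sorted(excel_rows, key=lambda row: row["Removed Id"])
cleaned_ids = [row["Removed Id"] for row in excel_sorted]
print(cleaned_ids)
```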

The final snapshot of the package is shown in the image below:

Merge Two Inconsistent Data in ETL project of SSIS

Download source code for Merge Two Data in ETL project of SSIS


Comments

2 responses to “Merge Two Data in ETL project of SSIS”

  1. ashish

    Hi,
    I’m taking two values from one Excel file because of the way the Excel file is formatted.
    I now have two Excel sources: one gives me the date and the second gives me the price. I merge them using Merge, but I want to insert those two values into one table. I took both values into a Derived Column, but it still performs two insertions, first the date and then the price separately. I want both values in one row only.

  2. Vennila

    Superb
