Pages

Thursday, November 14, 2013

SSIS Synchronous and Asynchronous Transformations (Non-blocking, semi-blocking and fully blocking transformations)


Synchronous and Asynchronous Transformations

In this post we will understand the Synchronous and Asynchronous Transformations which are also categorized into Non-blocking, semi-blocking and fully blocking transformation.

In SSIS transformations are classified into Synchronous and Asynchronous depending on the way they work with data processing. To understand this in simpler way, transformations which process the incoming rows without depending on other input rows are considered as synchronous transformations which actually means the number of output rows is equal to number of input rows and also output of the transformation is in synchronous with input.

For example the Data Conversion transformation acts on each and every single input row, performs the desired conversion and passes the row to next step in data flow.

Transformations which are not synchronous are Asynchronous (if not yes then it is noJ).  To understand this in simpler way, transformations which cannot process incoming rows independently are considered as asynchronous transformations which actually means the number of output rows might not be equal to number of input rows or output is not in synchronous with input.

For example the Sort transformation will wait for all the rows to be feed into it before it passes the rows to the next step in data flow, also in this case number of rows of input will be equal to output provided “remove rows with duplicate sort values” option is not checked, if this option is checked then number of output rows might be less than or equal to input rows based on the duplicates. This behavior also implies that Sort transformation blocks the existing processing till all rows are completely processed.

Another example for asynchronous transformation is Merge Join transformation which receives data from 2 sorted sources/transformations and will hold the rows till one set of matching rows from both sources/transformations are processed then it passes the rows to next step in data flow whilst continuing the processing of data from sources/transformations. This behavior implies that Merge Join transformation partially blocks the existing processing.

Let’s see what ever we have understood above is practically correct or not. In the below screen shot there are 3 independent data paths with Data Conversion, Sort and merge join transformations respectively.


By above theory we understood that Synchronous transformation processes the rows in synchronous manner which implies the row feed into transformation is processed and passed to the next step at the same time (Actually speaking we have to refer to buffer of rows rather than single row, that is the reason we see always set(one full buffer) of rows getting processed, we will see about data buffers  in detail in another post.), so the synchronous transformations are also known as non-blocking transformations. 

The left side box in the above screen shot shows us the synchronous transformation where we can obviously note that number of input rows is exactly equal to number of output rows
The right side box has asynchronous transformations which further bifurcated into partially/semi-blocking transformations and fully blocking/blocking transformations.
·         From the above theory for sort transformation we understand that it do not pass any rows to next step in data flow till it process all the rows. There are 121,317 rows in the source table so this transformation waits till all the rows from the source are feed into it and then it will pass the processed rows to next step in the data flow  We can obviously see that in the below screen shot.

·         Now let us see the behavior of merge join transformation, as we understand from above theory that this transformation do not block the processing completely and passes the processed rows to next step in the data flow. From the above screen shot we can see that a set of 9972 rows from one source and set of 19944 rows from other set are feed into Merge join transformation and we can also see that a set on 9972 rows are passed to next step in data flow as there are processed hence called semi-blocking.



With your understanding of the concept try to find categorize the SSIS transformations into Non-blocking, semi-blocking and fully blocking transformation.

Request you to provide your comment and feedback.


Singing off
Yours RBK

No comments:

Post a Comment