About this task
         Defining a blocking key is optional but strongly recommended. A blocking key
            partitions the data into blocks, which reduces the number of record pairs to examine
            because comparisons are restricted to pairs within each block. Blocking columns are
            especially useful when you process a large dataset.
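
To make the saving concrete, the following minimal sketch counts the record pairs to compare with and without blocking. It is an illustration only, not the product's implementation; the function names and the sample keys are hypothetical.

```python
from collections import Counter


def pair_count(n: int) -> int:
    """Number of unordered record pairs among n records."""
    return n * (n - 1) // 2


def comparisons_with_blocking(block_keys) -> int:
    """Sum the pairs inside each block; pairs across blocks are never compared."""
    sizes = Counter(block_keys)
    return sum(pair_count(size) for size in sizes.values())


# Hypothetical blocking keys for eight records (illustrative data only).
keys = ["F", "F", "F", "G", "G", "U", "U", "U"]

total = pair_count(len(keys))              # every record against every other record
blocked = comparisons_with_blocking(keys)  # only pairs that share a blocking key

print(f"without blocking: {total} pairs, with blocking: {blocked} pairs")
```

With these eight records, blocking reduces 28 candidate pairs to 7. The gap widens quickly as the dataset grows, because the number of pairs grows quadratically with the number of records in a block.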
       
      - 
            In the Data section, click the Select Blocking
                  Key tab.
         
 
- 
            Click the names of the columns you want to use to partition the processed data into
               blocks.
            
Blocking keys that have the exact names of the selected columns are listed in the
                  Blocking Key table.
            
            You can define more than one column in the table, but only one blocking key will
               be generated and listed in the BLOCK_KEY column in the
                  Data table.
            For example, if you use an algorithm on the country and
                  lname columns to process records that have the same first
               character, records that share the same first letter in both the country and the
               last name are grouped together in the same block. Comparisons are restricted to
               records within each block.
            To remove a column from the Blocking Key table, right-click
               it and select Delete, or click its name in the
                  Data table.
          
- 
            Select an algorithm for the blocking key, and set the other parameters in
                        the Blocking Key table as needed.
            
In this example, only one blocking key is used. The first character of
                        each word in the country column is retrieved and listed
                        in the BLOCK_KEY column.
          
- 
            Click Chart to compute the generated key,
                        group the sample records in the Data table
                        and display the results in a chart.
            
This chart shows statistics on the number of blocks, so that you can adjust
                        the blocking parameters until the results match what you want to
                        get.
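
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation. It assumes that the key parts from several columns are simply concatenated into one key, and that a tally of block sizes is the kind of statistic worth checking. The function names and the sample records are hypothetical.

```python
from collections import Counter, defaultdict


def first_char_of_each_word(value: str) -> str:
    """Keep the first character of each whitespace-separated word, in upper case."""
    return "".join(word[0].upper() for word in value.split())


def block_key(record: dict, columns: list) -> str:
    """Build ONE key from several columns (assumption: parts are concatenated)."""
    return "".join(first_char_of_each_word(record[col]) for col in columns)


# Hypothetical sample records (illustrative data only).
records = [
    {"country": "France", "lname": "Martin"},
    {"country": "France", "lname": "Moreau"},
    {"country": "United States", "lname": "Smith"},
    {"country": "Finland", "lname": "Makinen"},
    {"country": "Germany", "lname": "Schmidt"},
]

# Group the records by their generated key.
blocks = defaultdict(list)
for rec in records:
    blocks[block_key(rec, ["country", "lname"])].append(rec)

# Number of records in each block, then number of blocks of each size.
block_sizes = {key: len(members) for key, members in blocks.items()}
size_distribution = Counter(block_sizes.values())

print(block_sizes)
print(dict(size_distribution))
```

In this sample, "France" and "Finland" both reduce to "F", so the record from Finland joins the two French records whose last names start with "M". A few very large blocks in such a tally suggest the key is too coarse, while many single-record blocks suggest it is too strict for matching records to meet.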