Defining the survivor validation flow
About this task
Having configured and grouped the input data, you need to create the survivor validation flow using tRuleSurvivorship. To do this, proceed as follows:
Procedure
- Double-click tRuleSurvivorship to open its Component view.
- Select GID for the Group identifier field and GRP_SIZE for the Group size field.
- In the Rule package name field, enter the name of the rule package you need to create to define the survivor validation flow of interest. In this example, this name is org.talend.survivorship.sample.
- In the Rule table, click the plus button to add as many rows as required and complete them using the corresponding rule definitions. In this example, add ten rows and complete them using the table below:
| Order | Rule name | Reference column | Function | Value | Target column |
|---|---|---|---|---|---|
| Sequential | "1_LengthAcct" | acctName | Expression | ".length >11" | acctName |
| Sequential | "2_LongestAddr" | addr | Longest | n/a | addr |
| Sequential | "3_HighCredibility" | credibility | Expression | "> 3" | credibility |
| Sequential | "4_MostCommonCity" | city | Most common | n/a | city |
| Sequential | "5_MostCommonZip" | zip | Most common | n/a | zip |
| Multi-condition | n/a | zip | Match regex | "\\d{5}" | n/a |
| Multi-target | n/a | n/a | n/a | n/a | state |
| Multi-target | n/a | n/a | n/a | n/a | country |
| Sequential | "6_LatestPhone" | date | Most recent | n/a | phone |
| Multi-target | n/a | n/a | n/a | n/a | date |
Do not use special characters in rule names, otherwise the Job may not run correctly.
These rules are executed in top-down order. The Multi-condition rule is an additional condition of the 5_MostCommonZip rule, so a rule-compliant zip code must be the most common zip code and must also have five digits. The zip column is the target column of the 5_MostCommonZip rule, and the two Multi-target rules below it add two more target columns, state and country, so the zip, state and country columns are the source of the best-of-breed data. Thus, once a zip code is validated, the field values of the corresponding record are selected from these three columns.
The same is true of the Sequential rule 6_LatestPhone. Once a date value is validated, the field values of the corresponding record are selected from the phone and date columns.
Note: In this table, the fields reading n/a are not available for the Order types or Function types you have selected. In the Rule table of the Basic settings view of tRuleSurvivorship, these unavailable fields are greyed out. For further information about this rule table, see the properties table at the beginning of this tRuleSurvivorship section.
- Next to Generate rules and survivorship flow, click the icon to generate the rule package with the contents you have defined.
Once done, you can find the generated rule package in the Metadata > Rules Management > Survivorship Rules directory of Talend Studio Repository. From there, you are able to open the newly created survivor validation flow of this example and read its diagram. For further information, see Managing a survivorship package.
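
To make the combined effect of the Multi-condition and Multi-target rows in the rule table more concrete, the following Python sketch mimics the selection logic described for 5_MostCommonZip and 6_LatestPhone. It is only an illustration and not the code that tRuleSurvivorship generates: the sample records, the function names, the choice of the first matching record when several share the winning zip code, and the full-string regex match are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical records that belong to one group (same GID); values are invented.
group = [
    {"zip": "94025", "state": "CA", "country": "US",
     "phone": "555-0100", "date": "2011-03-01"},
    {"zip": "94025", "state": "CA", "country": "USA",
     "phone": "555-0101", "date": "2012-07-15"},
    {"zip": "9402", "state": "California", "country": "US",
     "phone": "555-0102", "date": "2010-01-20"},
]


def most_common_zip_survivor(records):
    """Sketch of 5_MostCommonZip with its Multi-condition and Multi-target rows.

    A zip code is validated only if it is the most common value in the group
    AND consists of five digits. The state and country values are then taken
    from the same record as the validated zip code.
    """
    counts = Counter(r["zip"] for r in records)
    top = max(counts.values())
    for r in records:
        # Assumption: the regex must match the whole value, and the first
        # record carrying a validated zip code supplies the target columns.
        if counts[r["zip"]] == top and re.fullmatch(r"\d{5}", r["zip"]):
            return {k: r[k] for k in ("zip", "state", "country")}
    return None  # no value satisfied both conditions


def latest_phone_survivor(records):
    """Sketch of 6_LatestPhone: the most recent date selects phone and date."""
    # ISO-formatted date strings sort chronologically as plain strings.
    latest = max(records, key=lambda r: r["date"])
    return {k: latest[k] for k in ("phone", "date")}


if __name__ == "__main__":
    print(most_common_zip_survivor(group))
    print(latest_phone_survivor(group))
```

In this sketch, a group whose most common zip code does not have five digits yields no zip survivor at all, which reflects the requirement that both conditions hold for the same value.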