| 
                         
                            Storage
                         
                     | 
               
                         To connect to an HDFS installation, select the Define a storage configuration component
                            check box and then select the name of the component to use from those
                            available in the drop-down list. 
                        This option requires you to have previously configured the
                            connection to the HDFS installation to be used, as described in the
                                documentation
                            for the tHDFSConfiguration
                            component. 
                        If you leave the Define a
                                storage configuration component check box unselected,
                            you can only convert files locally. 
                     | 
            
               | 
                         
                            Configure Component
                         
                     | 
               
                         To configure the component, click the [...] button and, in the Component Configuration window, perform
                            the following actions. 
                        
                     - 
                                
Click the Select button next to the Record Map field and then, in
                                    the Select a Map dialog box
                                    that opens, select the map you want to use and then click
                                        OK. 
                                This map must have been previously created in 
                                        Talend Data Mapper
                                    . 
                                Note that the input and output representations
                                    are those defined in the map, and cannot be changed in the
                                    component. 
                             
                     - 
                                
Click Next. 
                             
                     - 
                                
Tell the component where each new record begins.
                                    To do so, you need to fully
                                    understand the structure of your data. 
                                Exactly how you do this depends on the
                                    input representation being used, and you will be presented with
                                    one of the following options. 
                                
                           - 
                                        
Select an appropriate
                                            record delimiter for your data. Note that you must
                                            specify this value without quotes.  
                                    - 
                                                  
                                                  Separator lets you specify a
                                                  separator indicator, such as \n, to identify a new
                                                  line. 
                                                  Supported
                                                  indicators are \n for a Unix-type new line, \r\n for
                                                  Windows, \r for Mac, and \t for tab characters. 
                                                 
                                    - 
                                                  
                                                  Start/End with lets you specify the
                                                  initial characters that indicate a new record,
                                                  such as <root, or the characters that indicate
                                                  where a record ends. This can also be a regular
                                                  expression. 
                                                  
                                                  Start
                                                  with also supports new-line indicators (\n for a
                                                  Unix-type new line, \r\n for Windows, and \r for Mac)
                                                  and \t for
                                                  tab characters. 
                                                 
                                    - 
                                                  Sample
                                                  File: To test the signature with a
                                                  sample file, click the [...] button, browse to the file you
                                                  want to use as a sample, click Open, and then click
                                                  Run to test
                                                  your sample.
                                                  Testing the
                                                  signature lets you check that the total number of
                                                  records and their minimum and maximum lengths
                                                  correspond to what you expect based on your
                                                  knowledge of the data. This step assumes you have
                                                  a local subset of your data to use as a
                                                  sample. 
                                                 
                                  
                                 
                                         
                                     
                           - 
                                        
If your input representation is COBOL or
                                            Flat with positional and/or binary encoding properties,
                                            define the signature for the input record structure: 
                                    - 
                                                  Input Record root
                                                  corresponds to the root element in your input
                                                  record.
 
                                    - 
                                                  
                                                  Minimum Record
                                                  Size corresponds to the size in bytes
                                                  of the smallest record. If you set this value too
                                                  low, you may encounter performance issues, since
                                                  the component will perform more checks than
                                                  necessary when looking for a new record. 
                                                 
                                    - 
                                                  
                                                  Maximum Record
                                                  Size corresponds to the size in bytes
                                                  of the largest record, and is used to determine
                                                  how much memory is allocated to read the
                                                  input. 
                                                 
                                    - 
                                                  Sample from Workspace or
                                                  Sample from File System: To
                                                  test the signature with a sample file, click the
                                                  [...]
                                                  button, and then browse to the file you want to
                                                  use as a sample.
                                                  Testing the signature lets you
                                                  check that the total number of records and their
                                                  minimum and maximum lengths correspond to what you
                                                  expect based on your knowledge of the data. This
                                                  step assumes you have a local subset of your data
                                                  to use as a sample. 
                                                 
                                    - 
                                                  
                                                  Footer Size
                                                  corresponds to the size in bytes of the footer, if
                                                  any. At runtime, the footer will be ignored rather
                                                  than being mistakenly included in the last record.
                                                  Leave this field empty if there is no footer. 
                                                 
                                    - 
                                                  
Click the Next button to open
                                                  the Signature
                                                  Parameters window, select the fields
                                                  that define the signature of your record input
                                                  structure (that is, to identify where a new record
                                                  begins), update the Operation and Value columns as
                                                  appropriate, and then click Next. 
                                                 
                                    - 
                                                  
In the Record
                                                  Signature Test window that opens, check
                                                  that your records are correctly delineated by
                                                  scrolling through them with the
                                                  Back and
                                                  Next buttons and performing
                                                  a visual check, and then click
                                                  Finish. 
                                                 
                                  
                                         
                                     
                         
                             
                   
                     | 
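
The record delimiter options described above (Separator, Start/End with, and the sample file test) can be approximated outside the component to check a local sample before you configure it. The following Python sketch is only an illustration and does not reflect how tHMapFile is implemented. The function names split_records and signature_report are hypothetical, and the delimiters are passed as ordinary Python strings rather than typed into the Component Configuration window.

```python
import re


def split_records(text, separator=None, start_with=None):
    """Split a sample into records, mimicking the two delimiter options.

    separator:  a literal separator such as "\n" or "\r\n".
    start_with: a regular expression matching the start of each record.
    Both parameter names are illustrative, not Talend settings.
    """
    if separator is not None:
        # Pieces between separators are the records; empty pieces
        # (for example after a trailing separator) are dropped.
        return [piece for piece in text.split(separator) if piece]
    if start_with is not None:
        # Each match marks the first character of a new record.
        starts = [m.start() for m in re.finditer(start_with, text)]
        ends = starts[1:] + [len(text)]
        return [text[a:b] for a, b in zip(starts, ends)]
    raise ValueError("provide either separator or start_with")


def signature_report(records):
    """Return the figures a signature test lets you compare with expectations."""
    lengths = [len(r) for r in records]
    return {
        "count": len(records),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
    }


if __name__ == "__main__":
    sample = "a,1\r\nbb,2\r\nccc,3\r\n"
    print(signature_report(split_records(sample, separator="\r\n")))
```

If the reported count or lengths differ from what you know about the data, the delimiter is probably wrong for that input.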
            
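
The roles of Minimum Record Size, Maximum Record Size, and Footer Size can be illustrated with a small scan over binary data. The following Python sketch is a simplification under stated assumptions: the signature is reduced to a fixed byte sequence at the start of each record, whereas the component lets you select fields with an operation and a value. The function name split_by_signature and its parameters are hypothetical. The returned check counter shows why a minimum size that is too low causes more checks than necessary.

```python
def split_by_signature(data, signature, min_size, max_size, footer_size=0):
    """Split binary data into records that each begin with `signature`.

    Returns (records, checks), where `checks` counts how many positions
    were tested while looking for the start of the next record.
    """
    if min_size < 1 or max_size < min_size:
        raise ValueError("expected 1 <= min_size <= max_size")

    # The footer is set aside so it is not included in the last record.
    body = data[:len(data) - footer_size]

    records = []
    checks = 0
    start = 0
    while start < len(body):
        # A record is never longer than max_size.
        limit = min(start + max_size, len(body))
        end = limit
        # The next record cannot begin before start + min_size,
        # so positions before that are never tested.
        pos = start + min_size
        while pos < limit:
            checks += 1
            if body.startswith(signature, pos):
                end = pos
                break
            pos += 1
        records.append(body[start:end])
        start = end
    return records, checks


if __name__ == "__main__":
    data = b"R1aaaa" + b"R2bb" + b"R3cccccc" + b"FOOT"
    records, checks = split_by_signature(data, b"R", 4, 8, footer_size=4)
    print(records, checks)
```

Calling the same function with a smaller min_size returns the same records on this sample but a higher check count, which is the performance effect described for Minimum Record Size.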
               | 
                         
                            Die on error
                         
                     | 
               
                         This check box is selected by default. 
                        Clear the check box to skip any rows that raise an error and complete the
            process for the error-free rows. 
                        If you clear the check box, you can use either of the following options: 
                        - 
                                
Connect the tHMapFile component to an output
                                    component, for example tAvroOutput, using a connection. In the output component, ensure
                                    that you add a fixed metadata with the following columns: 
                                 - inputRecord: contains the rejected
                                            input record during the transformation.
 
                                 - recordId: refers to the record
                                            identifier. For a text or binary input, the recordId
                                            specifies the start offset of the record in the
                                            input file. For an AVRO input, the recordId
                                            specifies the timestamp when the input was
                                            processed.
 
                                 - errorMessage: contains the
                                            transformation status with details of the cause of
                                            the transformation error.
 
                               
                            
                             
                        - 
                                
If the check box is unselected, you can retrieve the rejected
                                    records in a file. Either of the following mechanisms triggers this
                                    feature: (1) a context variable
                                    (talend_transform_reject_file_path)
                                    or (2) a system variable set in the Advanced job parameters
                                    (spark.hadoop.talend.transform.reject.file.path). 
                                When you set the file path on the Hadoop Distributed File
                                    System (HDFS), no further configuration is needed. When
                                    you set the file path on Amazon S3 or any other Hadoop-compatible
                                    file system, add the associated Spark advanced
                                    configuration parameter. 
                                In case of errors at runtime, tHMapFile
                                    checks if one of the mechanisms exists and, if so, appends
                                    the rejected record to the designated file. The reject file
                                    content includes the concatenation of the rejected records
                                    without any additional metadata. 
                                If the file system you use does not support appending to a
                                    file, a separate file is created for each rejection. The
                                    file uses the provided file path as the prefix and adds a
                                    suffix that is the offset of the input file and the size of
                                    the rejected record. 
                             
                      
                   
                        Note: Any errors that occur while trying to store the reject are logged and the
                            processing continues. 
                     |
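
The reject file behavior described above (append when possible, otherwise one file per rejection, with storage errors logged rather than raised) can be sketched as follows. This Python example runs against the local file system and is not the component's implementation. The function name store_reject and the supports_append flag are hypothetical, and the suffix format _<offset>_<size> is an assumption, since the documentation states only that the suffix combines the offset in the input file and the size of the rejected record.

```python
import logging
from pathlib import Path

log = logging.getLogger("reject_sketch")


def store_reject(record, offset, reject_path, supports_append=True):
    """Store one rejected record and return the file written, or None on error.

    record:          raw bytes of the rejected input record
    offset:          start offset of the record in the input file
    reject_path:     the configured reject file path
    supports_append: whether the target file system can append (assumed flag)
    """
    try:
        if supports_append:
            # Records are concatenated without any additional metadata.
            target = Path(reject_path)
            with open(target, "ab") as handle:
                handle.write(record)
        else:
            # One file per rejection: the configured path is the prefix.
            # The exact suffix layout below is an assumption.
            target = Path(f"{reject_path}_{offset}_{len(record)}")
            target.write_bytes(record)
        return target
    except OSError as err:
        # Errors while storing the reject are logged and processing continues.
        log.warning("could not store reject at offset %d: %s", offset, err)
        return None
```

Because the appended file contains only the concatenated records, keep the offsets elsewhere (for example through the recordId column of a reject output) if you need to trace a record back to its position in the input.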