| 
                         
                            Storage
                         
                     | 
               
                         To connect to an HDFS installation, select the Define a storage configuration component
                            check box and then select the name of the component to use from those
                            available in the drop-down list. 
                        This option requires you to have previously configured the
                            connection to the HDFS installation to be used, as described in the
                            documentation for the tHDFSConfiguration component. 
                        If you leave the Define a
                                storage configuration component check box unselected,
                            you can only convert files locally. 
                     | 
            
               | 
                         
                            Configure Component
                         
                     | 
               
                         Before you configure this component, you must have already
                            added a downstream component, linked it to the tHMapInput component, and retrieved the
                            schema from the downstream component. 
                        To configure the component, click the [...] button and, in the Component Configuration window, perform
                            the following actions. 
                        - 
                                    
Click the Select button next to the Record structure field and
                                        then, in the Select a
                                            Structure dialog box that opens, select the
                                        structure you want to use and then click OK. 
                                    This structure must have been previously
                                        created in Talend Data Mapper. 
                                 
                        - 
                                    
Select the Input
                                            Representation to use from the drop-down
                                        list. 
                                    Supported input formats are Avro, COBOL, EDI,
                                        Flat, IDocs, JSON and XML. 
                                 
                        - 
                                    
Click Next. 
                                 
                        - 
                                    
Tell the component where each new record
                                        begins. To do so, you need to
                                        fully understand the structure of your data. 
                                    Exactly how you do this depends on
                                        the input representation being used, and you are
                                        presented with one of the following options. 
                                    
                              - 
                                            
Select an appropriate record
                                                delimiter for your data. Note that you must specify
                                                this value without quotes.  
                                            
                                    - 
                                                  
                                                  Separator
                                                  lets you specify a separator indicator, such as \n, to identify a new
                                                  line. 
                                                  Supported indicators are \n for a Unix-type new line,
                                                  \r\n for Windows, \r for Mac, and \t for tab characters. 
                                                 
                                    - 
                                                  
                                                  Start/End
                                                  with lets you specify the initial
                                                  characters that indicate a new record, such as <root, or the characters
                                                  that indicate where a record ends. 
                                                  
                                                  Start with
                                                  also supports new-line indicators: \n
                                                  for a Unix-type new line, \r\n for Windows, and \r for Mac, as well as \t for
                                                  tab characters. 
                                                  Select the Regular
                                                  Expression check box if you wish to
                                                  enter a regular expression to match the start of a
                                                  record. When you select XML or JSON, this check
                                                  box is selected by default and a pre-configured
                                                  regular expression is provided. 
                                                 
                                    - 
                                                  Sample File: To test the
                                                  signature with a sample file, click the
                                                  [...] button, browse to the
                                                  file you want to use as a sample, click
                                                  Open, and then click
                                                  Run to test your
                                                  sample.
Testing the signature lets you check
                                                  that the total number of records and their minimum
                                                  and maximum lengths correspond to what you expect
                                                  based on your knowledge of the data. This step
                                                  assumes you have a local subset of your data to
                                                  use as a sample. 
                                                 
                                    - 
                                                  
Click Finish. 
                                                 
                                  
                                         
                              - 
                                            
If your input representation is COBOL
                                                or Flat with positional and/or binary encoding
                                                properties, define the signature for the input
                                                record structure: 
                                       - 
                                                  Input Record root
                                                  corresponds to the root element in your input
                                                  record.
 
                                       - 
                                                  
                                                  Minimum Record
                                                  Size corresponds to the size in bytes
                                                  of the smallest record. If you set this value too
                                                  low, you may encounter performance issues, since
                                                  the component will perform more checks than
                                                  necessary when looking for a new record. 
                                                   
                                       - 
                                                  
                                                  Maximum Record
                                                  Size corresponds to the size in bytes
                                                  of the largest record, and is used to determine
                                                  how much memory is allocated to read the
                                                  input. 
                                                   
                                       - 
                                                  Sample from Workspace or
                                                  Sample from File System: To
                                                  test the signature with a sample file, click the
                                                  [...]
                                                  button, and then browse to the file you want to
                                                  use as a sample.
Testing the signature lets you
                                                  check that the total number of records and their
                                                  minimum and maximum lengths correspond to what you
                                                  expect based on your knowledge of the data. This
                                                  step assumes you have a local subset of your data
                                                  to use as a sample. 
                                                   
                                       - 
                                                  
                                                  Footer Size
                                                  corresponds to the size in bytes of the footer, if
                                                  any. At runtime, the footer will be ignored rather
                                                  than being mistakenly included in the last record.
                                                  Leave this field empty if there is no footer. 
                                                   
                                       - 
                                                  
Click the Next button to open
                                                  the Signature
                                                  Parameters window, select the fields
                                                  that define the signature of your record input
                                                  structure (that is, to identify where a new record
                                                  begins), update the Operation and Value columns as
                                                  appropriate, and then click Next. 
                                                   
                                       - 
                                                  
In the Record
                                                  Signature Test window that opens, check
                                                  that your records are correctly delineated by
                                                  scrolling through them with the
                                                  Back and
                                                  Next buttons and performing
                                                  a visual check, and then click
                                                  Finish. 
                                                   
                                     
                                             
                                         
                            
                                 
                        - 
                                    
Map the elements from the input structure to
                                        the output structure in the new map that opens, and then
                                        press Ctrl+S to save
                                        your map. 
                                    For more information on creating maps, see the
                                            Talend Data Mapper User Guide. 
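
The two delimiter options described above (Separator and Start with) and the signature test can be illustrated with a minimal Python sketch. This is not how the component is implemented. The function names `split_records` and `signature_stats` are hypothetical, and the sketch assumes the whole sample fits in memory as text. It only shows why the record count and the minimum and maximum lengths are useful figures to compare against what you know about your data.

```python
import re
from typing import Dict, List, Optional


def split_records(data: str,
                  separator: Optional[str] = None,
                  start_with: Optional[str] = None) -> List[str]:
    """Split sample text into records using exactly one strategy.

    separator  -- literal record separator, for example "\n" or "\r\n"
    start_with -- regular expression matching the start of each record
    """
    if (separator is None) == (start_with is None):
        raise ValueError("Specify exactly one of separator or start_with")
    if separator is not None:
        # Separator mode: a record ends wherever the separator occurs.
        parts = data.split(separator)
    else:
        # Start-with mode: a record begins at every match of the expression.
        # Any text before the first match is ignored in this sketch.
        starts = [m.start() for m in re.finditer(start_with, data)]
        ends = starts[1:] + [len(data)]
        parts = [data[s:e] for s, e in zip(starts, ends)]
    # Drop empty fragments, such as the one after a trailing separator.
    return [p for p in parts if p]


def signature_stats(records: List[str]) -> Dict[str, int]:
    """Return the record count and the minimum and maximum record lengths."""
    lengths = [len(r) for r in records]
    return {
        "count": len(records),
        "min_length": min(lengths, default=0),
        "max_length": max(lengths, default=0),
    }
```

For example, calling `signature_stats(split_records(sample, separator="\n"))` on a local sample gives figures you can compare with your expectations. A count that is far too low usually means the delimiter does not match your data.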
                                 
                      
                         
                     | 
            
               | 
                         
                            Die on error
                         
                     | 
               
                   This check box is selected by default. 
                  Clear the check box to skip any rows in error and complete the
            process for error-free rows. 
If you clear the
                        check box, you can use either of the following options:
                     - 
                                
Connect the tHMapInput component to an output
                                    component, for example tAvroOutput, using a  connection. In the output component, ensure that
                                    you add a fixed metadata with the following columns: 
                              - inputRecord: contains the rejected
                                            input record during the transformation.
 
                              - recordId: refers to the record
                                            identifier. For a text or binary input, the recordId
                                            specifies the start offset of the record in the input
                                            file. For an Avro input, the recordId specifies the
                                            timestamp when the input was processed.
 
                              - errorMessage: contains the
                                            transformation status with details of the cause of the
                                            transformation error.
 
                            
                         
                             
                     - 
                                
You can retrieve the rejected records in a file.
                                    Either of the following mechanisms triggers this feature: (1) a context
                                    variable (talend_transform_reject_file_path) or (2) a
                                    system variable set in the Advanced job parameters (spark.hadoop.talend.transform.reject.file.path). 
                                When you set the file path on the Hadoop
                                    Distributed File System (HDFS), no further configurations are
                                    needed. When you set the file path on Amazon S3 or any other
                                    Hadoop-compatible file system, add the associated Spark
                                    advanced configuration parameter. 
                                In case of errors at runtime, tHMapFile checks if one of the
                                    mechanisms exists and, if so, appends the rejected record to the
                                    designated file. The reject file content includes the
                                    concatenation of the rejected records without any additional
                                    metadata. 
                                If the file system you use does not support
                                    appending to a file, a separate file is created for each
                                    rejection. The file uses the provided file path as the prefix
                                    and adds a suffix that is the offset of the input file and the
                                    size of the rejected record. 
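
The effect of clearing Die on error, together with the three reject columns listed above, can be sketched in Python as follows. This is an illustration only and does not reflect the component's internals. `RejectRow` and `transform_all` are hypothetical names, the field types are assumptions, and the `transform` callable stands in for the map you created.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class RejectRow:
    inputRecord: bytes   # the input record rejected during the transformation
    recordId: int        # for text or binary input: start offset of the record
    errorMessage: str    # details of the cause of the transformation error


def transform_all(records: Iterable[Tuple[int, bytes]],
                  transform: Callable[[bytes], str],
                  die_on_error: bool = True) -> Tuple[List[str], List[RejectRow]]:
    """Apply transform to each (offset, record) pair.

    With die_on_error=True the first failure stops processing.
    With die_on_error=False failing rows are skipped and collected as rejects.
    """
    outputs: List[str] = []
    rejects: List[RejectRow] = []
    for offset, raw in records:
        try:
            outputs.append(transform(raw))
        except Exception as exc:
            if die_on_error:
                raise
            rejects.append(RejectRow(raw, offset, str(exc)))
    return outputs, rejects
```

With `die_on_error=False`, the error-free rows are still returned in `outputs`, while each failing row becomes one `RejectRow` that could be written to the output component.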
                             
                   
                  Note: Any errors while trying to store the reject are logged and the
                            processing continues. 
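
As a rough illustration of the reject-file behavior described above, the following Python sketch appends rejected records to one file, falls back to one file per rejection when appending is not available, and logs storage errors instead of stopping. It uses the local file system rather than HDFS or Amazon S3. The function name `store_reject`, the `supports_append` flag, and the exact suffix layout (`_<offset>_<size>`) are assumptions made for this example; the documentation states only that the suffix is built from the offset and the size of the rejected record.

```python
import logging

log = logging.getLogger("reject_sketch")


def store_reject(path: str, record: bytes, offset: int,
                 supports_append: bool = True) -> bool:
    """Store one rejected record and return True on success.

    Failures are logged and reported through the return value, so the
    caller can continue processing the remaining records.
    """
    try:
        if supports_append:
            # Records are concatenated without any additional metadata.
            with open(path, "ab") as fh:
                fh.write(record)
        else:
            # Hypothetical suffix layout: offset in the input, then record size.
            with open(f"{path}_{offset}_{len(record)}", "wb") as fh:
                fh.write(record)
        return True
    except OSError as exc:
        log.warning("Could not store reject at offset %d: %s", offset, exc)
        return False
```

Returning a flag rather than raising keeps the sketch in line with the note above: a failure to store a reject does not interrupt the run.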
                |