tExtractXMLField Standard properties

These properties are used to configure tExtractXMLField running in the Standard Job framework.

The Standard tExtractXMLField component belongs to the Processing and the XML families.

The component in this framework is available in all Talend products.

Basic settings

Property type

Either Built-In or Repository.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the Repository Content window.

 

Built-In: No property data stored centrally.

 

Repository: Select the repository file where the properties are stored.

When this file is selected, the fields that follow are pre-filled using the fetched data.

Schema type and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. When you create a Spark Job, avoid the reserved word line when naming the fields.

 

Built-In: You create and store the schema locally for this component only.

 

Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs.

XML field

Name of the XML field to be processed.

Related topic: see Talend Studio User Guide.

Loop XPath query

Node of the XML tree on which the loop is based.

Mapping

Column: reflects the schema as defined by the Schema type field.

XPath Query: Enter the fields to be extracted from the structured input.

Get nodes: Select this check box to retrieve the XML content of all current nodes specified in the XPath query list, or select the check box next to specific XML nodes to retrieve only the content of the selected nodes.
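The relationship between the Loop XPath query and the Mapping table can be sketched outside Talend with the standard `javax.xml.xpath` API. This is only an illustration of the idea, not the code the component generates: the class name `LoopMappingSketch`, the method `extract`, and the sample XML are all hypothetical. The loop query selects the repeating nodes, and each mapping query is then evaluated relative to the current looped node to fill one output column.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Talend: mimics "loop node + relative queries".
class LoopMappingSketch {

    // mapping: output column name -> XPath query relative to each looped node.
    static List<Map<String, String>> extract(String xml, String loopXPath,
                                             Map<String, String> mapping) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Select every node the loop is based on.
            NodeList nodes = (NodeList) xpath.evaluate(
                    loopXPath, new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);

            List<Map<String, String>> rows = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node current = nodes.item(i);
                Map<String, String> row = new LinkedHashMap<>();
                // Evaluate each column query against the current looped node.
                for (Map.Entry<String, String> column : mapping.entrySet()) {
                    row.put(column.getKey(),
                            xpath.evaluate(column.getValue(), current));
                }
                rows.add(row);
            }
            return rows;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML input", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample content of the XML field.
        String xml = "<orders>"
                + "<order id=\"1\"><customer>Ann</customer></order>"
                + "<order id=\"2\"><customer>Bob</customer></order>"
                + "</orders>";
        Map<String, String> mapping = new LinkedHashMap<>();
        mapping.put("id", "@id");
        mapping.put("customer", "customer");
        for (Map<String, String> row : extract(xml, "/orders/order", mapping)) {
            System.out.println(row);
        }
    }
}
```

With the sample input, the loop query `/orders/order` yields two rows, printed as `{id=1, customer=Ann}` and `{id=2, customer=Bob}`.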

Limit

Maximum number of rows to be processed. If Limit is 0, no rows are read or processed.

Die on error

Select the check box to stop the execution of the Job when an error occurs.

Clear the check box to skip any rows on error and complete the process for error-free rows. When errors are skipped, you can collect the rows on error using a Row > Reject link.

Advanced settings

Ignore the namespaces

Select this check box to ignore namespaces when reading and extracting the XML data.
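To see why namespaces affect extraction, consider the following sketch using the standard Java XML APIs. It does not show how the component implements this option; the class name `NamespaceSketch` and the sample namespace URI are assumptions made for illustration. When a document declares a default namespace and is parsed in namespace-aware mode, a plain XPath 1.0 query such as `/orders/order` matches nothing, whereas a query that compares only local names does match.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Talend.
class NamespaceSketch {

    // Counts the nodes matched by an XPath query on a namespace-aware DOM.
    static int count(String xml, String query) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    query, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate query", e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample: every element is in a default namespace.
        String xml = "<orders xmlns=\"http://example.com/orders\">"
                + "<order/><order/></orders>";
        // Unprefixed names only match elements in no namespace.
        System.out.println(count(xml, "/orders/order"));
        // Comparing local names sidesteps the namespace entirely.
        System.out.println(count(xml,
                "/*[local-name()='orders']/*[local-name()='order']"));
    }
}
```

The first query finds 0 nodes and the second finds 2, which is the kind of mismatch that ignoring namespaces is meant to avoid.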

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. When the component has a Die on error check box, this variable functions only if that check box is cleared.

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
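As a rough sketch of how these After variables are typically read in a Java expression, the example below looks them up in a map keyed by component name and variable name. It assumes the component is labeled `tExtractXMLField_1` and stands in for the Job's `globalMap` with a plain `HashMap` so that the code runs on its own; the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; in a real Job, globalMap is provided by the generated code.
class GlobalVariablesSketch {

    // Reads <componentName>_NB_LINE, returning 0 if the variable is not set.
    static int rowsProcessed(Map<String, Object> globalMap, String componentName) {
        Integer nbLine = (Integer) globalMap.get(componentName + "_NB_LINE");
        return nbLine == null ? 0 : nbLine;
    }

    // Reads <componentName>_ERROR_MESSAGE, which may be null if no error occurred.
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_ERROR_MESSAGE");
    }

    public static void main(String[] args) {
        // Stand-in for the Job's globalMap, filled with assumed values.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tExtractXMLField_1_NB_LINE", 42);

        System.out.println(rowsProcessed(globalMap, "tExtractXMLField_1"));
        System.out.println(errorMessage(globalMap, "tExtractXMLField_1"));
    }
}
```

Because both are After variables, such a lookup is only meaningful in a component that runs after tExtractXMLField has finished, for example in a subJob triggered once the extraction completes.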

Usage

Usage rule

This component is an intermediate component. It needs an input component and an output component.
