NextGen XML: Overview
NextGen XML identifies and defines primary and foreign keys through parsing the hierarchical data of source XMLs without the need for accompanying XSDs. NextGen XML provides this parsing support by leveraging the Qlik Sense XML API connector (specifically the XML parser). The connector takes the root element of an XML response as the parent data table. Note that the XML response must contain at least one child element under the root element.
Features and behavior of NextGen XML processing:
- The entire XML is scanned to find all entities and fields
- Any namespaces will be ignored
- All XMLs must have the same root element; multiple XML fragments will be read together if one root element is shared
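
The scanning and namespace behavior listed above can be illustrated with a minimal sketch. It uses Python's standard `xml.etree.ElementTree` rather than the Qlik connector itself; the helper names `local_name` and `collect_names` and the sample document are hypothetical and only approximate the described behavior.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Drop the '{namespace-uri}' prefix that ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]

def collect_names(xml_text):
    """Scan the whole document and map each element name to the attribute names seen on it."""
    root = ET.fromstring(xml_text)
    seen = {}
    for elem in root.iter():  # iter() visits the root and every descendant
        attrs = seen.setdefault(local_name(elem.tag), set())
        attrs.update(local_name(name) for name in elem.attrib)
    return seen

# Hypothetical sample: every element carries a namespace prefix.
SAMPLE = """<ns:orders xmlns:ns="http://example.com/ns">
  <ns:order id="1"><ns:note>first</ns:note></ns:order>
  <ns:order id="2" status="open"/>
</ns:orders>"""

names = collect_names(SAMPLE)
print(names)
```

Because the whole tree is walked, the `status` attribute is discovered even though it appears only on the second `order` element.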
The following conditions will trigger generation of an entity:
- A simple element directly beneath the root element will become a table. A simple XML element has no attributes and is either empty, self-closed, or contains only text content. For example: <element/>, <element></element>, <element>some content</element>.
- All attributes and simple elements are fields of a table (parent element) with the following exceptions:
- If an element contains any attribute or nested element, it becomes a table. Duplicate elements within a child element also create a new table: if an element (<order>) has several nested simple elements with the same name (<comment>), the field comment is converted to a table named comment with the field @Content. For example, consider an XML document in which an element (<book>) has one <title> sub-element and two duplicate (<author>) sub-elements.
This XML results in two entities. The first, book, has two columns: title and z__KEY_book (the primary key). The second, author, has two columns: @Content, loaded with the values Linus van Pelt and Sally Brown, and z__FK_author, populated with a foreign key value that is not the same value as the root primary key z__KEY_book. If this is not desired, edit the XML so that the sub-elements have unique numbered names (<author1></author1>, <author2></author2>).
- If a non-simple element contains nested content, a field for its text value (@Content) is added to the table (Note). See the nested table rendered below.
- Post process: If a table (Orders) contains both a field and a nested table with the same name (note), the field is removed and its value is added to the nested table as the field z_Content.
Nested table fields: subject, z_Content, z__FK_Note
| subject | z_Content | z__FK_Note |
|---|---|---|
| pack | Some message2 | lufhnLInPWX7ruk4wbTnkiYoUwQ= |
| | some message | TP0kssKDa3liKp2MQ1zapXIJk+Q= |
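
The field-versus-table rules above, applied to the book/author example, can be sketched as follows. This is an illustrative approximation and not the connector's implementation: `is_simple` and `split_children` are hypothetical helpers, the title text is a placeholder, and key columns are left out because the text does not describe how key values are generated.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def is_simple(elem):
    """A simple element has no attributes and no nested elements (empty or text only)."""
    return not elem.attrib and len(elem) == 0

def split_children(parent):
    """Return (fields, tables) for the direct children of `parent`."""
    counts = Counter(child.tag for child in parent)
    fields, tables = {}, {}
    for child in parent:
        if is_simple(child) and counts[child.tag] == 1:
            # A unique simple element stays a field of the parent table.
            fields[child.tag] = child.text or ""
        elif is_simple(child):
            # Repeated simple elements become a table with an @Content column.
            tables.setdefault(child.tag, []).append({"@Content": child.text or ""})
        else:
            # Attributes or nested elements make the child its own table.
            # Only attributes are captured here; nested content is not recursed.
            tables.setdefault(child.tag, []).append(dict(child.attrib))
    return fields, tables

# Assumed shape of the book example; "Sample Title" is a placeholder value.
BOOK = """<book>
  <title>Sample Title</title>
  <author>Linus van Pelt</author>
  <author>Sally Brown</author>
</book>"""

fields, tables = split_children(ET.fromstring(BOOK))
print(fields)
print(tables)
```

Here title remains a field of book, while the two author elements are split out into a separate author table with one @Content row each.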
Repeating Elements
When distinct elements share the same name, the resulting entity names are made unique by appending an underscore and a number: the first duplicate is appended with "_2", the second duplicate with "_3", and so on. Users can also navigate to the entity property xml.entity.xpath to view where the element appears in the XML hierarchy.
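
The suffixing scheme described above can be sketched in a few lines. The function name `unique_entity_names` is hypothetical, and the sketch assumes that no source element is already named with a clashing suffix such as item_2.

```python
def unique_entity_names(names):
    """Keep the first occurrence of a name; suffix later ones with _2, _3, ..."""
    seen = {}
    result = []
    for name in names:
        seen[name] = seen.get(name, 0) + 1
        result.append(name if seen[name] == 1 else f"{name}_{seen[name]}")
    return result

renamed = unique_entity_names(["item", "note", "item", "item"])
print(renamed)
```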
Primary and foreign keys
The XML parser identifies the root element of an XML response and defines a primary key for it (z__KEY_<RootElementName>). That same key becomes a field in the child entity tables as a foreign key (z__FK_<ChildElementName>). More complicated XML parent/child relations will render as primary/foreign keys based on the hierarchy found. The relationships between parent and child entities are captured and retained by the foreign and primary keys, which are rendered as columns in the resulting entity tables. NextGen XML identifies and includes these keys as fields in the entities. Identifying these keys in the tables is valuable for downstream processes such as transforms and joins, analyses and select statements, shopping for data, and publishing related datasets.
An example of a primary key looks like "__KEY_breakfast_menu", and a foreign key (relative to that primary key) looks like "__FK_food" in a different table.
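
The key naming pattern can be sketched as below. The helper `key_column_names` is hypothetical, the food entries are invented sample data, and the prefix is a parameter because the text shows key names both with and without a leading "z". The sketch also assumes every direct child of the root is a child entity, which holds only when those children are not simple elements.

```python
import xml.etree.ElementTree as ET

def key_column_names(xml_text, prefix="z__"):
    """Derive the key column names implied by the root and its direct children."""
    root = ET.fromstring(xml_text)
    primary = f"{prefix}KEY_{root.tag}"
    # One foreign-key column name per distinct child element name, in document order.
    foreign = list(dict.fromkeys(f"{prefix}FK_{child.tag}" for child in root))
    return primary, foreign

# Hypothetical sample data for the breakfast_menu example.
MENU = """<breakfast_menu>
  <food><name>Waffles</name></food>
  <food><name>Toast</name></food>
</breakfast_menu>"""

primary, foreign = key_column_names(MENU)
print(primary, foreign)
```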