Skip to main content

Alternative data types

Data Shaping Query Language allows you to handle multiple data types for the same element, using an Avro schema.

For more information about Avro schemas, see the Oracle documentation.
Information noteNote: This feature is only available in Talend Data Mapper.
In the following sample JSON input, the element order_amount can be either:
  • An integer
  • A decimal
  • A string
  • A record containing two decimals, items_total and tax_amount
{
    "orders": [
        {
            "order_id": "abc-12345",
            "order_amount": 250
        },
        {
            "order_id": "def-67890",
            "order_amount": "299.99"
        },
        {
            "order_id": "ghi-54321",
            "order_amount": 435.95
        },
        {
            "order_id": "jkl-09876",
            "order_amount": {
                "items_total": 395.99,
                "tax_amount": 11.89
            }
        }
    ]
}
This input is described by the following Avro schema:
{
    "type": "record",
    "name": "root",
    "fields": [
        {
            "name": "orders",
            "type": {
                "type": "array",
                "items": {
                    "type": "record",
                    "name": "order",
                    "fields": [
                        {
                            "name": "order_id",
                            "type": "string"
                        },
                        {
                            "name": "order_amount",
                            "type": [
                                "int",
                                {
                                    "type": "bytes",
                                    "logicalType": "decimal"
                                },
                                "string",
                                {
                                    "type": "record",
                                    "name": "amount",
                                    "fields": [
                                        {
                                            "name": "items_total",
                                            "type": {
                                                "type": "bytes",
                                                "logicalType": "decimal"
                                            }
                                        },
                                        {
                                            "name": "tax_amount",
                                            "type": {
                                                "type": "bytes",
                                                "logicalType": "decimal"
                                            }
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                }
            }
        }
    ]
}

Data Shaping Query Language allows you reference each possible type by appending the type's index to the element's identifier using the following syntax: identifier.@index . In this example, int is the first value in the list of possible types for the order_amount element, so it can be referenced with order_amount.@0.

This syntax can also be used in a hierarchical identifier. For example, you can reference the tax_amount element in the record using order_amount.@3.tax_amount.

You can identify the type used for each iteration or an element using the function resolveChoice. For more information about this function, see Special functions.

In this example, you can use the following query to get the data type used for the order_amount element in each item in the orders array:
FROM orders
SELECT {
    order_id,
    amount = resolveChoice(order_amount)
}
This query returns the following result:
[
    {
        "order_id": "abc-12345",
        "amount_type": 0
    },
    {
        "order_id": "def-67890",
        "amount_type": 2
    },
    {
        "order_id": "ghi-54321",
        "amount_type": 1
    },
    {
        "order_id": "jkl-09876",
        "amount_type": 3
    }
]
You can then write a query to harmonize your data and return a single data type for the order_amount element. For example, you can use the query below to:
  • Assign a type to each alternative index using a switch expression and the resolveChoice function.
  • Convert the string and integer values to decimals using the toDecimal function.
  • Calculate the total amount for each record by adding tax_amount to items_total.
FROM orders
LET $amount_type = 
    SWITCH (resolveChoice(order_amount)) { 
		CASE 0: 'INT' 
		CASE 1: 'DECIMAL' 
		CASE 2: 'STRING' 
		CASE 3: 'RECORD' 
	}
SELECT {
    order_id,
    amount = 
        IF ($amount_type = "STRING" OR $amount_type = "INT") 
            toDecimal(order_amount)
        ELSEIF ($amount_type = "RECORD")
            order_amount.@3.items_total + order_amount.@3.tax_amount
        ELSE order_amount
}
With this query, each iteration of the order_amount is converted to a decimal value, regardless of the input type. It returns the following result:
[
    {
        "order_id": "abc-12345",
        "amount": 250
    },
    {
        "order_id": "def-67890",
        "amount": 299.99
    },
    {
        "order_id": "ghi-54321",
        "amount": 435.95
    },
    {
        "order_id": "jkl-09876",
        "amount": 407.88
    }
]

Did this page help you?

If you find any issues with this page or its content – a typo, a missing step, or a technical error – please let us know!