Additional information about conversions, types, and formats
The Type converter processor allows you to apply multiple conversion operations to an incoming record.
You can convert either primitive data types or semantic data types.
Source and destination types
- Primitive types: null, boolean, int, long, float, double, bytes, string. They also include:
  - Complex types: record, enum, array, map, union, fixed
  - Logical types: date:int, time-millis:int, time-micros:long, timestamp-millis:long, timestamp-micros:long, duration:fixed(12), decimal:fixed|bytes
- Semantic types: predefined semantic types suggested by Talend Cloud when retrieving the fields of a dataset. For more information, read Managing semantic types.
Errors and warnings
- Parsing exceptions caused by bad DateFormat or DecimalFormat patterns.
- Exceptions caused by any source value that fails a parse or valueOf conversion.
- Not enough source bytes to create a destination value.
Date-oriented formats (for primitive types only)
When either the source or destination value is a date/time-oriented value AND the other is a string, the format is used in the conversion, as described in the DateTimeFormatter documentation. If no format is present, the default ISO 8601 format provided with Java is used.
DateTime includes both calendar day and time information.
Format | String |
---|---|
EEE, MMM d, ''yy 'at' h:mm a | Tue, Nov 28, '17 at 12:44 PM |
yyyyy.MMMM.dd GGG hh:mm a | 02017.November.28 AD 12:44 PM |
- No field smaller than a day should appear in a Date format. There is no "hour" in the Date type: yyyy-MM-dd
- No field larger than an hour should appear in a Time format. There is no "day" in the Time format: HH:mm:ss.SSS
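The patterns above can be tried out with a minimal Java sketch based on `java.time.format.DateTimeFormatter`. The class name `DateFormatSketch`, its helper methods, and the fixed English locale are assumptions made for illustration; this is not the processor's own implementation.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.UnsupportedTemporalTypeException;
import java.util.Locale;

public class DateFormatSketch {

    // Formats a date-time with the given pattern. The English locale is an
    // assumption so that day and month names match the table.
    public static String format(LocalDateTime value, String pattern) {
        return value.format(DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH));
    }

    // Returns true when the pattern asks a date-only value for a field it
    // does not have (for example an hour).
    public static boolean rejectedForDate(String pattern) {
        try {
            LocalDate.of(2017, 11, 28)
                    .format(DateTimeFormatter.ofPattern(pattern, Locale.ENGLISH));
            return false;
        } catch (UnsupportedTemporalTypeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        LocalDateTime sample = LocalDateTime.of(2017, 11, 28, 12, 44);
        System.out.println(format(sample, "EEE, MMM d, ''yy 'at' h:mm a"));
        System.out.println(format(sample, "yyyyy.MMMM.dd GGG hh:mm a"));
        System.out.println(rejectedForDate("yyyy-MM-dd"));     // date-only pattern
        System.out.println(rejectedForDate("yyyy-MM-dd HH"));  // pattern with an hour field
    }
}
```

In this sketch, a date-only value rejects a pattern that contains an hour field, which mirrors the rule that a Date format should contain no field smaller than a day.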
Number formats (for primitive types only)
Format | String |
---|---|
'#'# | #1, #12345, #-123 |
$#,##0.00;($#,##0.00) | $1,234.56, $0.50, ($1.00), ($1,234.56) |
Some logical rules apply to the conversions:
- For example, Integer and Long formats that include a decimal point will cause an error.
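To see how the number patterns in the table behave, here is a small Java sketch using `java.text.DecimalFormat`. The class name `NumberFormatSketch`, the helper `format`, and the choice of US symbols (comma for grouping, period for decimals) are assumptions for illustration only.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class NumberFormatSketch {

    // US symbols are an assumption so that "," groups digits and "." is the
    // decimal separator, as in the table.
    private static final DecimalFormatSymbols SYMBOLS =
            DecimalFormatSymbols.getInstance(Locale.US);

    // Formats a number with the given DecimalFormat pattern.
    public static String format(String pattern, double value) {
        return new DecimalFormat(pattern, SYMBOLS).format(value);
    }

    public static void main(String[] args) {
        // A quoted '#' is a literal character, the unquoted # is a digit placeholder.
        System.out.println(format("'#'#", 12345));

        // The part after ';' is the pattern used for negative values.
        String money = "$#,##0.00;($#,##0.00)";
        System.out.println(format(money, 1234.56));
        System.out.println(format(money, 0.5));
        System.out.println(format(money, -1));
        System.out.println(format(money, -1234.56));
    }
}
```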
Examples
- Widening primitive conversions where no information is lost.
- Primitive conversions (widening or narrowing) where information might be lost.
- The DateFormat pattern, if present, is used for String conversions with date/time types.
  - This applies if the source is a logical type date, time-millis, timestamp-millis (time-micros and timestamp-micros are treated as long), or the destination is Date, Time, or DateTime.
  - If no pattern is present, Date/Time/DateTime types use specific ISO-8601 patterns.
    - Date: yyyy-MM-dd
    - Time: HH:mm:ss
    - DateTime: yyyy-MM-dd'T'HH:mm:ss'Z'
- The DecimalFormat pattern, if present, is used for String conversions with numeric types. If no pattern is present, the conversion falls back to Integer.valueOf() or Integer.toString() (with the appropriate destination value).
- When converting between supported date-oriented types and numbers, the format isn't used.
  - Date: the incoming/outgoing number is the number of days since 1970-01-01 (int)
  - Time: the incoming/outgoing number is the number of milliseconds since 00:00:00 (int)
  - DateTime: the incoming/outgoing number is the number of milliseconds since 1970-01-01 00:00:00 (long)
- When the source and destination are supported date-oriented types and numbers, the date and time components are kept consistent between the two. Anything unknown is set relative to 1970-01-01 00:00:00. For example, converting a Time (with no date component) to Date will always return 1970-01-01.
For more information, see the Oracle documentation.
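The relationship between date-oriented types and their underlying numbers can be sketched in Java with `java.time`. The class name `EpochSketch`, the method names, and the use of UTC for DateTime values are assumptions for illustration, not a description of the processor's internals.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.ZoneOffset;

public class EpochSketch {

    // Date: number of days since 1970-01-01.
    public static LocalDate dateFromDays(int days) {
        return LocalDate.ofEpochDay(days);
    }

    // Time: number of milliseconds since 00:00:00.
    public static LocalTime timeFromMillis(int millis) {
        return LocalTime.ofNanoOfDay(millis * 1_000_000L);
    }

    // DateTime: number of milliseconds since 1970-01-01 00:00:00 (UTC assumed).
    public static LocalDateTime dateTimeFromMillis(long millis) {
        return LocalDateTime.ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC);
    }

    // Time to DateTime: the unknown date part is set to 1970-01-01.
    public static LocalDateTime timeToDateTime(LocalTime time) {
        return LocalDate.ofEpochDay(0).atTime(time);
    }

    // DateTime to Date: the time component is dropped.
    public static LocalDate dateTimeToDate(LocalDateTime dateTime) {
        return dateTime.toLocalDate();
    }

    public static void main(String[] args) {
        System.out.println(dateFromDays(17498));
        System.out.println(timeFromMillis(1));
        System.out.println(dateTimeFromMillis(1511873062123L));
        System.out.println(timeToDateTime(LocalTime.of(12, 44, 22)));
        System.out.println(dateTimeToDate(dateTimeFromMillis(1511873062123L)).toEpochDay());
    }
}
```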
Source type (Avro) | Source value | Format | Destination type | Destination value |
---|---|---|---|---|
int | 12345 | - | Long | 12345L (widening conversion does not lose anything) |
long | 12345L | - | Integer | 12345 (narrowing conversions can be OK, usually on data with few significant digits) |
long | 1234567890123456789L | - | Integer | 2112454933 (narrowing conversions can lose data, but in a well-defined way. In this case, the last four bytes of the long were reinterpreted as an int) |
long | 1234567890123456789L | - | Double | 1234567890123456770.0d (some widening conversions can lose precision in a well-defined way) |
long | 0x8000000000000000L (MIN_VALUE) | - | Integer | 0 (narrowing conversion uses the last four bytes) |
string | "1234.5" | - | Integer | Error -- Cannot parse floating point without a format. |
string | "1234.5" | # | Integer | 1234 (the format discards after the decimal point) |
string | "1234.5" | #.# | Integer | 1234 (even a format with a decimal point helps convert the input string into a number) |
boolean | false | - | Integer | 0 |
boolean | true | - | Integer | 1 |
boolean | false | - | Date | 1970-01-01 (zero days since 1970-01-01) |
boolean | true | - | Date | 1970-01-02 (one day since 1970-01-01) |
boolean | false | - | Time | 00:00:00.000 (zero milliseconds since midnight) |
boolean | true | - | Time | 00:00:00.001 (one millisecond since midnight. Note that if your view does not show milliseconds, this will look exactly like false even though the underlying data is different) |
timestamp-millis | 2017-11-28T12:44:22Z | yyyyMMdd | String | 20171128 (Note: The conversion timestamp-millis > String does not work on Test datasets.) |
string | 20171128 | yyyyMMdd | timestamp-millis | 2017-11-28T00:00:00Z (hours, minutes and seconds are 0) |
string | "20171128" | yyyyMMdd | Date | 2017-11-28 |
int | 20171128 | - | Date | +57196-09-03 (20,171,128 days after 1970-01-01) |
time-millis | 12:44:22 | - | DateTime | 1970-01-01T12:44:22Z (since there is no date part in the source time, 1970-01-01 is used) |
timestamp-millis | 2017-11-28T12:44:22Z | - | Date | 2017-11-28 (the time component is removed, the underlying number is changed from 1511873062123L to 17498) |
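Several of the numeric rows in the table can be reproduced with plain Java casts and parsers. The following is a minimal sketch under the assumption that the conversions behave like standard Java primitive casts, `Integer.valueOf()`, and `DecimalFormat` parsing; the class name `ConversionSketch` and its helpers are hypothetical.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Locale;

public class ConversionSketch {

    // Narrowing: keeps only the last four bytes of the long.
    public static int narrow(long value) {
        return (int) value;
    }

    // Widening to double: may round to the nearest representable value.
    public static double widen(long value) {
        return (double) value;
    }

    // Without a format, the string must be a plain integer literal.
    public static boolean parsesWithoutFormat(String text) {
        try {
            Integer.valueOf(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // With a format, the string is parsed as a number and then truncated to int.
    // US symbols are an assumption so that "." is the decimal separator.
    public static int parseWithFormat(String text, String pattern) {
        DecimalFormat format =
                new DecimalFormat(pattern, DecimalFormatSymbols.getInstance(Locale.US));
        try {
            return format.parse(text).intValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(narrow(12345L));
        System.out.println(narrow(1234567890123456789L));
        System.out.println(narrow(Long.MIN_VALUE));
        System.out.println(widen(1234567890123456789L));
        System.out.println(parsesWithoutFormat("1234.5"));
        System.out.println(parseWithFormat("1234.5", "#"));
        System.out.println(parseWithFormat("1234.5", "#.#"));
    }
}
```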