The LOAD statement loads fields from a file, from data defined in the script, from a previously loaded table, from a web page, from the result of a subsequent SELECT statement or by generating data automatically. It is also possible to load data from analytic connections.
LOAD [ distinct ] fieldlist
[( from file [ format-spec ] |
from_field fieldassource [format-spec]|
inline data [ format-spec ] |
resident table-label |
autogenerate size ) |extension pluginname.functionname([script] tabledescription)]
[ where criterion | while criterion ]
[order by orderbyfieldlist ]
If no source of data is given by means of a from, inline, resident, from_field, extension or autogenerate clause, data will be loaded from the result of the immediately succeeding SELECT or LOAD statement. The succeeding statement should not have a prefix.
Load only three specific fields from a delimited file:
LOAD FirstName, LastName, Number from [lib://DataFiles/data1.csv];
Rename first field as A and second field as B when loading a file without labels:
LOAD @1 as A, @2 as B from [lib://DataFiles/data3.txt] (ansi, txt, delimiter is '\t', no labels);
Load Name as a concatenation of FirstName, a space character, and LastName:
LOAD FirstName&' '&LastName as Name from [lib://DataFiles/data1.csv];
Load Quantity, Price and Value (the product of Quantity and Price):
LOAD Quantity, Price, Quantity*Price as Value from [lib://DataFiles/data1.csv];
Load only unique records, duplicate records will be discarded:
LOAD distinct FirstName, LastName, Number from [lib://DataFiles/data1.csv];
Load only records where the field Litres has a value above zero:
LOAD * from [lib://DataFiles/Consumption.csv] where Litres>0;
Load a table with inline data, two fields named CatID and Category:
LOAD * Inline
Load a table with inline data, three fields named UserID, Password and Access:
LOAD * Inline [UserID, Password, Access
A, ABC456, User
B, VIP789, Admin];
Load a table with 10 000 rows. Field A will contain the number of the read record (1,2,3,4,5...) and field B will contain a random number between 0 and 1:
LOAD RecNo( ) as A, rand( ) as B autogenerate(10000);
First we load a delimited table file and name it tab1:
SELECT A,B,C,D from [lib://DataFiles/data1.csv];
Load fields from the already loaded tab1 table as tab2:
LOAD A,B,month(C),A*B+D as E resident tab1;
Load fields from already loaded table tab1 but only records where A is larger than B:
LOAD A,A+B+C resident tab1 where A>B;
Load fields from already loaded table tab1 ordered by A:
LOAD A,B*C as E resident tab1 order by A;
Load fields from already loaded table tab1, ordered by the first field, then the second field:
LOAD A,B*C as E resident tab1 order by 1,2;
Load fields from already loaded table tab1 ordered by C descending, then B in ascending order, and then the first field in descending order:
LOAD A,B*C as E resident tab1 order by C desc, B asc, 1 desc;
Load field Types from previously loaded table Characters as A:
LOAD A from_field (Characters, Types);
Load A, B and calculated fields X and Y from Table1 that is loaded in succeeding SELECT statement:
LOAD A, B, if(C>0,'positive','negative') as X, weekday(D) as Y;
SELECT A,B,C,D from Table1;
Load fields grouped (aggregated) by ArtNo:
LOAD ArtNo, round(Sum(TransAmount),0.05) as ArtNoTotal from table.csv group by ArtNo;
Load fields grouped (aggregated) by Week and ArtNo:
LOAD Week, ArtNo, round(Avg(TransAmount),0.05) as WeekArtNoAverages from table.csv group by Week, ArtNo;
In this example we have a input file Grades.csv containing the grades for each student condensed in one field:
The grades, in a 1-5 scale, represent subjects Math, English, Science and History. We can separate the grades into separate values by reading each record several times with a while clause, using the IterNo( ) function as a counter. In each read, the grade is extracted with the Mid function and stored in Grade, and the subject is selected using the pick function and stored in Subject. The final while clause contains the test to check if all grades have been read (four per student in this case), which means next student record should be read.
mid(Grades,IterNo( ),1) as Grade,
pick(IterNo( ), 'Math', 'English', 'Science', 'History') as Subject from [lib://DataFiles/Grades.csv]
The result is a table containing this data:
The following sample data is used.
Loading data using a function
In these examples, we assume that we have an analytic connection plugin named P that contains a custom function Calculate(Parameter1, Parameter2). The function returns the table Results that contains the fields Field1 and Field2.
Load all fields that are returned when sending the fields A and C to the function.
Load only the Field1 field when sending the fields A and C to the function.
Load all fields that are returned when sending the fields A and B to the function. As fields are not specified, A and B are used as they are the first in order in the table.
Load all fields that are returned when sending the field C to both parameters of the function.
Load all fields that are returned when sending the field A forced as a string and B forced as a numeric to the function.
Loading data by evaluating a script
Load the table returned by the script q when sending the values of A and B.
Load the table returned by the script stored in the My_R_Script variable when sending the values of A and B.
Load the table returned by the script stored in the My_R_Script variable when sending the values of B renamed to D, A and C. Using * sends the remaining unreferenced fields.