Skip to main content Skip to complementary content

Using Oracle as a vector database

The following Job aims at connecting to an Oracle Database 23ai, creating a table for similarity search, inserting some data about famous statements into it, and performing similarity search by vector type.

Before you begin

This scenario uses Oracle AI vector search capabilities. For more information, read the corresponding Oracle documentation.

Linking the components

Procedure

  1. Drag and drop the following components from the Palette: tPrejob, tOracleConnection, tFileInputDelimited, 2 tLogRow, tMap, tOracleOutput, tOracleInput, tPostjob, and tOracleClose.
  2. Link the tPrejob to the tOracleConnection and the tPostjob to the tOracleClose using an OnComponentOk connection as no data is handled here.
  3. Connect the other components using a Row > Main connection.
  4. Connect the main Job to the sub-Job using the trigger link OnSubjobOk.
    Overview of the Job in the Studio.

Configuring the components

Procedure

  1. Double-click the tOracleConnection component to display its Component view.
  2. Configure the connection to your Oracle 23ai database.
    tOracleConnection configuration.
  3. Double-click the tFileInputDelimited component to display its Component view.
  4. Set the path and properties of the sample file you want to insert in the Oracle database.
    tFileInputDelimited configuration
    In this scenario, a TXT sample of famous quotes, with authors, years, and countries, is used.
    File showing some famous quotes, authors, years, and countries.
  5. Double-click the first tLogRow component to display its Component view.
    Click Sync columns to retrieve the schema structure from the previous component if needed.
    In the Mode area, select Table, and click Print content with log4j to display the Job result in the console.
  6. Double-click the tMap component to open the Map Editor.
  7. Drag the following columns from the input table to the out1 table: id, statement, author, year, and country.
    Add a column statement_clone, to the end of the out1 table, and drag the statement column from the input table to this one. This new column will be used as the reference column in the Oracle vector table.
    tMap configuration.
  8. Click OK to validate the changes and close the editor.
  9. Double-click the tOracleOutput component to display its Component view.
  10. In the Basic settings view, enter the name of the Oracle table that will be used as a vector table (in this use case it is called vectable).
    Select Drop table if exists and create in the Action on table list, and select Insert in the Action on data list.
    For more information on vector tables, read the corresponding Oracle documentation.
    Basic settings view of the tOracleOutput configuration.
  11. In the Advanced settings view, add a line in the Additional columns table to enter the SQL expression allowing you to insert the sample data into the selected vector column.
    Set the Data type to VECTOR, select statement_clone as the Reference column, select Replace in the Position list, and enter the following SQL expression:
    "TO_VECTOR(VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING ? as data))"
    Advanced settings view of the tOracleOutput configuration.
  12. Double-click the tOracleInput component to display its Component view.
  13. In the Basic settings view, enter the following query in the Query field to fetch the sample data from the vector table previously created:
    "
    SELECT statement 
    FROM vectable 
    ORDER BY VECTOR_DISTANCE(vec, TO_VECTOR(VECTOR_EMBEDDING(ALL_MINILM_L12_V2 USING '" + context.query + "' as data)), EUCLIDEAN_SQUARED)
    FETCH FIRST 5 ROWS ONLY
    "
  14. In the Advanced settings view, add a column named result in the Trim column table. This column will be displayed in the console when you execute the Job.
  15. Double-click the second tLogRow component to display its Component view.
    Click Sync columns to retrieve the schema structure from the previous component if needed.
    In the Mode area, select Table, and click Print content with log4j to display the Job result in the console.

Executing the Job

Procedure

  1. Press Ctrl + S to save your Job.
  2. Press F6 to execute it.

Results

The Run console displays the result of the SQL query that fetches the sample data from the vector table. Only the first 5 rows of the query result are displayed.

Once the vector table is created, you can do some similarity search on the vector table. For example, with the query "Matter of work", you will get the following result:
.-----------------------------------------------------.
|                      tLogRow_2                      |
|=---------------------------------------------------=|
|result                                               |
|=---------------------------------------------------=|
|The only way to do great work is to love what you do.|
|The only way to do great work is to love what you do.|
|Knowledge is power.                                  |
|To do nothing is the way to be nothing.              |
|With great power comes great responsibility.         |
'-----------------------------------------------------'

For more information on how to perform vector search with Oracle, read the corresponding Oracle documentation.

Did this page help you?

If you find any issues with this page or its content – a typo, a missing step, or a technical error – please let us know!