
Harvesting several models from a directory of external metadata files

Availability note: AWS

Organizations often keep a large number of external metadata files without using an external metadata repository. You can import these files into Talend Data Catalog automatically, in batch.

In this case, the files are stored under a file directory that is accessible to the Talend Data Catalog application server. The script scans the directory and its sub-directories for files of the specified external metadata type and looks for matching models under a given folder in the application. The folder and model structure in the Talend Data Catalog repository mirrors the structure of the files and their directories on the file system. When a matching model does not exist, the script creates the model and imports the file. When the model already exists, the script re-imports the file only if that version of the file has not been harvested yet.
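The decision logic described above can be sketched as follows. This is only an illustration and not the script that Talend Data Catalog runs: the `Catalog` class, the `harvest` function, and the idea of identifying a file version by its modification time and size are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Catalog:
    """Hypothetical in-memory stand-in for the repository folder."""
    # Maps a model path (relative, without extension) to the set of
    # file versions that have already been harvested for it.
    models: dict = field(default_factory=dict)


def file_version(path: Path) -> str:
    """Assumed notion of 'version': modification time plus file size."""
    stat = path.stat()
    return f"{stat.st_mtime_ns}-{stat.st_size}"


def harvest(root: Path, extension: str, catalog: Catalog) -> list:
    """Scan root and its sub-directories and return the actions taken."""
    actions = []
    for path in sorted(root.rglob(f"*{extension}")):
        if not path.is_file():
            continue
        # Mirror the directory structure: sub-directories become folders,
        # and the file name without its extension becomes the model name.
        model = path.relative_to(root).with_suffix("").as_posix()
        version = file_version(path)
        if model not in catalog.models:
            # No matching model yet: create it and import the file.
            catalog.models[model] = {version}
            actions.append(("create", model))
        elif version not in catalog.models[model]:
            # Model exists, but this file version is new: re-import.
            catalog.models[model].add(version)
            actions.append(("reimport", model))
        # Otherwise this version was already harvested, so skip the file.
    return actions
```

Running `harvest` twice on an unchanged directory returns an empty list the second time, which corresponds to the behavior of a scheduled run that finds nothing new to import.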

You can schedule Talend Data Catalog to run the script periodically so that files you place under the directory are imported automatically. This approach works for any bridge based on a single model file.

You must also define a special model named Settings to control how the files are imported (which source tool and which bridge parameters to use).

Prerequisite: you have been assigned an object role with the Metadata Management capability.

Creating the configuration as needed

You may create, harvest and analyze metadata (model contents) from the MANAGE > Repository tree. However, it is generally best to first create a configuration to work in, which also defines the scope of analysis and search, and then go to MANAGE > Configuration to create and harvest models there.

Creating a folder

  1. Go to MANAGE > Repository.
  2. In the repository tree, right-click the folder in which you want to place the folder that will contain the import results, then click New > Folder.
  3. Name the folder as needed.

Creating the Settings file to control the import of models

  1. Right-click the new folder in the repository tree and click New > Model.
  2. Enter Settings in the Name field.

  3. Select the import server in the Import Server drop-down list.

  4. Select the correct source format in the Import Bridge drop-down list.
  5. Click OK.

  6. Click the Import Setup tab.

  7. Click the row of each parameter and fill in the information according to the documentation displayed in the Help panel. In particular, for the File parameter, click the browse icon to navigate to a file inside the directory structure on the file system.

  8. Update the File parameter so that the path only refers to the top level of the directory structure on the file system (for example, remove the file name and any sub-directory names).

  9. Click the Import Options tab.

  10. Select the Set new versions as default check box if you want to automatically set any new imported version as the default version.
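As an illustration of step 8, the sketch below trims a browsed file path back to the top-level directory. The function name, the example path, and the directory name `metadata` are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def top_level_directory(selected_file: str, top_directory_name: str) -> str:
    """Return the nearest ancestor of selected_file with the given name."""
    for parent in PurePosixPath(selected_file).parents:
        if parent.name == top_directory_name:
            return str(parent)
    raise ValueError(
        f"{top_directory_name!r} is not an ancestor of {selected_file!r}"
    )


# Hypothetical path chosen with the browse icon in step 7.
print(top_level_directory("/data/metadata/finance/sales.xml", "metadata"))
# prints /data/metadata
```

The resulting value is what the File parameter should contain: the file name and the `finance` sub-directory have been removed.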

Harvesting the models on demand

  1. Right-click the new folder in the repository panel and select Operations > Import new model(s) from folder.

  2. Click Run Operation.

  3. To monitor the operation, click the spinning gear icon on the header or go to MANAGE > Operations.
  4. To open the log, right-click the operation, then click Show log.
    • If you receive the Operation succeeded result, you can open the model.
    • If you receive the Operation failed result, inspect the log messages and correct the source model file accordingly.
  5. Make sure to add the new models to the configuration.

    You may now browse the models.
