Harvesting metadata
You harvest metadata by using Talend Data Catalog bridges.
Talend Data Catalog | Standard | Advanced | Advanced Plus |
---|---|---|---|
Harvesting from any supported data store technologies | |||
Harvesting from any supported Data Model tools | |||
Data Integration with DI, ETL and ELT tools | |||
Harvesting from Talend Data Integration, Talend MDM and Talend Data Preparation | |||
Harvesting from any supported Data Integration tools | |||
Data Integration with SQL Scripts and other codes | |||
Harvesting from HiveQL Scripting | |||
Harvesting from any supported SQL Scripting | |||
Business Intelligence (BI Reporting) | |||
Harvesting from Tableau or Qlik | |||
Harvesting from any supported Business Intelligence tools | |||
Harvesting from any supported Metadata Management tools (such as Apache Atlas or Cloudera Navigator) | |||
Business Applications | |||
Harvesting from Salesforce | |||
Harvesting from any supported Business Application tools (such as SAP Business Warehouse 4 HANA) | |||
For more information about the bridges, see Talend Data Catalog Bridges on Talend Help Center.
Before harvesting metadata
Before harvesting metadata, analyze where the metadata resides, which technologies are required to extract it, and which process to follow to ensure a proper extraction.
Ensure that you have the proper connectivity to the external format metadata source.
Ensure that you have full access to any auxiliary resources. Which resources are needed depends on the external format you are connecting to.
- Identify source data stores, such as operational data stores.
- Identify data transformation processes, such as ETL or ELT.
- Identify business intelligence systems.
- Identify existing conceptual models.
- Configure a bridge and harvest metadata for each system.
You should also organize your metadata repository with labeled folders, for example one folder for each category of metadata.
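The preparation steps above can be captured as a simple inventory before any bridge is configured. The following is a minimal planning sketch in Python, not part of Talend Data Catalog: the `SourceSystem` class, the `plan_folders` function, and all system names and categories are hypothetical and exist only to illustrate grouping systems into one labeled folder per category.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSystem:
    name: str        # hypothetical system name
    category: str    # category of metadata, used as the folder label
    technology: str  # technology that determines which bridge to configure


def plan_folders(systems):
    """Group systems by category so each category maps to one labeled folder."""
    folders = defaultdict(list)
    for system in systems:
        folders[system.category].append(system)
    # Sort categories and systems for a stable, readable plan.
    return {category: sorted(items, key=lambda s: s.name)
            for category, items in sorted(folders.items())}


# Hypothetical inventory covering the kinds of systems listed above.
inventory = [
    SourceSystem("orders_ods", "Data stores", "relational database"),
    SourceSystem("nightly_load", "Data integration", "ETL job"),
    SourceSystem("sales_dashboard", "Business intelligence", "BI report"),
    SourceSystem("customer_model", "Data models", "conceptual model"),
]

plan = plan_folders(inventory)
for category, items in plan.items():
    print(category)
    for system in items:
        print(f"  {system.name}: configure one bridge ({system.technology})")
```

Each entry in the resulting plan corresponds to one bridge to configure and one harvest to run, filed under the folder named after its category.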
Browsing the file system
Many import actions require pointing to files on the application server.
When configuring Talend Data Catalog, you have to specify the precise locations on the file system to include in the browse list.
You can specify the locations using Setup.bat or the command line.
The drives available for browsing are controlled by the conf.properties file.
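To illustrate what restricting browsing to specific locations means in practice, here is a minimal Python sketch of an allow-list check. It is an illustration of the idea only: it does not read conf.properties, it does not reflect how Talend Data Catalog implements browsing, and the `BROWSE_ROOTS` values and the `is_browsable` function are hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical browse roots, hard-coded for the example (not read from conf.properties).
BROWSE_ROOTS = [PurePosixPath("/data/imports"), PurePosixPath("/opt/models")]


def is_browsable(path, roots=BROWSE_ROOTS):
    """Return True if path is one of the roots or lies beneath one of them."""
    candidate = PurePosixPath(path)
    if ".." in candidate.parts:
        return False  # refuse paths that could climb out of a root
    for root in roots:
        try:
            # relative_to raises ValueError when candidate is not under root.
            candidate.relative_to(root)
            return True
        except ValueError:
            continue
    return False
```

Under these assumptions, a file such as `/data/imports/erwin/model.xml` would be selectable for an import, while `/data/other/file.sql` would not appear in the browse list.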
Imported models and custom models
- Imported models are models associated with an import bridge and populated through the model harvesting process. These models are referred to as technical models. They are also considered business models when imported from business applications or business intelligence (BI) tools.
- Custom models are instantiations of a custom model type in the metamodel. They are referred to as business models. They can also be considered technical models, depending on the domain.