Cleaning up the repository
Clean up the repository to prepare for the Talend Data Catalog server upgrade.
The repository can contain obsolete or unused content. Because this content is live and indexed, it affects database performance and takes up space.
Ensure that the database has at least 20% free space. The upgrade process may take several hours on large repositories and also needs extra space for temporary data during the migration.
Work with your repository database administrator to ensure that the database is cleaned.
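The 20% free space guideline can be checked with simple arithmetic once your database administrator has provided the total and used storage figures. The following Python sketch is only an illustration: the function names and the byte-based inputs are assumptions, and it does not query the database or Talend Data Catalog.

```python
def free_space_ratio(total_bytes: int, used_bytes: int) -> float:
    """Return the fraction of the database storage that is still free."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0 <= used_bytes <= total_bytes:
        raise ValueError("used_bytes must be between 0 and total_bytes")
    return (total_bytes - used_bytes) / total_bytes


def has_enough_free_space(total_bytes: int, used_bytes: int,
                          minimum: float = 0.20) -> bool:
    """Check the free space against the 20% guideline (default minimum)."""
    return free_space_ratio(total_bytes, used_bytes) >= minimum


if __name__ == "__main__":
    # Hypothetical figures: 1000 GB allocated, 700 GB used.
    gb = 1024 ** 3
    print(has_enough_free_space(1000 * gb, 700 * gb))
```

The figures in the usage example are placeholders. Use the values reported by your own database tooling.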
Here are the actions you can perform to free up some space.
Deleting unused test or sandbox type content
- Browse through the repository to identify the unused content.
- Delete it from the repository manager.
Deleting unused versions of configurations
You may keep copies of configurations for backup or historical analysis purposes, or your configuration management process may create a new version each time the entire metadata is harvested.
These copies take up database and disk space, increase the index size, and can slow down the database and search performance.
- Go to .
- Run the Get repository configuration statistics operation from the Operations drop-down list.
If the number of configuration versions is large compared to the total number of configurations, perform the following steps.
- Browse through the repository to identify the older versions.
- Delete them from the repository manager.
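The comparison between configuration versions and configurations can be expressed as a ratio. The following Python sketch assumes that you read the two counts from the statistics operation output and enter them manually. The function names and the threshold of 5 versions per configuration are hypothetical, because this procedure does not define what counts as a large ratio.

```python
def versions_per_configuration(version_count: int,
                               configuration_count: int) -> float:
    """Average number of versions kept for each configuration."""
    if configuration_count <= 0:
        raise ValueError("configuration_count must be positive")
    if version_count < 0:
        raise ValueError("version_count must not be negative")
    return version_count / configuration_count


def needs_version_cleanup(version_count: int, configuration_count: int,
                          max_ratio: float = 5.0) -> bool:
    """Flag the repository when the ratio exceeds max_ratio.

    max_ratio is an assumed threshold for illustration only.
    """
    return versions_per_configuration(version_count,
                                      configuration_count) > max_ratio


if __name__ == "__main__":
    # Hypothetical counts: 120 versions across 10 configurations.
    print(needs_version_cleanup(120, 10))
```

Choose a threshold that matches your own retention policy before acting on the result.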
Deleting unused versions of models
- Browse through the repository to identify the older versions.
- Go to .
- Configure and run the Delete unused versions operation.
This operation deletes a version of a model if that version is not used in any version of a configuration, was imported more than one hour ago, and is older than the specified number of days.
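The deletion criteria described above can be read as a single condition on each model version. The following Python sketch shows one interpretation of that rule. It is not the product's implementation, and the function and parameter names are assumptions.

```python
from datetime import datetime, timedelta


def is_deletable(imported_at: datetime, used_in_configuration: bool,
                 older_than_days: int, now: datetime) -> bool:
    """Return True if a model version matches the deletion criteria.

    Assumed reading of the rule: the version is not used by any
    configuration version, was imported more than one hour ago, and is
    older than the specified number of days.
    """
    if used_in_configuration:
        return False
    age = now - imported_at
    return age > timedelta(hours=1) and age > timedelta(days=older_than_days)


if __name__ == "__main__":
    now = datetime(2024, 1, 31, 12, 0)
    # Hypothetical version imported 30 days earlier and not used anywhere.
    print(is_deletable(datetime(2024, 1, 1, 12, 0), False, 7, now))
```

Under this reading, the one-hour condition only matters when the number of days is set to 0, where it protects versions that have just been imported.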
Verifying that the incremental harvesting option is enabled in the model setup
The incremental harvesting option saves processing time during the import and consumes less space. Only the part of the model that has changed is re-imported and written as a new version to the repository database. The rest of the content is reused in the new version. This option applies to large databases, file systems, and Business Intelligence servers.
This option can be disabled manually by adding the -cache.clear option in the Miscellaneous parameter.
- Open the import setup of each model.
- If you see the -cache.clear option in the Miscellaneous parameter, remove it.
- Save your changes.
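If you review many import setups, it can help to check the Miscellaneous parameter values as text before editing them in the user interface. The following Python sketch assumes that you have copied each value into a string and that options are separated by whitespace. The function names are hypothetical, and the sketch does not modify Talend Data Catalog.

```python
CACHE_CLEAR = "-cache.clear"


def has_cache_clear(miscellaneous: str) -> bool:
    """Return True if the -cache.clear option appears as its own token."""
    return CACHE_CLEAR in miscellaneous.split()


def remove_cache_clear(miscellaneous: str) -> str:
    """Return the parameter value without the -cache.clear option.

    Tokens are split on whitespace and rejoined with single spaces,
    so the other options keep their order.
    """
    tokens = miscellaneous.split()
    return " ".join(token for token in tokens if token != CACHE_CLEAR)


if __name__ == "__main__":
    # Hypothetical parameter value copied from an import setup.
    value = "-cache.clear -other.option"
    print(has_cache_clear(value))
    print(remove_cache_clear(value))
```

The second option in the usage example is a placeholder, not a documented option.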
Deleting the operation logs
Operation logs are not indexed and should not affect performance, but they can take up significant space. It applies to large databases, file systems and Business Intelligence servers.
- Go to .
- Configure and run the Delete operation logs operation.
This operation deletes completed operations and their logs that are older than a specified number of days. You can delete the logs of failed operations only, or the logs of both successful and failed operations.
Disabling the Debug logging option in Manage System
Debug logs are not indexed and should not affect performance, but they can take up significant space. It applies to large databases, file systems and Business Intelligence servers.
- Go to .
- In the Debug logging field, select Disable from the drop-down list.
Running the database maintenance operation
- Go to .
- Configure and run the Run database maintenance operation.
This operation maintains the database indexes and statistics.
If you deleted a large number of contents and versions at once, run the operation several times.
You are now ready to update Talend Data Catalog with the latest patches.