Lucene search engine troubleshooting
The Talend Data Catalog search capabilities are implemented by a Lucene search engine with indexes located in <TDC_HOME>\TalendDataCatalog\data\search\.
If the Lucene search index directory has been lost, the Talend Data Catalog server will automatically recreate it.
If the Lucene search index has been corrupted for any reason (such as a power outage during indexing, an out-of-memory condition, or concurrent writes to the index), you can delete the search index directory and the server will automatically recreate it.
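The delete-and-recreate recovery described above can be scripted. The following Python sketch moves the index directory to a backup location instead of deleting it outright, so the old index can still be inspected afterwards. It is a minimal illustration, not a Talend-provided tool: the function name set_aside_search_index and the timestamped backup name are hypothetical, and it assumes the Talend Data Catalog server is stopped while the directory is moved, which this page does not state.

```python
import shutil
import time
from pathlib import Path


def set_aside_search_index(index_dir, backup_root):
    """Move index_dir into backup_root under a timestamped name.

    Returns the new location. Once the original directory is gone, the
    server is expected to recreate the index, as described above.
    Hypothetical helper; assumes the server is stopped (an assumption).
    """
    index_dir = Path(index_dir)
    backup_root = Path(backup_root)
    if not index_dir.is_dir():
        raise FileNotFoundError(f"search index directory not found: {index_dir}")

    backup_root.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{index_dir.name}.{stamp}"
    if target.exists():
        raise FileExistsError(f"backup target already exists: {target}")

    # shutil.move renames when possible and falls back to copy-and-delete
    # when the backup location is on a different drive.
    shutil.move(str(index_dir), str(target))
    return target


# Example call (placeholder paths, replace with your own):
# set_aside_search_index(r"<TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx",
#                        r"c:\temp\search_backup")
```

Moving rather than deleting keeps a copy for later diagnosis; delete the backup once the recreated index has been verified.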
Although not officially supported, the Administrator can attempt to use the Lucene CheckIndex tool to "exorcise" corrupted segments from the index. Exorcising removes every corrupted segment, so the documents stored in those segments are lost. Follow these steps:
- Back up your Lucene index directory, <TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx.
Note: Replace lucene_xxxxxxxx with the actual directory name of your search index.
- Change to a temporary working directory, such as c:\temp.
- Run the following commands:
mkdir CheckIndex
cd CheckIndex
<TDC_HOME>\TalendDataCatalog\jre\bin\jar -xvf <TDC_HOME>\TalendDataCatalog\tomcat\webapps\MM.war
cd WEB-INF
java -classpath "lib/*" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex <TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx
- Check the output of the above command for any corrupted segments.
- If there is a corrupted segment, run the same command again with the extra option "-exorcise":
java -classpath "lib/*" -ea:org.apache.lucene... org.apache.lucene.index.CheckIndex <TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx -exorcise
- Delete the CheckIndex directory once finished.
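The CheckIndex steps above can also be driven from a small script. The Python sketch below only assembles the same java command line shown in the steps and runs it from the extracted WEB-INF directory. The helper names build_check_index_command and run_check_index are hypothetical, and treating a nonzero exit status as "problems found" is an assumption about CheckIndex behaviour, so always read the printed report before deciding to exorcise.

```python
import subprocess


def build_check_index_command(index_dir, exorcise=False, java="java"):
    """Return the CheckIndex command from the steps above as an argument list."""
    cmd = [
        java,
        "-classpath", "lib/*",        # jars extracted from MM.war into WEB-INF/lib
        "-ea:org.apache.lucene...",   # enable assertions in org.apache.lucene and subpackages
        "org.apache.lucene.index.CheckIndex",
        str(index_dir),
    ]
    if exorcise:
        # Removes corrupted segments; documents in those segments are lost.
        cmd.append("-exorcise")
    return cmd


def run_check_index(index_dir, web_inf_dir, exorcise=False):
    """Run CheckIndex from web_inf_dir and print its report.

    Returns True when the process exits with status 0. Interpreting
    status 0 as "no problems detected" is an assumption; confirm it
    against the printed report.
    """
    cmd = build_check_index_command(index_dir, exorcise=exorcise)
    result = subprocess.run(cmd, cwd=web_inf_dir, capture_output=True, text=True)
    print(result.stdout)
    if result.stderr:
        print(result.stderr)
    return result.returncode == 0


# Example call (placeholder paths, replace with your own):
# run_check_index(r"<TDC_HOME>\TalendDataCatalog\data\search\lucene_xxxxxxxx",
#                 r"c:\temp\CheckIndex\WEB-INF")
```

The command is passed as an argument list rather than a single string, so the "lib/*" classpath wildcard reaches the java launcher unexpanded, as it does with the quoted form in the steps.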
For more information, see https://lucene.apache.org/core/7_7_2/core/org/apache/lucene/index/CheckIndex.html.