Metadata Harvesting Model Bridge (MIMB) Setup
Metadata integration (or metadata harvesting) from third-party databases and from data modeling, data integration, or business intelligence tools is performed by the integrated Meta Integration® Model Bridge (MIMB) software. By default, the installer deploys and configures both Talend Data Catalog and MIMB on the same machine, where the Talend Data Catalog Application Server accesses the MIMB Web Services locally. MIMB can also be installed and configured as a remote MIMB Agent on another machine. This is useful in deployments where the metadata management server is:
- deployed remotely on the cloud but needs to access metadata harvesting servers (agents) locally on premise, or
- deployed on Linux but needs to access metadata harvesting servers (agents) on a Windows machine where DM/DI/BI client tools are Windows only (e.g. COM based SDK).
Essential customizations (e.g. directories, memory) of the MIMB Application Server can be performed in the following configuration file:
$MM_HOME/conf/conf.properties
Recommended customizations include:
- M_BROWSE_PATH to browse local and mapped network drives.
  - All metadata harvesting file and directory parameters are resolved relative to the server, because the server must be able to access these resources whenever a later event (e.g., a scheduled harvest) occurs. When harvesting a model, the UI therefore presents a set of paths that may be browsed to select these files and directories. The M_BROWSE_PATH parameter defines which drives and network paths are available in the UI. It can be updated using the UI (on the application server) presented by the $MM_HOME/Setup.sh utility (see also Application Server Execution and Initialization), or by editing the $MM_HOME/conf/conf.properties file directly.
  - On installation, the set includes all directly attached drives, which is specified by an asterisk ("*") as follows: M_BROWSE_PATH=*
  - Note for Windows based application servers: When running as a service, the drive names (mapped) and paths may not be the same as what a user sees when logged in, so the "*" value may not show all the drives you expect when selecting drives in the UI. Instead, explicitly list all the drives and network paths that should be available to all users in the UI. It is not sufficient to enter the mapped drive letter (e.g., "N:\"), because that drive mapping is generally not available to services either. Specify physical drives by letter and network paths in full, such as: M_BROWSE_PATH=C:\, E:\, \\network-drive\shared\
  - Note that the above also applies to the drives used for script backup and restore.
- M_DATA_DIRECTORY to relocate data such as the log files and the incremental metadata harvesting cache, as needed for very large DI or BI tools.
- M_JAVA_OPTIONS to increase the maximum memory used by Java bridges during the metadata harvesting of very large DB, DI or BI tools. Note that this parameter defines the default maximum for all Java bridges. However, the most memory intensive Java bridges (e.g., JDBC bridges) can define their own maximum memory in their last parameter, called Miscellaneous.
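The M_BROWSE_PATH format described above can be checked with a small script before restarting the server. The following Python sketch is only an illustration and is not part of the product: the function names parse_properties and classify_browse_entries are hypothetical, and it assumes that values in conf.properties are read literally as simple KEY=VALUE lines separated by commas, as in the examples above. It cannot tell a mapped drive letter from a physical one, so drive letters still need to be reviewed by hand on Windows services.

```python
import re


def parse_properties(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments.

    Assumption: values are taken literally (no escape processing).
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def classify_browse_entries(value):
    """Label each comma-separated M_BROWSE_PATH entry by its kind."""
    result = []
    for entry in (e.strip() for e in value.split(",")):
        if not entry:
            continue
        if entry == "*":
            kind = "wildcard"        # all directly attached drives
        elif entry.startswith("\\\\"):
            kind = "network path"    # full UNC path, e.g. \\host\share\
        elif re.fullmatch(r"[A-Za-z]:\\?", entry):
            kind = "drive letter"    # may be physical or mapped
        else:
            kind = "other"
        result.append((entry, kind))
    return result


# Hypothetical excerpt of conf.properties, using the value shown above.
sample = r"""
# browse settings
M_BROWSE_PATH=C:\, E:\, \\network-drive\shared\
"""

if __name__ == "__main__":
    props = parse_properties(sample)
    for entry, kind in classify_browse_entries(props["M_BROWSE_PATH"]):
        print(f"{entry:30} {kind}")
```

On a Windows server running as a service, a "wildcard" result is a hint that the value should be replaced by an explicit list of physical drives and complete network paths.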
When the MIMB Application Server is used as a local metadata harvesting agent connected to a Talend Data Catalog Application Server on the cloud, additional customizations are needed in the $MM_HOME/conf/agent.properties configuration file, where:
- M_SERVER_URL is the URL of the Application Server on the cloud such as M_SERVER_URL=http://server:11480/MM.
- M_AGENT_NAME is the agent’s name, such as M_AGENT_NAME=MyCompanyOnPremise, that the above Application Server will then use to refer to this metadata harvesting server agent.
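The two agent settings above can be sanity-checked before the agent is started. The Python sketch below is a minimal illustration under stated assumptions: the function check_agent_settings is hypothetical, and the rules it applies (an http or https URL with a host, and a non-empty agent name) are reasonable guesses rather than documented product requirements.

```python
from urllib.parse import urlparse


def check_agent_settings(props):
    """Return a list of problems found in the agent settings (empty if none).

    Assumed rules (not taken from the product documentation):
    M_SERVER_URL is an http(s) URL with a host, M_AGENT_NAME is non-empty.
    """
    problems = []
    url = urlparse(props.get("M_SERVER_URL", ""))
    if url.scheme not in ("http", "https") or not url.hostname:
        problems.append("M_SERVER_URL must be an http(s) URL with a host")
    if not props.get("M_AGENT_NAME", "").strip():
        problems.append("M_AGENT_NAME must not be empty")
    return problems


# Example values in the style of agent.properties; the name is illustrative.
settings = {
    "M_SERVER_URL": "http://server:11480/MM",
    "M_AGENT_NAME": "MyCompanyOnPremise",
}

if __name__ == "__main__":
    for problem in check_agent_settings(settings):
        print("Problem:", problem)
```

An empty result only means the values are well formed. It does not confirm that the Application Server is reachable from the agent machine.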