Talend Data Integration and Data Quality functional architecture
The functional architecture of Talend products is an architectural model that identifies Talend Studio functions, their interactions, and the corresponding IT needs. The overall architecture is described by isolating specific functionalities into functional blocks.
The following chart illustrates the main architectural functional blocks explored within Talend Studio.
Several functional blocks are defined:
- The Clients block includes one or more Talend Studio instances and Web browsers, which can be on the same machine or on different machines.
From Talend Studio, you carry out:
- data integration processes from the Integration perspective
- data quality analyses from the Profiling perspective
From the Web browser, you connect to Talend Administration Center through a secured HTTP protocol.
- The Servers block includes a web-based Talend Administration Center connected to:
- two shared repositories: one based on a Git server and one based on an Artifact repository,
- databases: one for administration metadata, one for audit information, and one for activity monitoring,
- Talend execution server(s).
Talend Administration Center enables the management and administration of all projects. Administration metadata (for example, user accounts, access rights, and project authorization) is stored in the Administration database. Project metadata (for example, Jobs and Routines) is stored on the Git server. For more information, see Managing projects.
- The Repositories block includes the Git server and the artifact repository. The Git server centralizes all project metadata, such as Jobs, shared between different end-users. This metadata is accessible from Talend Studio to develop the Jobs and from Talend Administration Center to publish, deploy, and monitor them.
The artifact repository is used to store:
- Software updates available for download,
- Jobs that are published from Talend Studio and are ready to be deployed and executed.
- The Talend Execution Servers block includes one or more execution servers deployed inside your information system. Talend Jobs are deployed to the Job servers through the Job Conductor of Talend Administration Center, to be executed at a scheduled time or date, or when triggered by an event.
For more information about execution servers, see Configuring execution servers.
- The Databases block includes the following databases:
- The Administration database is used to manage user accounts, access rights, project authorization, and so on.
- The Audit database is used to evaluate different aspects of the Jobs implemented in Talend Studio projects, with the aim of providing solid quantitative and qualitative factors for process-oriented decision support.
- The Monitoring databases include the Talend Activity Monitoring Console database and the Service Activity Monitoring database.
- The Talend Activity Monitoring Console allows you to monitor the execution of technical processes. It provides detailed monitoring capabilities that can be used to consolidate collected log information, understand the interactions between the underlying data flows, prevent unexpected faults, and support system management decisions.
- Service Activity Monitoring allows you to monitor service calls. It provides monitoring and consolidated event information that end-users can use to understand the underlying requests and replies that compose an event, monitor faults, and support system management decisions.
- The Datamart stores all data generated by different data quality reports in Talend Studio.
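As a quick way to summarize which block holds which kind of content, the storage locations described above can be written as a small lookup table. The following Python sketch is purely illustrative: the names Store, STORAGE_MAP, store_for and contents_of are hypothetical and are not part of any Talend API. The mapping only restates what this section says.

```python
from enum import Enum


class Store(Enum):
    """Storage locations named in this section (labels are illustrative)."""
    ADMIN_DB = "Administration database"
    GIT = "Git server"
    ARTIFACT_REPO = "Artifact repository"
    DATAMART = "Datamart"


# Hypothetical lookup table: content kind -> where this section says it is stored.
STORAGE_MAP = {
    "user accounts": Store.ADMIN_DB,
    "access rights": Store.ADMIN_DB,
    "project authorization": Store.ADMIN_DB,
    "Jobs": Store.GIT,
    "Routines": Store.GIT,
    "software updates": Store.ARTIFACT_REPO,
    "published Jobs": Store.ARTIFACT_REPO,
    "data quality report data": Store.DATAMART,
}


def store_for(kind: str) -> Store:
    """Return the store that holds the given kind of content."""
    return STORAGE_MAP[kind]


def contents_of(store: Store) -> list[str]:
    """Return, in alphabetical order, the kinds of content held by a store."""
    return sorted(kind for kind, s in STORAGE_MAP.items() if s is store)


if __name__ == "__main__":
    print(store_for("Routines").value)
    print(contents_of(Store.ARTIFACT_REPO))
```

The table makes the split explicit: Jobs under development are project metadata on the Git server, while Jobs published from Talend Studio are stored in the artifact repository, ready for deployment.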