
Talend Data Fabric architecture

Each of the operating principles can be isolated in different functional blocks. The following diagram describes the different types of blocks and their interoperability:

Architecture diagram of Talend Data Fabric.

Building and administrating

The CLIENTS block includes one or more Talend Studio APIs and Web browsers that can be on the same or on different machines.

From the Talend Studio API, end-users can carry out technical processes (data integration or data service processes, mediation Routes, and Services) and publish them to the Artifact Repository. They can also run data profiling analyses and reports, regardless of data volume and process complexity.

The Talend Studio allows users to work on any project for which they have authorization. For more information, see Creating a project.

With MDM, the Talend Studio is also used by administrators to set up and operate a centralized master data repository. They can build data models that employ the necessary business and data rules to create a single "master" copy of the master data.

From a Web browser, end-users connect to the remotely based Talend Administration Center through a secured HTTP protocol. The end-user category in this description may include developers, project managers, administrators, and any other person involved in building data flows, Web, REST, and data services, and mediation Routes.

Depending on company policy, each of these end-users uses Talend Studio, Talend Administration Center, or both.

Additionally, from the Web browser you can access the Talend Data Preparation Web application. This is where you import your data, from local files or other sources, and cleanse or enrich it by creating preparations on this data. You can also access the Talend Data Stewardship Web application, where campaign owners and data stewards manage campaigns and tasks. Optionally, you can access the Talend Dictionary Service server to add, remove, or edit the semantic types used on data in the Web applications.

The TALEND SERVERS and DATABASES blocks and the Git gray circle include a web-based Talend Administration Center (application server) connected to two shared repositories: one based on a Git server and one based on a database server (Admin).

Talend Administration Center enables the management and administration of all projects. Administration metadata (for example, user accounts, access rights, and project authorizations) is stored on the database server. Project metadata (for example, Jobs, Routines, Routes, and Services) is stored on the Git server so that it can easily be shared among end-users.
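The split between the two shared repositories can be pictured with a small sketch. This is only an illustration of the rule stated above, not Talend code: the function name `storage_for` and the lowercase kind labels are hypothetical.

```python
# Hypothetical illustration only: these names are not part of any Talend API.
# Administration metadata lives on the database server (Admin).
ADMIN_KINDS = {"user_account", "access_right", "project_authorization"}
# Project metadata lives on the Git server so end-users can share it.
PROJECT_KINDS = {"job", "routine", "route", "service"}


def storage_for(kind: str) -> str:
    """Return which shared repository holds a given kind of metadata."""
    key = kind.lower()
    if key in ADMIN_KINDS:
        return "database"
    if key in PROJECT_KINDS:
        return "git"
    raise ValueError(f"unknown metadata kind: {kind!r}")


print(storage_for("Job"))           # project metadata
print(storage_for("user_account"))  # administration metadata
```

The sketch deliberately raises an error for unlisted kinds, because the page gives the two lists only as examples and does not say where other items are stored.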

Talend Administration Center also lets you configure the tasks that handle Job executions and triggers. It also handles Job generation and deployment to the execution servers. For more information, see Getting started with Talend Administration Center.

Talend Administration Center also includes the servers used by the Talend Web applications, namely Talend Data Preparation and Talend Data Stewardship, and also Talend Dictionary Service. The Talend Identity and Access Management server is used to enable Single Sign-On between those applications.

Finally, Talend Administration Center enables you to access and manage the Routes or Services created in Talend Studio and published to the Artifact Repository, and to set up and monitor their deployment and execution in Talend Runtime. For more information, see Executing Services, Routes, and data service Jobs, and applying Profiles from ESB Conductor.

There is also one Talend MDM Server where the live version of the master data is stored. The MDM Repository contains a working copy of the data, and can be stored locally (that is, on the same machine as the Talend Studio) or remotely, based on a Git server. The data from the MDM Repository must be deployed to the Talend MDM Server before it can be accessed by users of the Talend MDM Web UI.
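The relationship between the working copy and the live version can be modeled in a few lines. This is a minimal sketch under stated assumptions: the class `MdmServer` and its methods `deploy` and `get` are hypothetical names chosen for illustration and do not reflect the Talend MDM API.

```python
# Hypothetical model; none of these names come from the Talend MDM API.
class MdmServer:
    """Holds the live version of the master data."""

    def __init__(self):
        self._live = {}

    def deploy(self, repository: dict) -> None:
        # Copy the working objects so that later edits in the repository
        # do not reach the server until the next deployment.
        self._live = {name: dict(model) for name, model in repository.items()}

    def get(self, name: str) -> dict:
        # Web UI users can only see what has been deployed.
        if name not in self._live:
            raise LookupError(f"{name!r} has not been deployed to the MDM Server")
        return self._live[name]


# The MDM Repository is modeled here as a plain dict holding a working copy.
repository = {"Customer": {"key": "id"}}
server = MdmServer()
server.deploy(repository)
repository["Customer"]["key"] = "customer_id"  # edit the working copy only
print(server.get("Customer"))  # still the deployed version
```

The point of the sketch is the ordering: a change made in the repository is not visible on the server until it is deployed again.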

Deploying and executing

The Artifact Repository gray circle represents the artifact repository, which stores all the:
  • Software Updates available for download.
  • Routes and Services that are published from Talend Studio and are ready to be deployed and executed in Talend Runtime.
The TALEND EXECUTION SERVERS block represents the execution servers that run technical processes according to the execution scheduling set up in the Talend Administration Center Web application. These execution servers can be:
  • One or more Talend Runtimes (execution container) deployed inside your information system. The Talend Runtime deploys and executes the technical processes according to the setup defined in the Talend Administration Center Web application. Those processes are Jobs built in Talend Studio and centralized on the Git server, and Routes and Services retrieved from the artifact repository.

    If you have several Talend Runtime instances on which to deploy Service and Route artifacts, you can load balance their execution according to your needs. All Talend Runtime instances communicate with each other through the Service Locator to identify the one most likely to deploy and execute the artifacts set for deployment in Talend Administration Center. The elected Talend Runtime requests the artifacts from the artifact repository, which sends them along with all the dependencies needed for their execution. The Talend Runtime then deploys and executes them.

  • One or more Talend JobServers deployed inside your information system that run technical processes (Jobs) according to the time, date, or event scheduled in the Talend Administration Center Web application.

    The end-user can transfer technical processes to a remote execution server directly from Talend Studio (distant run).

    Important:

    For an execution server to become operational, you must install the Talend JobServer files ("Agent"), delivered by Talend, on it.

    For more information, see Installing and configuring your Talend JobServer.
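The load balancing described in this section for several Talend Runtime instances can be sketched as follows. This is a simplified model, not the Service Locator's actual logic: the page does not state how the most suitable instance is chosen, so the sketch assumes the available instance with the fewest active artifacts is elected. The names `Runtime`, `elect_runtime`, and `deploy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Runtime:
    """Hypothetical stand-in for one Talend Runtime instance."""
    name: str
    available: bool
    active_artifacts: int


def elect_runtime(runtimes):
    # ASSUMPTION: "most likely to deploy and execute" is modeled as
    # "available and currently running the fewest artifacts".
    candidates = [r for r in runtimes if r.available]
    if not candidates:
        raise RuntimeError("no Talend Runtime instance is available")
    return min(candidates, key=lambda r: r.active_artifacts)


def deploy(artifact, dependencies, runtimes):
    # The elected instance requests the artifact; the artifact repository
    # answers with the artifact plus the dependencies it needs.
    elected = elect_runtime(runtimes)
    elected.active_artifacts += 1
    return {
        "runtime": elected.name,
        "artifact": artifact,
        "dependencies": list(dependencies),
    }


runtimes = [
    Runtime("runtime-a", available=True, active_artifacts=3),
    Runtime("runtime-b", available=True, active_artifacts=1),
    Runtime("runtime-c", available=False, active_artifacts=0),
]
print(deploy("my-route", ["dep-1"], runtimes))
```

Any other election criterion could be swapped into `elect_runtime` without changing the request-and-deliver step in `deploy`.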

Monitoring

The Monitoring circle represents the monitoring components: Talend Activity Monitoring Console and Service Activity Monitoring.

Talend Activity Monitoring Console allows end-users to monitor the execution of technical processes. It provides detailed monitoring capabilities that can be used to consolidate collected log information, understand the interaction between underlying data flows, prevent faults that could be unexpectedly generated, and support system management decisions. For more information on Talend Activity Monitoring Console, see Talend Activity Monitoring Console User Guide.

Service Activity Monitoring allows end-users to monitor service calls. It provides monitoring and consolidated event information that can be used to understand the underlying requests and replies that compose an event, monitor faults that may be unexpectedly generated, and support system management decisions. For more information on Service Activity Monitoring, see Accessing Service Activity Monitoring.
