
Configuring affinity and nodeSelector rules for Dynamic Engine

Use nodeSelector and affinity rules in Dynamic Engine Helm charts to control pod scheduling across different node groups in your Kubernetes cluster.

Starting with Dynamic Engine v1.1.0, you can define affinity rules to isolate services on dedicated nodes for security, performance, or resource optimization purposes. For example, you can reserve a group of large nodes for resource-intensive pod runs.

Dynamic Engine v1.1.0 supports both nodeSelector (simple label-based selection) and affinity rules (complex scheduling logic) in Helm values files.

Important: This customization can be applied during the initial chart installation or during an upgrade.

Before you begin

  • The dynamic-engine-crd custom resource definitions must have been installed using the oci://ghcr.io/talend/helm/dynamic-engine-crd Helm chart. If they have not, run the following commands to install them:
    1. Find the chart version to use, in one of the following ways:
      • Run the following Helm command:
        helm show chart oci://ghcr.io/talend/helm/dynamic-engine-crd --version <engine_version>
      • Check the version directly in Talend Management Console, or check the Dynamic Engine changelog for the chart version included in your Dynamic Engine version.
      • Use an API call to the Dynamic Engine version endpoint.
    2. Run the following command to install the Helm chart of a given version:
      helm install dynamic-engine-crd oci://ghcr.io/talend/helm/dynamic-engine-crd --version <helm_chart_version>
      Replace <helm_chart_version> with the chart version supported by your Dynamic Engine version.

      If you do not specify the version, the latest available dynamic-engine-crd chart version is installed.

  • Your Kubernetes cluster must be configured with multiple node groups (node pools) with appropriate labels. See the related documentation from your cloud provider on how to set up this configuration; for example, if you are using Kubernetes with GKE, see Create and manage cluster and node pool labels. A minimal example of labeling nodes manually is shown after this list.
  • You must have basic knowledge of Kubernetes affinity rules.
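
For reference, the following commands show a minimal way to apply such labels to individual nodes with kubectl. In most managed clusters, you would typically set these labels on the node group (node pool) itself so that every node in the group inherits them; the node name and label values below are placeholders:

  kubectl label nodes <node-name> instanceType=medium
  kubectl label nodes <node-name> instanceType=large qlik-dynamic-engine-tasks=true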

About this task

By default, Dynamic Engine pods are scheduled on any available nodes in your cluster. With node affinity and node selector rules, you can control pod placement to achieve specific objectives such as:

  • Isolate permanent services on smaller, dedicated nodes
  • Run workloads like Jobs, Routes, or Data Services on larger nodes with particular instance types
The example node labels used in this procedure are listed below; a quick way to check that they exist in your cluster is shown after the list:
  • instanceType=large
  • instanceType=medium
  • qlik-dynamic-engine-tasks=true
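
To confirm that nodes with these labels are available before you start, you can list the nodes with the label values shown as columns:

  kubectl get nodes -L instanceType,qlik-dynamic-engine-tasks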

Procedure

  1. On the machine from which you manage your Kubernetes cluster, unzip the Helm deployment zip file that you previously downloaded.
  2. Create a custom values file using nodeSelector for global scheduling rules.

    Example

    In this example, a global rule is defined to run permanent services for infrastructure on medium nodes. This rule schedules all Dynamic Engine and environment pods on nodes labeled instanceType=medium:
    cat <<EOF > custom-medium-instance-for-infrastructure-values.yaml
    global:
      nodeSelector:
        instanceType: medium
    EOF
    The nodeSelector field is suitable for simple exact matches; it does not support logical expressions.
  3. Alternatively, create a custom values file using affinity rules to achieve the same result with more advanced scheduling logic.

    Example

    This example achieves the same result as the nodeSelector approach but using nodeAffinity:
    cat <<EOF > custom-medium-instance-for-infrastructure-values.yaml
    global:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: instanceType
                    operator: In
                    values:
                      - medium
    EOF
    Tip: nodeSelector and affinity rules can be combined. If you specify both, a node must satisfy both rules for the pod to be scheduled on it. A combined example is shown after this procedure.
  4. Deploy the Dynamic Engine and Dynamic Engine environment with your global scheduling rules.
    helm upgrade --install dynamic-engine-$DYNAMIC_ENGINE_ID -f $DYNAMIC_ENGINE_ID-values.yaml  \
     -f custom-medium-instance-for-infrastructure-values.yaml \
     oci://myregistry.example.com/docker-ghcr-io-remote/talend/helm/dynamic-engine \
     --version $DYNAMIC_ENGINE_VERSION
    
    helm upgrade --install dynamic-engine-environment-$DYNAMIC_ENGINE_ENVIRONMENT_ID -f $DYNAMIC_ENGINE_ENVIRONMENT_ID-values.yaml  \
     -f custom-medium-instance-for-infrastructure-values.yaml \
     oci://myregistry.example.com/docker-ghcr-io-remote/talend/helm/dynamic-engine-environment \
     --version $DYNAMIC_ENGINE_VERSION
  5. For workload-specific scheduling, create another custom values file that constrains pod runs to a large node group.

    Example

    In this example, a specific rule is defined to constrain data integration tasks and Data Service and Route tasks to large nodes with the qlik-dynamic-engine-tasks=true label:
    cat <<EOF > custom-jobDeployment-dataServiceRouteDeployment-nodeSelector-values.yaml
    configuration:
      jobDeployment:
        nodeSelector:
          instanceType: large
          qlik-dynamic-engine-tasks: "true"
      dataServiceRouteDeployment:
        nodeSelector:
          instanceType: large
          qlik-dynamic-engine-tasks: "true"
    EOF
  6. Deploy the Dynamic Engine environment with workload-specific scheduling rules.

    Example

    This workload rule is specific to the task processing services, so it is only applied to a Dynamic Engine environment.
    helm upgrade --install dynamic-engine-environment-$DYNAMIC_ENGINE_ENVIRONMENT_ID \
     -f $DYNAMIC_ENGINE_ENVIRONMENT_ID-values.yaml  \
     -f custom-jobDeployment-dataServiceRouteDeployment-nodeSelector-values.yaml \
     oci://myregistry.example.com/docker-ghcr-io-remote/talend/helm/dynamic-engine-environment \
     --version $DYNAMIC_ENGINE_VERSION
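
As noted in the tip above, nodeSelector and affinity rules can be combined. The following values file is a sketch of such a combination, assuming both keys can be set under the same global section, as each is set individually in the earlier examples; the file name and label values are illustrative only. A pod is scheduled only on a node that satisfies both the nodeSelector and the affinity rule:
  cat <<EOF > custom-combined-scheduling-values.yaml
  global:
    nodeSelector:
      qlik-dynamic-engine-tasks: "true"
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: instanceType
                  operator: In
                  values:
                    - medium
                    - large
  EOF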

Results

Once deployed successfully, all Dynamic Engine and environment pods are scheduled according to your custom affinity and nodeSelector rules. Permanent services for the infrastructure run on their designated nodes, and customer workloads run on nodes optimized for their resource requirements.
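
To confirm the placement, you can list the pods together with the nodes they were scheduled on; the namespace is a placeholder for the namespace of your Dynamic Engine or environment:

  kubectl get pods -n <namespace> -o wide

The NODE column of the output shows the node each pod runs on; it should match a node carrying the labels defined in your rules.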

Troubleshooting:

If a pod (or pods) cannot be scheduled on a node, use the following steps to diagnose and resolve the issue.

  1. Identify the scheduling failure:

    Use the kubectl describe pod command to get detailed information about the scheduling error:

    kubectl describe pod <pod-name> -n <namespace>

    Look for the Events section, which provides clues about why the scheduler failed. Common reasons include:

    • Rules not matching any nodes
    • Insufficient CPU or memory resources on available nodes
    • Taint and toleration issues on nodes that prevent pod scheduling
  2. Verify rules:

    Ensure that the nodes have the required labels specified in your rules:

    kubectl get nodes --show-labels

    Compare the labels on your nodes with the labels specified in your values files. See the example commands after these troubleshooting steps.

  3. Verify resource constraints:

    One of the most common reasons for scheduling failures is insufficient resources (CPU, memory). Compare the resource requests of your pods with the allocatable capacity and current allocations of your nodes:

    kubectl describe nodes

    If your cluster does not have enough nodes to satisfy resource requests, scale up your cluster by adding more nodes.

  4. Verify the taint and toleration specifications on the designated nodes.

    Taints can prevent nodes from scheduling certain pods. If your nodes have taints and your pods do not have corresponding tolerations, pods cannot be scheduled. For further information about Kubernetes taints and tolerations, see Taints and Tolerations.

    Dynamic Engine does not currently support taints and tolerations in Helm charts. If you have tainted nodes, you must either remove the taints or use a different node group for Dynamic Engine deployment.
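
For example, the following commands list the nodes that match the workload-specific rule used in this procedure and show whether a given node carries taints; the label values and node name are placeholders matching the earlier examples:

  kubectl get nodes -l instanceType=large,qlik-dynamic-engine-tasks=true
  kubectl describe node <node-name> | grep Taints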
