Configuring the engine memory and the number of concurrent executions
The Remote Engine Gen2 is configured to run with 8 GB of memory allocated to it.
This limits the number of pipeline and preparation executions that can run
concurrently on the engine.
Execution error
If the engine receives too many execution requests, it typically accepts some of the
pipeline executions and rejects some of the most recent ones. In that case, you
get the following error:
Cannot submit pipeline <PIPELINE_NAME>, too many Livy sessions are used.
where <PIPELINE_NAME> is the name you gave to your pipeline.
For safety reasons, the number of concurrent pipeline executions is limited, but this limit is configurable for the Remote Engine Gen2.
Configure the number of allowed concurrent executions
To change this limit, open one of the following files for editing:
- <engine_directory>/default/.env if you are using the engine in the AWS USA, AWS Europe, AWS Asia-Pacific or Azure regions.
- <engine_directory>/eap/.env if you are using the engine as part of the Early Adopter Program.
Then locate the following line:
LIVY_SERVER_SESSION_MAX_CREATION=<NB_SLOTS>
Depending on the resources available on the machine your engine is running on, you might
want to change this value. The value is derived from the following formula, which ensures
that only a certain amount of memory is dedicated to running pipelines and that the rest
remains available for the other services of the engine:
LIVY_SERVER_SESSION_MAX_CREATION=(memory - 4)/spark.driver.memory
where memory corresponds to the memory allocated to the engine, 4 corresponds to the 4GB of memory necessary for other services of the engine, and spark.driver.memory corresponds to the memory allocated to each pipeline execution (1GB by default).
The spark.driver.memory default value can be changed by adding the parameter and value in the Advanced configuration section of the Add run profile form in Talend Management Console.
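The formula can be sketched as a small Python helper, which may be convenient when sizing several engines. This is only an illustration: the function name max_concurrent_executions and its parameter names are hypothetical, and rounding down to a whole number of slots is an assumption, since the formula does not state how to handle a result that is not an integer.

```python
def max_concurrent_executions(
    engine_memory_gb: float,
    driver_memory_gb: float = 1.0,  # spark.driver.memory, 1 GB by default
    reserved_gb: float = 4.0,       # memory kept for the other engine services
) -> int:
    """Apply (memory - 4) / spark.driver.memory and return a whole slot count."""
    if driver_memory_gb <= 0:
        raise ValueError("driver_memory_gb must be positive")
    available_gb = engine_memory_gb - reserved_gb
    if available_gb <= 0:
        # Nothing is left for pipelines once the other services are served.
        return 0
    # Assumption: round down so the slots never exceed the available memory.
    return int(available_gb // driver_memory_gb)


# An 8 GB engine with the default 1 GB driver memory.
print(max_concurrent_executions(8))     # 4
# The same engine with 4 GB allocated to each pipeline execution.
print(max_concurrent_executions(8, 4))  # 1
```

The returned number is the value to use for LIVY_SERVER_SESSION_MAX_CREATION.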
Example:
You have installed the engine in a Docker environment that has 8 GB
of memory and you want to allocate 4 GB to the Spark driver memory.
The formula gives
(8-4)/4=1:
LIVY_SERVER_SESSION_MAX_CREATION=1
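If you prefer to script the change rather than edit the file by hand, the following Python sketch shows one possible way to update the line in the content of a .env file. The helper name set_max_sessions is hypothetical and not part of the engine. It works on text only, so reading and writing the actual file under <engine_directory> is left to you.

```python
import re

KEY = "LIVY_SERVER_SESSION_MAX_CREATION"


def set_max_sessions(env_text: str, slots: int) -> str:
    """Return env_text with the KEY line set to the given number of slots."""
    new_line = f"{KEY}={slots}"
    # Match the whole existing "KEY=..." line, wherever it is in the file.
    pattern = re.compile(rf"^{KEY}=.*$", flags=re.MULTILINE)
    if pattern.search(env_text):
        return pattern.sub(new_line, env_text, count=1)
    # Assumption: if the key is missing, append it at the end of the file.
    separator = "" if not env_text or env_text.endswith("\n") else "\n"
    return f"{env_text}{separator}{new_line}\n"


# OTHER_SETTING is a made-up key used only to show that other lines are kept.
sample = "OTHER_SETTING=abc\nLIVY_SERVER_SESSION_MAX_CREATION=4\n"
print(set_max_sessions(sample, 1))
```

Every other line of the file is left untouched, which keeps the rest of the engine configuration as it was.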