Version: 2.14.0

FLOWX Engine setup guide

Introduction

This guide will provide instructions on how to set up and configure the FLOWX Engine to meet your specific requirements.

Infrastructure prerequisites

The FLOWX Engine requires the following components to be set up before it can be started:

  • Docker engine - version 17.06 or higher
  • Kafka - version 2.8 or higher
  • Elasticsearch - version 7.11.0 or higher
  • DB instance

Dependencies

In a microservices architecture, some microservices hold their data individually, using separate databases.

Database

A basic Postgres configuration can be set up using a helm values.yaml file as follows:

  • helm values.yaml:

    onboardingdb:
      existingSecret: {{secretName}}
      metrics:
        enabled: true
        service:
          annotations:
            prometheus.io/port: {{prometheus port}}
            prometheus.io/scrape: "true"
          type: ClusterIP
        serviceMonitor:
          additionalLabels:
            release: prometheus-operator
          enabled: true
          interval: 30s
          scrapeTimeout: 10s
      persistence:
        enabled: true
        size: 1Gi
      postgresqlDatabase: onboarding
      postgresqlExtendedConf:
        maxConnections: 200
        sharedBuffers: 128MB
      postgresqlUsername: postgres
      resources:
        limits:
          cpu: 6000m
          memory: 2048Mi
        requests:
          cpu: 200m
          memory: 512Mi
  • Redis server - a Redis cluster is required for the engine to cache process definitions, compiled scripts, and Kafka responses

  • Kafka cluster - Kafka is the backbone of the engine and all plugins and integrations are accessed via the Kafka broker

  • Additional dependencies - details about how to set up logging via Elasticsearch, and monitoring and tracing via Jaeger, can be found here

Configuration

Configuring Kafka

Kafka handles all communication between the FLOWX Engine and external plugins and integrations. It is also used for notifying running process instances when certain events occur.

Both a producer and a consumer must be configured. The following Kafka-related configurations can be set by using environment variables:

  • SPRING_KAFKA_BOOTSTRAP_SERVERS - the address of the Kafka server; it should be in the format "host:port"

  • KAFKA_AUTH_EXCEPTION_RETRY_INTERVAL - the interval between retries after an AuthorizationException is thrown by the KafkaConsumer

  • KAFKA_MESSAGE_MAX_BYTES - the maximum size of a message that the broker will accept from a producer
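
As a sketch, these variables could be set in a deployment environment like this. The host names and values below are illustrative placeholders, not FLOWX defaults:

```shell
# Illustrative values only - replace with your own cluster details.
export SPRING_KAFKA_BOOTSTRAP_SERVERS="kafka-broker-1:9092,kafka-broker-2:9092"
# Retry interval after an AuthorizationException (the unit is an assumption here;
# check the expected unit for your deployment).
export KAFKA_AUTH_EXCEPTION_RETRY_INTERVAL="10"
# Allow messages up to ~50 MB (value in bytes).
export KAFKA_MESSAGE_MAX_BYTES="52428800"
```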

Consumer groups & consumer threads

In Kafka a consumer group is a group of consumers that jointly consume and process messages from one or more Kafka topics. Each consumer group has a unique identifier called a group ID, which is used by Kafka to manage message consumption and distribution among the members of the group.

Thread numbers, on the other hand, refer to the number of threads that a consumer application uses to process messages from Kafka. By default, each consumer instance runs in a single thread, which can limit the throughput of message processing. Increasing the number of consumer threads can help to improve the parallelism and efficiency of message consumption, especially when dealing with high message volumes.

Both group IDs and thread numbers can be configured in Kafka to optimize the processing of messages according to specific requirements, such as message volume, message type, and processing latency.

The configuration related to consumers (group IDs and thread numbers) can be set separately for each message type as follows:

  • KAFKA_CONSUMER_GROUP_ID_NOTIFY_ADVANCE - related to a Kafka consumer group that receives messages related to notifying advance actions, it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_GROUP_ID_NOTIFY_PARENT - related to a Kafka consumer group that receives messages related to notifying when a subprocess is blocked and it sends a notification to the parent process, it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_GROUP_ID_ADAPTERS - related to a Kafka consumer group that receives messages related to adapters, it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_GROUP_ID_SCHEDULER_RUN_ACTION - related to a Kafka consumer group that receives messages related to requests to run scheduled actions, it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_GROUP_ID_SCHEDULER_ADVANCING - related to a Kafka consumer group that receives messages sent by the scheduler when a timeout expires, indicating that advancing should continue (parallel advancing); it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_GROUP_ID_PROCESS_START - related to a Kafka consumer group that receives messages related to starting processes, it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_GROUP_ID_PROCESS_EXPIRE - related to expiring processes, it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_GROUP_ID_PROCESS_OPERATIONS - related to a Kafka consumer group that receives messages related to processing operations from task management, it is used to configure the group ID for this consumer group

  • KAFKA_CONSUMER_THREADS_NOTIFY_ADVANCE - the number of threads used by the Kafka consumer that processes notify-advance messages

  • KAFKA_CONSUMER_THREADS_NOTIFY_PARENT - the number of threads used by the Kafka consumer that notifies the parent process when a subprocess is blocked

  • KAFKA_CONSUMER_THREADS_ADAPTERS - the number of threads used by the Kafka consumer related to adapters

  • KAFKA_CONSUMER_THREADS_SCHEDULER_RUN_ACTION - the number of threads used by a Kafka consumer application, related to requests to run scheduled actions

  • KAFKA_CONSUMER_THREADS_SCHEDULER_ADVANCING - the number of threads used by the Kafka consumer that processes messages sent by the scheduler when a timeout expires, indicating that advancing should continue (parallel advancing)

  • KAFKA_CONSUMER_THREADS_PROCESS_START - the number of threads used by a Kafka consumer application, related to starting processes

  • KAFKA_CONSUMER_THREADS_PROCESS_EXPIRE - the number of threads used by a Kafka consumer application, related to expiring processes

  • KAFKA_CONSUMER_THREADS_PROCESS_OPERATIONS - the number of threads used by a Kafka consumer application, related to processing operations from task management

Default parameter (env var)                    Default FLOWX.AI value (can be overwritten)
KAFKA_CONSUMER_GROUP_ID_NOTIFY_ADVANCE         notif123-preview
KAFKA_CONSUMER_GROUP_ID_NOTIFY_PARENT          notif123-preview
KAFKA_CONSUMER_GROUP_ID_ADAPTERS               notif123-preview
KAFKA_CONSUMER_GROUP_ID_SCHEDULER_RUN_ACTION   notif123-preview
KAFKA_CONSUMER_GROUP_ID_PROCESS_START          notif123-preview
KAFKA_CONSUMER_GROUP_ID_PROCESS_EXPIRE         notif123-preview
KAFKA_CONSUMER_GROUP_ID_PROCESS_OPERATIONS     notif123-preview
KAFKA_CONSUMER_THREADS_NOTIFY_ADVANCE          6
KAFKA_CONSUMER_THREADS_NOTIFY_PARENT           6
KAFKA_CONSUMER_THREADS_ADAPTERS                6
KAFKA_CONSUMER_THREADS_SCHEDULER_RUN_ACTION    6
KAFKA_CONSUMER_THREADS_PROCESS_START           6
KAFKA_CONSUMER_THREADS_PROCESS_EXPIRE          6
KAFKA_CONSUMER_THREADS_PROCESS_OPERATIONS      6
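
For instance, overriding a group ID and raising a thread count for a high-volume deployment might look like this. The values are examples, not recommendations:

```shell
# Illustrative overrides - variable names match the list above, values are examples.
export KAFKA_CONSUMER_GROUP_ID_PROCESS_START="flowx-engine-process-start"
# Raise the thread count from the default of 6 to handle a higher message volume.
export KAFKA_CONSUMER_THREADS_PROCESS_START="12"
```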

It is important to know that all the events that start with a configured pattern will be consumed by the engine. This makes it possible to create a new integration and connect it to the engine without changing the configuration of the engine.
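
As a sketch, with a pattern like the one below (an illustrative value, not a default), any topic whose name matches the pattern is picked up automatically:

```shell
# Illustrative pattern - the engine consumes every topic whose name matches it,
# so a new integration only needs to publish to a matching topic; no engine
# reconfiguration is required.
export KAFKA_TOPIC_PATTERN="ai.flowx.dev.engine.receive.*"
# e.g. a new integration could publish to a topic such as:
#   ai.flowx.dev.engine.receive.my-new-integration.v1   (hypothetical name)
```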

  • KAFKA_TOPIC_PROCESS_NOTIFY_ADVANCE - Kafka topic used internally by the engine

  • KAFKA_TOPIC_PROCESS_NOTIFY_PARENT - topic used for sub-processes to notify parent process when finished

  • KAFKA_TOPIC_PATTERN - the topic name pattern that the Engine listens on for incoming Kafka events

  • KAFKA_TOPIC_LICENSE_OUT - the topic name used by the Engine to generate licensing-related details

Default parameter (env var)          Default FLOWX.AI value (can be overwritten)
KAFKA_TOPIC_PROCESS_NOTIFY_ADVANCE   paperflow-process-notify
KAFKA_TOPIC_PROCESS_NOTIFY_PARENT    flowx-process-parent-notify
KAFKA_TOPIC_PATTERN                  ro.flowx.updates.qa-.*
KAFKA_TOPIC_LICENSE_OUT              ai.flowx.license
  • KAFKA_TOPIC_TASK_OUT - used for sending notifications to the plugin

  • KAFKA_TOPIC_PROCESS_OPERATIONS_IN - used for receiving calls from the task management plugin

Default parameter (env var)       Default FLOWX.AI value (can be overwritten)
KAFKA_TOPIC_TASK_OUT              ai.flowx.task.in
KAFKA_TOPIC_PROCESS_OPERATIONS_IN ai.flowx.process.operations
» Scheduler
  • KAFKA_TOPIC_PROCESS_EXPIRE_IN - the topic name that the Engine listens on for requests to expire processes

  • KAFKA_TOPIC_PROCESS_SCHEDULE_OUT_SET - the topic name used by the Engine to schedule a process expiration

  • KAFKA_TOPIC_PROCESS_SCHEDULE_OUT_STOP - the topic name used by the Engine to stop a process expiration

  • KAFKA_TOPIC_PROCESS_SCHEDULE_IN_RUN_ACTION - the topic name that the Engine listens on for requests to run scheduled actions

  • KAFKA_TOPIC_PROCESS_SCHEDULE_IN_ADVANCE - the topic name that the Engine listens on for messages sent by the scheduler related to continuing the advancement of processes

Default parameter (env var)                  Default FLOWX.AI value (can be overwritten)
KAFKA_TOPIC_PROCESS_EXPIRE_IN                ai.flowx.process.expire
KAFKA_TOPIC_PROCESS_SCHEDULE_OUT_SET         ai.flowx.in.schedule.message.v1
KAFKA_TOPIC_PROCESS_SCHEDULE_OUT_STOP        ai.flowx.in.stop.scheduled.message.v1
KAFKA_TOPIC_PROCESS_SCHEDULE_IN_RUN_ACTION   ai.flowx.action.run
KAFKA_TOPIC_PROCESS_SCHEDULE_IN_ADVANCE      ai.flowx.schedule.advancing
» Using the scheduler
  • KAFKA_TOPIC_DATA_SEARCH_IN - the topic name that the Engine listens on for requests to search for processes

  • KAFKA_TOPIC_DATA_SEARCH_OUT - the topic name used by the Engine to reply after finding a process

Default parameter (env var)   Default FLOWX.AI value (can be overwritten)
KAFKA_TOPIC_DATA_SEARCH_IN    ai.flowx.dev.core.trigger.search.data.v1
KAFKA_TOPIC_DATA_SEARCH_OUT   ai.flowx.dev.engine.receive.core.search.data.results.v1
  • KAFKA_TOPIC_AUDIT_OUT - topic key for sending audit logs

Default parameter (env var)   Default FLOWX.AI value (can be overwritten)
KAFKA_TOPIC_AUDIT_OUT         ai.flowx.audit.log

Processes that can be started by sending messages to a Kafka topic

  • KAFKA_TOPIC_PROCESS_START_IN - the Engine listens on this topic for requests to start a new process instance

  • KAFKA_TOPIC_PROCESS_START_OUT - used for sending out the reply after starting a new process instance

Default parameter (env var)     Default FLOWX.AI value (can be overwritten)
KAFKA_TOPIC_PROCESS_START_IN    ai.flowx.in.start.process
KAFKA_TOPIC_PROCESS_START_OUT   ai.flowx.out.start.process

Default parameter (env var)     Default FLOWX.AI value (can be overwritten)
KAFKA_TOPIC_PROCESS_INDEX_OUT   ai.flowx.dev.core.index.process.v1
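
As a hypothetical sketch, a request could be published to the default start topic with the standard Kafka console producer. The broker address is a placeholder, and the exact message payload expected on this topic is not documented here, so consult the FLOWX.AI docs for the real schema:

```shell
# Hypothetical example - broker address is a placeholder; the payload schema
# must match what the FLOWX Engine expects on this topic.
kafka-console-producer.sh \
  --bootstrap-server kafka-broker-1:9092 \
  --topic ai.flowx.in.start.process
```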

Configuring WebSockets

The engine communicates with the frontend application via WebSockets. The following environment variables need to be configured for the socket server connection details:

  • WEB_SOCKET_SERVER_URL_EXTERNAL - the external URL of the WebSocket server

  • WEB_SOCKET_SERVER_PORT - the port on which the WebSocket server is running

  • WEB_SOCKET_SERVER_PATH - the WebSocket server path
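
A minimal sketch of these settings, with placeholder URL, port, and path values (adjust to your own ingress and deployment):

```shell
# Illustrative values only - not FLOWX defaults.
export WEB_SOCKET_SERVER_URL_EXTERNAL="wss://flowx.example.com"
export WEB_SOCKET_SERVER_PORT="8080"
export WEB_SOCKET_SERVER_PATH="/socket"
```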

Configuring file upload size

The maximum file size allowed for uploads can be set by using the following environment variables:

  • SPRING_SERVLET_MULTIPART_MAX_FILE_SIZE - maximum file size allowed for uploads

  • SPRING_SERVLET_MULTIPART_MAX_REQUEST_SIZE - maximum request size allowed for uploads
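
For example, limits might be set like this, using Spring's data-size format. The numbers are illustrative; note the request limit should be at least as large as the file limit, since a request contains the file plus any other form data:

```shell
# Illustrative limits: single files up to 50 MB, whole requests up to 55 MB.
export SPRING_SERVLET_MULTIPART_MAX_FILE_SIZE="50MB"
export SPRING_SERVLET_MULTIPART_MAX_REQUEST_SIZE="55MB"
```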

Configuring Advancing controller

To use the Advancing controller, the following environment variables are needed for the process engine to connect to the Advancing Postgres database:

  • ADVANCING_DATASOURCE_JDBC_URL - environment variable used to configure a JDBC (Java Database Connectivity) data source; it specifies the connection URL for a particular database, including the server, port, database name, and any other necessary connection parameters

  • ADVANCING_DATASOURCE_USERNAME - environment variable used to authenticate the user access to the data source

  • ADVANCING_DATASOURCE_PASSWORD - environment variable used to set the password for a data source connection
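
A minimal sketch of these settings, using the standard JDBC URL format for PostgreSQL. The host, database name, and credentials below are placeholders:

```shell
# Illustrative values - host, database name, and credentials are placeholders.
export ADVANCING_DATASOURCE_JDBC_URL="jdbc:postgresql://postgres-advancing:5432/advancing"
export ADVANCING_DATASOURCE_USERNAME="flowx"
export ADVANCING_DATASOURCE_PASSWORD="change-me"
```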

» Advancing controller setup

Configuring Scheduler

Below you can find a configuration .yaml to use scheduler service together with FLOWX Engine:

scheduler:
  processCleanup:
    enabled: false
    cronExpression: 0 */5 0-5 * * ? # every day during the night, every 5 minutes, at the start of the minute
    batchSize: 1000
  masterElection:
    cronExpression: 30 */3 * * * ? # master election every 3 minutes
  websocket:
    namespace:
      cronExpression: 0 * * * * *
      expireMinutes: 30

Each of the configuration options above is described below:

  • processCleanup - a configuration for cleaning up processes:
      • enabled - specifies whether this feature is turned on or off.
      • cronExpression - a schedule expression that determines when the cleanup process runs. In this case, it runs every day during the night (between 12:00 AM and 5:59 AM), every 5 minutes, at the start of the minute.
      • batchSize - the number of processes to be cleaned up in one batch.
  • masterElection - a configuration for electing a master.
  • websocket - a configuration for WebSocket connections:
      • expireMinutes - how long the WebSocket namespace is valid for (30 minutes in this case).
