Deployment and Maintenance / High Availability Deployment
DataFlux Func supports multi-instance deployment to meet high availability requirements.
This document describes how to install and deploy DataFlux Func for high availability directly on servers.
- To install DataFlux Func with Helm on Kubernetes (k8s), see Deployment and Maintenance / Installation and Deployment / Helm Deployment
- For scaling the DataFlux Func system, see Deployment and Maintenance / Architecture, Scaling, and Resource Limiting
- For the detailed function execution process, see Script Development / Function Execution Process
When choosing a high availability solution for Redis, do not use 'Redis Cluster'. You can use 'Redis Master-Slave' instead.
If you previously installed DataFlux Func in single-machine mode and are switching to a high-availability deployment, migrate your data as described in Deployment and Maintenance / Backup and Migration / Database Migration.
1. Multi-Replica Deployment
Both the Server and Worker services of DataFlux Func support multi-replica deployment to meet high availability, scaling, and similar requirements.
Generally, the bottleneck in function execution efficiency lies with the Worker service (i.e., Python code). The Server service therefore only needs enough replicas to avoid a single point of failure, while the Worker service should be scaled out by adding replicas according to actual business volume.
In a multi-replica deployment, it is necessary to ensure that the user-config.yaml file content is identical across all services, that they all connect to the same MySQL and Redis setup, and that the resource directory is mounted to the same storage.
The Beat service, which triggers scheduled tasks, must run as exactly 1 replica; running more than one may generate duplicate scheduled tasks.
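The requirement that user-config.yaml be identical across all services can be checked before starting the replicas. The following is a minimal sketch, not part of DataFlux Func itself: it assumes you have already collected a copy of each replica's user-config.yaml locally (the replica names and `collected/...` paths are hypothetical), and it compares SHA-256 fingerprints of the file contents.

```python
import hashlib
from pathlib import Path


def config_fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the config text, ignoring line-ending differences."""
    normalized = text.replace("\r\n", "\n").strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_mismatched_configs(configs: dict[str, str]) -> list[str]:
    """Return the names of replicas whose config differs from the first one listed."""
    if not configs:
        return []
    names = list(configs)
    reference = config_fingerprint(configs[names[0]])
    return [name for name in names[1:] if config_fingerprint(configs[name]) != reference]


if __name__ == "__main__":
    # Hypothetical local copies of each replica's user-config.yaml,
    # collected beforehand (for example with scp); adjust to your setup.
    copies = {
        "server-1": Path("collected/server-1/user-config.yaml"),
        "server-2": Path("collected/server-2/user-config.yaml"),
        "worker-1": Path("collected/worker-1/user-config.yaml"),
    }
    configs = {
        name: path.read_text(encoding="utf-8")
        for name, path in copies.items()
        if path.exists()
    }
    # An empty list means every collected copy matches the first one.
    print(find_mismatched_configs(configs))
```

Comparing fingerprints rather than parsed YAML keeps the check free of third-party dependencies. The trade-off is that differences in comments or key order are also reported as mismatches.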
flowchart TB
USER[User]
SERVER_1[Server 1]
SERVER_2[Server 2]
WORKER_1[Worker 1]
WORKER_2[Worker 2]
WORKER_3[Worker 3]
BEAT[Beat]
REDIS[Redis]
USER --HTTP Request--> SLB
SLB --HTTP Forwarding--> SERVER_1
SLB --HTTP Forwarding--> SERVER_2
SERVER_1 --Enqueue Function Execution Tasks--> REDIS
SERVER_2 --> REDIS
REDIS --Dequeue Function Execution Tasks--> WORKER_1
REDIS --Dequeue Function Execution Tasks--> WORKER_2
REDIS --Dequeue Function Execution Tasks--> WORKER_3
BEAT --"Enqueue Function Execution Tasks\n(Scheduled)"--> REDIS
2. Fully Independent Primary-Backup Deployment
Whether this deployment method truly qualifies as 'High Availability' is set aside here; assume that such a deployment is genuinely required.
A fully independent primary-backup deployment means deploying two completely independent DataFlux Func installations, with identical Secret, MySQL, and Redis settings in their user-config.yaml files.
Because the primary and backup DataFlux Func operate independently, the Beat service on each node triggers scheduled tasks in its own environment, so every scheduled task is triggered twice.
To avoid this issue, you can shut down the DataFlux Func on the backup node during normal operation, or write logic within your scripts to handle and prevent duplicate task execution.
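The second option, handling duplicates inside your scripts, could be sketched as follows. This is an illustration under stated assumptions, not a DataFlux Func API: it assumes both nodes can reach one shared key-value store, and that the store offers an atomic "set if not exists, with expiry" operation shaped like redis-py's `set(name, value, nx=True, ex=seconds)`. `InMemoryStore`, `claim_scheduled_run`, and the task name are hypothetical. The in-memory class exists only so the sketch runs without Redis.

```python
import time


class InMemoryStore:
    """Stand-in for a shared store; mimics the shape of redis-py's
    set(name, value, nx=..., ex=...) so the sketch runs without Redis."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry timestamp or None)

    def set(self, name, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._data.get(name)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None  # the previous claim has expired
        if nx and current is not None:
            return None  # key is already held, so nothing is written
        expires_at = now + ex if ex is not None else None
        self._data[name] = (value, expires_at)
        return True


def claim_scheduled_run(store, task_name, interval_seconds, node_name, now=None):
    """Return True only for the first caller that claims the current schedule slot."""
    now = time.time() if now is None else now
    # Both nodes compute the same slot number when they fire within one interval.
    slot = int(now // interval_seconds)
    key = f"dedup:{task_name}:{slot}"
    # Keep the claim for two intervals so a late duplicate still sees it.
    return store.set(key, node_name, nx=True, ex=interval_seconds * 2) is True


if __name__ == "__main__":
    store = InMemoryStore()
    # The primary fires first and wins the slot; the backup fires 5 seconds
    # later in the same 60-second slot and should skip the work.
    print(claim_scheduled_run(store, "sync_orders", 60, "primary", now=1_700_000_000))
    print(claim_scheduled_run(store, "sync_orders", 60, "backup", now=1_700_000_005))
```

A scheduled function would call `claim_scheduled_run` first and return early when it gets `False`. The slot calculation relies on the two nodes having roughly synchronized clocks; triggers that land on opposite sides of a slot boundary would not be deduplicated by this sketch.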
flowchart TB
USER[User]
MAIN_NODE_SERVER[Primary Node Server]
MAIN_NODE_WORKER[Primary Node Worker]
MAIN_NODE_BEAT[Primary Node Beat]
MAIN_NODE_REDIS_QUEUE[Primary Node Redis Queue]
BACKUP_NODE_SERVER[Backup Node Server]
BACKUP_NODE_WORKER[Backup Node Worker]
BACKUP_NODE_BEAT[Backup Node Beat]
BACKUP_NODE_REDIS_QUEUE[Backup Node Redis Queue]
USER --HTTP Request--> SLB
SLB --HTTP Forwarding--> MAIN_NODE_SERVER
SLB -.-> BACKUP_NODE_SERVER
subgraph "Backup Node - Shut Down"
direction TB
BACKUP_NODE_SERVER --Enqueue Function Execution Tasks--> BACKUP_NODE_REDIS_QUEUE
BACKUP_NODE_REDIS_QUEUE --Dequeue Function Execution Tasks--> BACKUP_NODE_WORKER
BACKUP_NODE_BEAT --"Enqueue Function Execution Tasks\n(Scheduled)"--> BACKUP_NODE_REDIS_QUEUE
end
subgraph Primary Node
direction TB
MAIN_NODE_SERVER --Enqueue Function Execution Tasks--> MAIN_NODE_REDIS_QUEUE
MAIN_NODE_REDIS_QUEUE --Dequeue Function Execution Tasks--> MAIN_NODE_WORKER
MAIN_NODE_BEAT --"Enqueue Function Execution Tasks\n(Scheduled)"--> MAIN_NODE_REDIS_QUEUE
end