Deployment and Maintenance / High Availability Deployment
DataFlux Func supports multi-replica deployment to meet high availability requirements.
This article mainly introduces how to install and deploy a highly available DataFlux Func directly on servers.
- For information about using Helm to install DataFlux Func in k8s, please refer to Deployment and Maintenance / Installation Deployment / Helm Deployment
- For details about the specific execution process of functions, please refer to Deployment and Maintenance / Function Execution Process
- For information about system scaling of DataFlux Func, please refer to Deployment and Maintenance / Architecture, Scaling, and Resource Limiting
When choosing a high availability solution for Redis, do not use Cluster Edition Redis; Master-Slave Edition Redis is supported.
If you previously installed DataFlux Func on a single machine and are switching to a high availability deployment, see Deployment and Maintenance / Daily Maintenance / Migrating Databases for the migration procedure.
1. Multi-replica Deployment
Both the Server and Worker services of DataFlux Func support running multiple replicas to meet high availability and scaling needs.
Generally, the execution bottleneck for functions lies in the Worker service (i.e., the Python code), so the Server service only needs enough replicas to avoid a single point of failure, while the number of Worker replicas should be increased according to actual business volume.
When deploying multiple replicas, ensure that the user-config.yaml files of all services have exactly the same content, that all services connect to the same MySQL and Redis, and that the resource directory is mounted on the same storage.
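As a sketch, a shared user-config.yaml on every replica might point at the same external MySQL and Redis. The hostnames and values below are illustrative, and the key names follow a typical DataFlux Func install; verify them against the file your own installation generated:

```yaml
# Illustrative user-config.yaml fragment -- identical on every replica.
# Key names are assumptions based on a typical install; check your own file.
SECRET        : 'your-secret'       # must match across all replicas
MYSQL_HOST    : 'mysql.internal'    # shared MySQL instance
MYSQL_PORT    : 3306
MYSQL_USER    : 'func'
MYSQL_PASSWORD: 'your-password'
MYSQL_DATABASE: 'dataflux_func'
REDIS_HOST    : 'redis.internal'    # shared Master-Slave Edition Redis
REDIS_PORT    : 6379
REDIS_DATABASE: 5
```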
At the same time, the Beat service, which triggers scheduled tasks, must run exactly one replica; running more than one causes scheduled tasks to be triggered multiple times.
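In a Docker Swarm stack deployment, these replica counts can be pinned in the stack file. The fragment below is only a sketch; the service names are assumptions and should be matched to the names in your actual stack file:

```yaml
# Illustrative docker-stack.yaml fragment (service names are assumptions).
services:
  server:
    deploy:
      replicas: 2   # enough to avoid a single point of failure
  worker:
    deploy:
      replicas: 3   # scale with actual business volume
  beat:
    deploy:
      replicas: 1   # must stay at 1, or scheduled tasks fire more than once
```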
flowchart TB
USER[User]
SERVER_1[Server 1]
SERVER_2[Server 2]
WORKER_1[Worker 1]
WORKER_2[Worker 2]
WORKER_3[Worker 3]
BEAT[Beat]
SLB[SLB]
REDIS[Redis]
USER --HTTP Request--> SLB
SLB --HTTP Forwarding--> SERVER_1
SLB --HTTP Forwarding--> SERVER_2
SERVER_1 --Function Execution Task Enqueue--> REDIS
SERVER_2 --Function Execution Task Enqueue--> REDIS
REDIS --Function Execution Task Dequeue--> WORKER_1
REDIS --Function Execution Task Dequeue--> WORKER_2
REDIS --Function Execution Task Dequeue--> WORKER_3
BEAT --"Function Execution Task Enqueue\n(Scheduled)"--> REDIS
2. Fully Independent Primary-Backup Deployment
Setting aside whether this deployment method truly qualifies as 'high availability', assume there is indeed such a deployment requirement.
A fully independent primary-backup deployment actually means deploying two completely separate DataFlux Func instances (whose user-config.yaml files contain identical Secret, MySQL, and Redis configurations).
Since the primary and backup DataFlux Func instances run independently, the Beat services on both nodes each trigger scheduled tasks in their own environment, so every scheduled task is triggered twice.
To avoid this, either shut down the DataFlux Func instance on the backup node during normal operation, or add deduplication logic to the scripts themselves.
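One way to script the deduplication, assuming the two nodes can reach some shared coordination store (e.g. a shared Redis, which this deployment mode does not provide by default), is an atomic lock: before running, a scheduled task tries a SET NX with a TTL, and only the node that acquires the key proceeds. The sketch below simulates that semantics with an in-memory stub so it is self-contained; with a real shared Redis you would call `redis.Redis.set(key, value, nx=True, ex=ttl)` instead. The helper name `acquire_trigger_lock` is illustrative, not part of DataFlux Func.

```python
import time

class FakeRedis:
    """In-memory stand-in for a shared Redis, just to show SET NX + TTL semantics."""
    def __init__(self):
        self._store = {}  # key -> (value, expire_at)

    def set(self, key, value, nx=False, ex=None):
        now = time.time()
        current = self._store.get(key)
        if nx and current and current[1] > now:
            return None  # key still alive -> NX set fails, as in real Redis
        self._store[key] = (value, now + (ex or float('inf')))
        return True

def acquire_trigger_lock(redis_client, task_id, node_id, ttl=60):
    # Illustrative helper: only one node wins the lock per trigger window.
    return bool(redis_client.set(f'lock:cron:{task_id}', node_id, nx=True, ex=ttl))

shared_redis = FakeRedis()
primary_won = acquire_trigger_lock(shared_redis, 'task-1', 'primary')
backup_won  = acquire_trigger_lock(shared_redis, 'task-1', 'backup')
print(primary_won, backup_won)  # -> True False: only one node triggers the task
```

With a real shared Redis, the TTL also acts as a safety net: if the winning node crashes mid-run, the lock expires on its own and the next trigger window is not blocked.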
flowchart TB
USER[User]
MAIN_NODE_SERVER[Primary Node Server]
MAIN_NODE_WORKER[Primary Node Worker]
MAIN_NODE_BEAT[Primary Node Beat]
MAIN_NODE_REDIS_QUEUE[Primary Node Redis Queue]
BACKUP_NODE_SERVER[Backup Node Server]
BACKUP_NODE_WORKER[Backup Node Worker]
BACKUP_NODE_BEAT[Backup Node Beat]
BACKUP_NODE_REDIS_QUEUE[Backup Node Redis Queue]
SLB[SLB]
USER --HTTP Request--> SLB
SLB --HTTP Forwarding--> MAIN_NODE_SERVER
SLB -.-> BACKUP_NODE_SERVER
subgraph "Backup Node - Off"
direction TB
BACKUP_NODE_SERVER --Function Execution Task Enqueue--> BACKUP_NODE_REDIS_QUEUE
BACKUP_NODE_REDIS_QUEUE --Function Execution Task Dequeue--> BACKUP_NODE_WORKER
BACKUP_NODE_BEAT --"Function Execution Task Enqueue\n(Scheduled)"--> BACKUP_NODE_REDIS_QUEUE
end
subgraph Primary Node
direction TB
MAIN_NODE_SERVER --Function Execution Task Enqueue--> MAIN_NODE_REDIS_QUEUE
MAIN_NODE_REDIS_QUEUE --Function Execution Task Dequeue--> MAIN_NODE_WORKER
MAIN_NODE_BEAT --"Function Execution Task Enqueue\n(Scheduled)"--> MAIN_NODE_REDIS_QUEUE
end