Deployment and Maintenance / High Availability Deployment

DataFlux Func supports multi-instance deployment to meet high availability requirements.

This document describes how to install and deploy DataFlux Func for high availability directly on servers.

For installing DataFlux Func with Helm on Kubernetes, please refer to Deployment and Maintenance / Installation and Deployment / Helm Deployment
For information on scaling the DataFlux Func system, please refer to Deployment and Maintenance / Architecture, Scaling, and Resource Limiting
For the detailed process of function execution, please refer to Script Development / Function Execution Process

When selecting a high availability solution for Redis, do not use 'Redis Cluster'. You can use 'Redis Master-Slave'.

If you previously installed DataFlux Func in single-machine mode and are switching to a high-availability deployment, please refer to Deployment and Maintenance / Backup and Migration / Database Migration.

1. Multi-Replica Deployment

Both the Server and Worker services of DataFlux Func support multi-replica deployment to achieve high availability, scaling, and other requirements.

Generally, the bottleneck in function execution lies in the Worker service (i.e., the Python code). The Server service therefore only needs enough replicas to avoid a single point of failure, while the number of Worker replicas should be increased according to actual business volume.

In a multi-replica deployment, it is necessary to ensure that the user-config.yaml file content is identical across all services, that they all connect to the same MySQL and Redis setup, and that the resource directory is mounted to the same storage.
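
The requirement that user-config.yaml be identical everywhere can be checked mechanically before starting the services. The following is a minimal sketch, not part of DataFlux Func: it assumes you have already collected a copy of each instance's user-config.yaml into local paths (the function names and paths are hypothetical), and it compares them by SHA-256 digest.

```python
import hashlib
from pathlib import Path


def config_digest(path):
    """Return the SHA-256 hex digest of a config file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_mismatched_configs(paths):
    """Group config files by digest.

    Returns an empty dict when every file is byte-for-byte identical,
    otherwise a dict mapping each digest to the files that share it.
    """
    groups = {}
    for path in paths:
        groups.setdefault(config_digest(path), []).append(str(path))
    # More than one digest means at least one copy has drifted.
    return groups if len(groups) > 1 else {}
```

An empty result means all copies match. A non-empty result shows which hosts share which version of the file, which helps locate the copy that drifted. Note that this is a byte-level comparison, so differences in whitespace or comments are also reported.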

The Beat service, which triggers scheduled tasks, must run as exactly 1 replica; otherwise, scheduled tasks may be triggered more than once.

flowchart TB
    USER[User]
    SERVER_1[Server 1]
    SERVER_2[Server 2]
    WORKER_1[Worker 1]
    WORKER_2[Worker 2]
    WORKER_3[Worker 3]
    BEAT[Beat]
    REDIS[Redis]

    USER --HTTP Request--> SLB
    SLB --HTTP Forwarding--> SERVER_1
    SLB --HTTP Forwarding--> SERVER_2
    SERVER_1 --Enqueue Function Execution Tasks--> REDIS
    SERVER_2 --Enqueue Function Execution Tasks--> REDIS
    REDIS --Dequeue Function Execution Tasks--> WORKER_1
    REDIS --Dequeue Function Execution Tasks--> WORKER_2
    REDIS --Dequeue Function Execution Tasks--> WORKER_3

    BEAT --"Enqueue Function Execution Tasks\n(Scheduled)"--> REDIS

2. Fully Independent Primary-Backup Deployment

Whether this deployment method truly qualifies as 'High Availability' is set aside here; this section simply assumes that such a deployment is required.

A fully independent primary-backup deployment consists of two completely independent DataFlux Func installations (the Secret, MySQL, and Redis configurations in the user-config.yaml file only need to be identical).

Since the primary and backup DataFlux Func installations operate independently, the Beat service on each node triggers scheduled tasks in its own environment, so every scheduled task is triggered twice.

To avoid this issue, you can shut down the DataFlux Func on the backup node during normal operation, or write logic within your scripts to handle and prevent duplicate task execution.
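
For the second option, one common way to suppress duplicates is a "first claim wins" check against a store that both nodes can reach. The sketch below is an illustration only, under stated assumptions: `claim_run`, `InMemoryStore`, and the key format are hypothetical names, not DataFlux Func APIs, and `InMemoryStore` merely stands in for a shared Redis so the example runs on its own. With a real redis-py client, the same `set(key, value, nx=True, ex=...)` call applies.

```python
import time


class InMemoryStore:
    """Stand-in for a shared Redis; mimics only SET with NX and EX."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, nx=False, ex=None):
        now = self._clock()
        current = self._data.get(key)
        if current is not None and current[1] is not None and current[1] <= now:
            current = None  # treat an expired entry as absent
        if nx and current is not None:
            return None  # like redis-py, return None when NX fails
        expires_at = now + ex if ex is not None else None
        self._data[key] = (value, expires_at)
        return True


def claim_run(store, task_name, interval_seconds, node_id, now=None):
    """Return True only for the first node that claims the current slot.

    The slot is the current time rounded down to the schedule interval,
    so both nodes compute the same key for the same scheduled run.
    """
    now = time.time() if now is None else now
    slot = int(now // interval_seconds) * interval_seconds
    key = f"task-claim:{task_name}:{slot}"
    # NX: set only if the key is absent. EX: let old claims expire.
    return bool(store.set(key, node_id, nx=True, ex=interval_seconds * 2))
```

A scheduled function would call `claim_run(...)` first and return immediately when it gets `False`. This only works if both nodes talk to the same store, and it assumes the two nodes' clocks are close enough that a trigger does not straddle a slot boundary.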

flowchart TB
    USER[User]
    MAIN_NODE_SERVER[Primary Node Server]
    MAIN_NODE_WORKER[Primary Node Worker]
    MAIN_NODE_BEAT[Primary Node Beat]
    MAIN_NODE_REDIS_QUEUE[Primary Node Redis Queue]
    BACKUP_NODE_SERVER[Backup Node Server]
    BACKUP_NODE_WORKER[Backup Node Worker]
    BACKUP_NODE_BEAT[Backup Node Beat]
    BACKUP_NODE_REDIS_QUEUE[Backup Node Redis Queue]

    USER --HTTP Request--> SLB
    SLB --HTTP Forwarding--> MAIN_NODE_SERVER
    SLB -.-> BACKUP_NODE_SERVER

    subgraph "Backup Node - Shut Down"
        direction TB
        BACKUP_NODE_SERVER --Enqueue Function Execution Tasks--> BACKUP_NODE_REDIS_QUEUE
        BACKUP_NODE_REDIS_QUEUE --Dequeue Function Execution Tasks--> BACKUP_NODE_WORKER

        BACKUP_NODE_BEAT --"Enqueue Function Execution Tasks\n(Scheduled)"--> BACKUP_NODE_REDIS_QUEUE
    end

    subgraph Primary Node
        direction TB
        MAIN_NODE_SERVER --Enqueue Function Execution Tasks--> MAIN_NODE_REDIS_QUEUE
        MAIN_NODE_REDIS_QUEUE --Dequeue Function Execution Tasks--> MAIN_NODE_WORKER

        MAIN_NODE_BEAT --"Enqueue Function Execution Tasks\n(Scheduled)"--> MAIN_NODE_REDIS_QUEUE
    end