Deployment and Maintenance / Benchmark Performance Testing

This document describes how to run benchmark performance tests on DataFlux Func.

1. Preface

All example results in this document were tested in the following hardware environment:

| Item | Description |
| --- | --- |
| Computer | HP ProBook laptop computer |
| CPU | AMD Ryzen 5 7530U with Radeon Graphics / 4.5 GHz |
| Memory | 32 GB / 3200 MHz (0.3 ns) |
| OS | Ubuntu 22.04.5 LTS |

2. Benchmark Test Package

Download Benchmark.zip and import it into DataFlux Func.

The Benchmark test package contains the following contents:

| Content | Description |
| --- | --- |
| Script Set benchmark | |
| Function benchmark__main.hello_world | Empty function that directly returns "ok", used for concurrency testing |
| Function benchmark__main.json_dump | Perform JSON serialization/deserialization, used for performance testing |
| Function benchmark__main.calc_pi | Calculate pi, used for performance testing |
| Sync API benchmark-hello-world | Used for concurrency testing |
| Cron Job benchmark-hello-world | Used to test task scheduling performance |
| Cron Job benchmark-compute-pi | Used to test computing performance (calculate pi) |
| Cron Job benchmark-json-dump-and-load | Used to test computing performance (JSON serialization/deserialization) |
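
The source of the package is not reproduced in this document. Purely as an illustration of the three kinds of function described above, they might look like the following sketch. The function bodies, the iteration counts, and the choice of the Leibniz series for pi are all assumptions, not the actual contents of Benchmark.zip.

```python
import json


def hello_world():
    # Empty function: returns immediately, so timing reflects platform overhead only
    return 'ok'


def json_dump(rounds=1000):
    # Repeatedly serialize and deserialize a small document (CPU-bound work)
    data = {'items': [{'id': i, 'name': f'item-{i}'} for i in range(100)]}
    for _ in range(rounds):
        data = json.loads(json.dumps(data))
    return len(data['items'])


def calc_pi(terms=1_000_000):
    # Approximate pi with the Leibniz series: pi/4 = 1 - 1/3 + 1/5 - ...
    total = 0.0
    for k in range(terms):
        total += (-1) ** k / (2 * k + 1)
    return 4 * total
```

An empty function isolates scheduling and API overhead, while the two CPU-bound functions give a rough measure of raw computing speed.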

3. Execute Tests

After you import the Benchmark test package, its cron jobs start running automatically. Wait a few minutes, then check the results in the task records.

Testing the concurrency performance of DataFlux Func, however, requires an external tool.

3.1 Test Task Scheduling Performance

After importing the Benchmark test package, wait a few minutes and you can view the results in the "Hello, World" task record under "Cron Jobs".

Generally speaking, each task should take less than 10 milliseconds.

[Screenshot: cron-job-hello-world.png]

3.2 Test Computing Performance

After importing the Benchmark test package, wait a few minutes and you can view the results in the "Compute pi" and "JSON dump and load" task records under "Cron Jobs".

For the test environment mentioned in the preface, each task takes about 1 second:

  • Calculating pi ("Compute pi")

[Screenshot: cron-job-compute-pi.png]

  • JSON serialization/deserialization ("JSON dump and load")

[Screenshot: cron-job-json-dump-and-load.png]

3.3 Test Function API Concurrency Performance

When testing concurrency performance, you can use the ab (ApacheBench) tool.

If ab is not installed yet, install it with the command for your distribution.

Debian/Ubuntu:

```bash
apt-get install apache2-utils
```

CentOS/RHEL:

```bash
yum install httpd-tools
```

Use ab to test the "Hello, World" Sync API from the test package with one of the following commands.

On the DataFlux Func host:

```bash
ab -c 10 -n 5000 -k http://localhost:8088/api/v1/sync/benchmark-hello-world
```

From another machine:

```bash
ab -c 10 -n 5000 -k http://{DataFlux Func IP or domain}:8088/api/v1/sync/benchmark-hello-world
```
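
If ab is unavailable, a comparable load can be generated from Python. The following is a minimal sketch, not a substitute for ab: `run_load` and `call_api` are hypothetical names, only the standard library is used, and each request opens a new connection (no keep-alive), so the numbers will not match those of `ab -k`.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed URL: the local Sync API endpoint used in the ab command above
URL = 'http://localhost:8088/api/v1/sync/benchmark-hello-world'


def call_api(url=URL):
    # Send one GET request and return the HTTP status code
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.status


def run_load(request_fn, total=5000, concurrency=10):
    # Run request_fn `total` times on `concurrency` worker threads
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: request_fn(), range(total)))
    elapsed = time.perf_counter() - start
    return {
        'requests': len(results),
        'seconds': elapsed,
        'requests_per_second': len(results) / elapsed,
    }

# Example against a running instance: print(run_load(call_api))
```

Passing the request function as a parameter lets the harness itself be checked with a stub before it is pointed at a real server.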

For the test environment mentioned in the preface, the throughput of an empty function is around 1,000 requests per second:

ab test output

```
# ab -c 10 -n 5000 -k http://localhost:8088/api/v1/sync/benchmark-hello-world
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 500 requests
Completed 1000 requests
...
Completed 5000 requests
Finished 5000 requests


Server Software:
Server Hostname:        localhost
Server Port:            8088

Document Path:          /api/v1/sync/benchmark-hello-world
Document Length:        13 bytes

Concurrency Level:      10
Time taken for tests:   4.694 seconds
Complete requests:      5000
Failed requests:        0
Keep-Alive requests:    5000
Total transferred:      3045525 bytes
HTML transferred:       65000 bytes
Requests per second:    1065.17 [#/sec] (mean)
Time per request:       9.388 [ms] (mean)
Time per request:       0.939 [ms] (mean, across all concurrent requests)
Transfer rate:          633.60 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     5    9   1.8      9      29
Waiting:        4    9   1.8      9      29
Total:          5    9   1.8      9      29

Percentage of the requests served within a certain time (ms)
  50%      9
  66%     10
  75%     10
  80%     10
  90%     12
  95%     13
  98%     14
  99%     15
  100%     29 (longest request)
```
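
The headline figures in this report are related by simple arithmetic, which helps when comparing runs. The sketch below extracts them and recomputes the derived values. `parse_ab_summary` is a hypothetical helper, and its regular expressions assume the ab 2.3 report layout shown above.

```python
import re


def parse_ab_summary(text):
    # Pull the key figures out of ab's plain-text report
    patterns = {
        'concurrency': r'Concurrency Level:\s+(\d+)',
        'seconds': r'Time taken for tests:\s+([\d.]+) seconds',
        'complete': r'Complete requests:\s+(\d+)',
        'failed': r'Failed requests:\s+(\d+)',
        'rps': r'Requests per second:\s+([\d.]+)',
    }
    summary = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f'Field not found in ab output: {key}')
        summary[key] = float(match.group(1))
    return summary


sample = """
Concurrency Level:      10
Time taken for tests:   4.694 seconds
Complete requests:      5000
Failed requests:        0
Requests per second:    1065.17 [#/sec] (mean)
"""

s = parse_ab_summary(sample)
# Requests per second is roughly complete requests / time taken
print(round(s['complete'] / s['seconds'], 2))
# Mean time per request (ms) = concurrency * 1000 / requests per second
print(round(s['concurrency'] * 1000 / s['rps'], 3))
```

The second value reproduces the 9.388 ms "Time per request" in the report. The first differs from the reported 1065.17 in the last digits because the report shows the elapsed time rounded to milliseconds.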

4. Factors Affecting Performance

The work a task performs is determined by the script it calls. If tasks take too long to execute, check the code for the following issues:

4.1 Excessive print(...)

For convenience when debugging, developers sometimes use print(...) to output raw data from databases or API responses directly in the script.

Because DataFlux Func automatically logs the print(...) output during task execution, such operations can cause significant performance degradation.

It is recommended that only necessary print(...) statements be retained in production code, avoiding large text outputs.
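
One way to keep debugging output while limiting its size is to print a truncated preview. A minimal sketch, in which `preview` and its 200-character limit are hypothetical choices rather than DataFlux Func features:

```python
import json


def preview(data, limit=200):
    # Serialize the value and keep only the first `limit` characters
    text = json.dumps(data, ensure_ascii=False, default=str)
    if len(text) <= limit:
        return text
    return f'{text[:limit]}... ({len(text)} chars total)'


rows = [{'id': i, 'value': 'x' * 50} for i in range(1000)]
# Logs a short preview instead of the full result set
print(preview(rows))
```

The total length is still reported, so an unexpectedly large result remains visible in the log without being written out in full.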

4.2 Slow External System Response

When a script queries external databases or calls external APIs (including DQL queries), network latency or a slow target system also lengthens task execution time.

This issue is unrelated to DataFlux Func; contact the parties responsible for those external systems to improve response speed.

The following simple snippet measures how long such a call takes:

```python
import time
import requests

def test():
    # Record the start time, call the external system, then print the elapsed time
    t1 = time.time()
    requests.get('http://github.com')
    print(f'Cost: {time.time() - t1} s')
```
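
When a script makes several external calls, timing each step separately shows which one dominates. A minimal sketch using a context manager follows. `timed` and the step labels are hypothetical names, and `time.sleep` stands in for real database or API calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    # Measure the wall-clock time of the enclosed block and store it under `label`
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def test_steps():
    costs = {}
    with timed('query_db', costs):
        time.sleep(0.05)  # placeholder for a database query
    with timed('call_api', costs):
        time.sleep(0.01)  # placeholder for an HTTP request
    # One short summary line instead of printing full responses
    print({k: round(v, 3) for k, v in costs.items()})
    return costs
```

Because the timing is recorded in a `finally` clause, a step that raises an exception still has its duration stored.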