
Benchmark Performance Testing

This article describes how to run benchmark performance tests on DataFlux Func.

1. Preface

All example results in this article were tested in the following hardware environment:

Item              Description
Computer          HP ProBook Laptop
CPU               AMD Ryzen 5 7530U with Radeon Graphics / 4.5 GHz
Memory            32 GB / 3200 MHz (0.3 ns)
Operating System  Ubuntu 22.04.5 LTS

2. Benchmark Test Package

Download Benchmark.zip and import it into DataFlux Func.

The Benchmark test package contains the following:

Type            Name                          Description
Script Set      benchmark
Function        benchmark__main.hello_world   An empty function that directly returns "ok", used for concurrency testing
Function        benchmark__main.json_dump     Performs JSON serialization/deserialization, used for performance testing
Function        benchmark__main.calc_pi       Calculates pi, used for performance testing
Sync API        benchmark-hello-world         Used for concurrency testing
Scheduled Task  benchmark-hello-world         Used to test task scheduling performance
Scheduled Task  benchmark-compute-pi          Used to test computational performance (calculating pi)
Scheduled Task  benchmark-json-dump-and-load  Used to test computational performance (JSON serialization/deserialization)
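
The source of the package is not reproduced in this article. Purely as an illustration, the three functions might resemble the sketch below. The function bodies, the `size` and `terms` parameters, and the choice of the Leibniz series are assumptions, not the actual package code. Any platform-specific export decorator is omitted so that the sketch runs as plain Python.

```python
import json


def hello_world():
    """Return a constant immediately; useful for measuring pure call overhead."""
    return 'ok'


def json_dump(size=10000):
    """Serialize and deserialize a list of small dicts (hypothetical workload)."""
    data = [{'id': i, 'name': f'item-{i}', 'value': i * 0.5} for i in range(size)]
    text = json.dumps(data)
    restored = json.loads(text)
    return len(restored)


def calc_pi(terms=200000):
    """Approximate pi with the Leibniz series (hypothetical workload)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k / (2 * k + 1)
    return 4 * total
```

The first function isolates scheduling and HTTP overhead, while the other two are dominated by CPU time, which is why they are used for different tests.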

3. Executing Tests

After you import the Benchmark test package, its scheduled tasks start running automatically. Wait a few minutes, then view the results in the task records.

Concurrency performance testing for DataFlux Func, however, requires an external tool.

3.1 Testing Task Scheduling Performance

After importing the Benchmark test package and waiting a few minutes, you can view the results in the task records of the "Hello, World" scheduled task under "Scheduled Tasks".

Generally, the execution time for each task should be within 10 milliseconds.

[Screenshot: cron-job-hello-world.png]

3.2 Testing Computational Performance

After importing the Benchmark test package and waiting a few minutes, you can view the results in the task records of the "Compute pi" and "JSON dump and load" scheduled tasks under "Scheduled Tasks".

In the test environment described in the "Preface", each task takes about 1 second to run:

  • Calculating pi ("Compute pi")

[Screenshot: cron-job-compute-pi.png]

  • JSON serialization/deserialization ("JSON dump and load")

[Screenshot: cron-job-json-dump-and-load.png]
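
To get a rough feel for how CPU-bound Python code performs on a given machine outside DataFlux Func, a workload can be timed locally with the standard-library `timeit` module. This is only a sketch: the `workload` function and its size are illustrative assumptions and do not match the tasks in the package, so the numbers are not directly comparable to the task records.

```python
import json
import timeit


def workload(size=5000):
    """Hypothetical CPU-bound task: JSON round trip of a generated list."""
    data = [{'id': i, 'text': 'x' * 20} for i in range(size)]
    return json.loads(json.dumps(data))


def measure(func, repeat=5):
    """Run func `repeat` times and return (best, mean) wall-clock seconds."""
    timings = timeit.repeat(func, number=1, repeat=repeat)
    return min(timings), sum(timings) / len(timings)


if __name__ == '__main__':
    best, mean = measure(workload)
    print(f'best: {best * 1000:.1f} ms, mean: {mean * 1000:.1f} ms')
```

Reporting the best run alongside the mean helps separate the cost of the code itself from noise caused by other processes on the machine.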

3.3 Testing Function API Concurrency Performance

When testing concurrency performance, you can use the ab (ApacheBench) tool.

If ab is not installed, you can install it using the following commands:

Bash (Debian/Ubuntu)
apt-get install apache2-utils

Bash (CentOS/RHEL)
yum install httpd-tools

Use ab to call the "Hello, World" Sync API from the test package. Run the first command on the DataFlux Func host itself, or the second from another machine:

Bash
ab -c 10 -n 5000 -k http://localhost:8088/api/v1/sync/benchmark-hello-world

Bash
ab -c 10 -n 5000 -k http://{DataFlux Func IP or domain}:8088/api/v1/sync/benchmark-hello-world
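
Where ab is not available, a comparable measurement can be approximated with the Python standard library. The sketch below is not a replacement for ab: it opens a new connection for every request (no keep-alive), so it will typically report lower throughput. The helper names `http_get` and `run_load_test` are illustrative and not part of DataFlux Func.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get(url, timeout=5):
    """Issue one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
        return resp.status


def run_load_test(call, concurrency=10, total=5000):
    """Invoke `call` `total` times using `concurrency` worker threads.

    Returns (requests_per_second, failed_count). Any exception raised by
    `call` counts as a failed request.
    """
    def safe_call(_):
        try:
            call()
            return True
        except Exception:
            return False

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(safe_call, range(total)))
    elapsed = time.perf_counter() - started
    return total / elapsed, results.count(False)
```

Assuming DataFlux Func is listening locally on port 8088, calling `run_load_test(lambda: http_get('http://localhost:8088/api/v1/sync/benchmark-hello-world'))` mirrors the `-c 10 -n 5000` settings of the ab command.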

In the test environment described in the "Preface", throughput for the empty function is around 1,000 requests per second (the "Requests per second" line in the output):

ab test output
# ab -c 10 -n 5000 -k http://localhost:8088/api/v1/sync/benchmark-hello-world
This is ApacheBench, Version 2.3 <$Revision: 1879490 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Completed 500 requests
Completed 1000 requests
...
Completed 5000 requests
Finished 5000 requests


Server Software:
Server Hostname:        localhost
Server Port:            8088

Document Path:          /api/v1/sync/benchmark-hello-world
Document Length:        13 bytes

Concurrency Level:      10
Time taken for tests:   4.694 seconds
Complete requests:      5000
Failed requests:        0
Keep-Alive requests:    5000
Total transferred:      3045525 bytes
HTML transferred:       65000 bytes
Requests per second:    1065.17 [#/sec] (mean)
Time per request:       9.388 [ms] (mean)
Time per request:       0.939 [ms] (mean, across all concurrent requests)
Transfer rate:          633.60 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       1
Processing:     5    9   1.8      9      29
Waiting:        4    9   1.8      9      29
Total:          5    9   1.8      9      29

Percentage of the requests served within a certain time (ms)
  50%      9
  66%     10
  75%     10
  80%     10
  90%     12
  95%     13
  98%     14
  99%     15
  100%     29 (longest request)
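
The headline figures in this output are related by simple arithmetic, which is useful when sanity-checking a run. The sketch below recomputes them from the values printed above. Small differences in the last digit are expected, presumably because ab calculates from the unrounded elapsed time.

```python
# Values copied from the ab output above.
complete_requests = 5000
time_taken_s = 4.694
concurrency = 10
total_transferred_bytes = 3045525

# Throughput: completed requests divided by wall-clock time.
requests_per_second = complete_requests / time_taken_s

# Mean latency as seen by a single client ("Time per request ... (mean)").
time_per_request_ms = concurrency * time_taken_s * 1000 / complete_requests

# Mean time per request across all concurrent clients.
time_per_request_all_ms = time_taken_s * 1000 / complete_requests

# Transfer rate in Kbytes (1024 bytes) per second.
transfer_rate_kbytes = total_transferred_bytes / 1024 / time_taken_s

print(f'Requests per second:      {requests_per_second:.2f}')
print(f'Time per request:         {time_per_request_ms:.3f} ms')
print(f'Time per request (all):   {time_per_request_all_ms:.3f} ms')
print(f'Transfer rate:            {transfer_rate_kbytes:.2f} Kbytes/sec')
```

Note that the per-client latency is the concurrency level multiplied by the across-all-clients figure, so raising `-c` without a matching rise in throughput shows up directly as higher latency.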

4. Factors Affecting Performance

What a task actually does is determined by the script it calls. If tasks take a long time to run, check whether the code has any of the following issues:

4.1 Excessive print(...)

For convenience when debugging, some scripts output entire database results or API responses directly via print(...).

Because DataFlux Func automatically records the print(...) output of every task execution, this can degrade performance significantly.

Once the code is in production, keep only the necessary print(...) calls and avoid printing large blocks of text.
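
One way to keep debugging output useful without logging entire payloads is to print a bounded preview. The helper below is a minimal sketch. The name `preview` and the 200-character limit are arbitrary choices, not part of DataFlux Func.

```python
import json


def preview(data, limit=200):
    """Return a short, single-line summary of `data` for logging."""
    text = json.dumps(data, ensure_ascii=False, default=str)
    if len(text) <= limit:
        return text
    omitted = len(text) - limit
    return f'{text[:limit]}... ({omitted} more characters)'


# Stand-in for rows returned by a database query.
rows = [{'id': i, 'name': f'user-{i}'} for i in range(1000)]

# Prints the row count and a truncated preview instead of the full list.
print(f'Fetched {len(rows)} rows: {preview(rows)}')
```

Logging the row count together with a short preview is usually enough to confirm that a query returned the expected shape of data.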

4.2 Slow External System Response

When a script queries an external database or calls an API (including running DQL queries), network latency or a slow response from the other system can also make tasks take a long time.

This is largely unrelated to DataFlux Func. You should contact the responsible parties for these external systems to improve response speed.

The following simple example measures how long an external request takes:

Python
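import time
import requests

def test():
    # Record the start time, call the external system, then print the elapsed time
    t1 = time.time()
    # The timeout keeps the task from hanging if the remote side never responds
    requests.get('http://github.com', timeout=10)
    print(f'Cost: {time.time() - t1:.3f} s')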
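
When a function makes several external calls, timing each one separately shows which system is responsible for the delay. The following is a sketch using a small context manager. `timed`, `run_task`, and the step labels are illustrative names, and `time.sleep` stands in for real database or API calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Measure the wall-clock time of the enclosed block and store it in `results`."""
    started = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - started


def run_task():
    timings = {}
    with timed('query database', timings):
        time.sleep(0.05)   # placeholder for a real DB query
    with timed('call remote API', timings):
        time.sleep(0.02)   # placeholder for a real HTTP request
    for label, seconds in timings.items():
        print(f'{label}: {seconds:.3f} s')
    return timings
```

Because the timing is recorded in a `finally` clause, a step that raises an exception (for example on timeout) still has its duration captured.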