Script Development / Handling Large Data Responses DFF.RESP_LARGE_DATA

Added in version 1.3.0

When a function returns a large amount of data (on the order of megabytes or more), returning it directly with return can degrade performance significantly because of the internal communication processing involved. In such cases, use DFF.RESP_LARGE_DATA(...) to improve performance.

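| Parameter    | Type          | Required / Default | Description                                          |
|--------------|---------------|--------------------|------------------------------------------------------|
| data         | str/dict/list | Required           | The data to be returned                              |
| content_type | str           | None               | The response body type, such as json, text, or html  |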

When using this method, ensure that the Resource Catalog is configured and mounted correctly, and that all Web servers and worker units can access the same shared directory.
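
One way to sanity-check the shared-directory requirement is to write a marker file from one process and read it back from another. The sketch below is illustrative only: the function names `write_marker` and `read_marker` are hypothetical, the directory is whatever mount point your deployment uses, and no DataFlux Func API is involved.

```python
import os
import tempfile
import uuid


def write_marker(shared_dir):
    """Write a uniquely named marker file into shared_dir; return (name, token)."""
    token = uuid.uuid4().hex
    name = f"share-check-{token}.txt"
    with open(os.path.join(shared_dir, name), "w", encoding="utf-8") as f:
        f.write(token)
    return name, token


def read_marker(shared_dir, name):
    """Return the marker's content, or None if this host cannot see the file."""
    try:
        with open(os.path.join(shared_dir, name), encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None


if __name__ == "__main__":
    # A temporary directory stands in for the real shared mount point.
    with tempfile.TemporaryDirectory() as shared_dir:
        name, token = write_marker(shared_dir)
        print(read_marker(shared_dir, name) == token)
```

In a real check, `write_marker` would run on a worker host and `read_marker` on each Web server host, both pointed at the same mounted path. A `None` result suggests that host does not see the shared directory.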

Common use cases are as follows:

Python
@DFF.API('Use Case 1')
def case_1():
    data = {} # Large data (MB level and above)

    return DFF.RESP_LARGE_DATA(data)
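
To decide whether a payload is "MB level and above", it can help to measure its serialized size first. The following is a minimal sketch, not part of DataFlux Func: the helper names and the 1 MB cutoff are assumptions chosen for illustration.

```python
import json

# Hypothetical cutoff; the text above only says "MB level and above".
LARGE_THRESHOLD_BYTES = 1024 * 1024


def json_size_bytes(data):
    """Return the size in bytes of data once serialized as UTF-8 JSON."""
    return len(json.dumps(data, ensure_ascii=False).encode("utf-8"))


def is_large(data, threshold=LARGE_THRESHOLD_BYTES):
    """Return True when the serialized payload reaches the threshold."""
    return json_size_bytes(data) >= threshold


small = {"items": list(range(10))}
large = {"items": ["x" * 100] * 20000}  # roughly 2 MB once serialized
print(is_large(small), is_large(large))
```

Inside a script, the result could drive the choice of return path, for example `return DFF.RESP_LARGE_DATA(data) if is_large(data) else data`. Measuring the size this way serializes the data one extra time, so it is only worthwhile when payload sizes vary widely.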

Principle Explanation

DataFlux Func is composed of Web servers and worker units connected through Redis as a message queue. Data returned directly via return is serialized and sent to the message queue, then returned to the caller by the Web server.

Due to JSON serialization/deserialization, Redis enqueueing/dequeueing, and internal network communication, performance can degrade significantly when dealing with large JSON data.

This function essentially performs the following operations at the underlying level:

1. Saves the data to be returned as a file in the download directory of the Resource Catalog.
2. Responds to the request as a "file download" (i.e., DFF.RESP_FILE mentioned above).
3. The Web server directly reads the file saved in step 1 from the Resource Catalog and returns it to the client.

With this "detour", the internal communication stays lightweight, which improves performance.
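
The three steps above can be imitated outside DataFlux Func to see why the detour helps. The sketch below is a simplified simulation under stated assumptions, not the actual implementation: a `queue.Queue` stands in for Redis, a temporary directory stands in for the Resource Catalog download directory, and the function names and message fields are hypothetical.

```python
import json
import os
import queue
import tempfile
import uuid


def worker_save(data, download_dir, message_queue):
    """Steps 1 and 2: save the payload as a file, then enqueue only a small reference."""
    file_name = f"{uuid.uuid4().hex}.json"
    with open(os.path.join(download_dir, file_name), "w", encoding="utf-8") as f:
        json.dump(data, f)
    # Only this short message crosses the queue, not the payload itself.
    message_queue.put(json.dumps({"type": "file", "file_name": file_name}))


def web_server_respond(download_dir, message_queue):
    """Step 3: read the file named in the message and return its raw bytes."""
    message = json.loads(message_queue.get())
    with open(os.path.join(download_dir, message["file_name"]), "rb") as f:
        return f.read()


if __name__ == "__main__":
    mq = queue.Queue()  # stands in for the Redis message queue
    with tempfile.TemporaryDirectory() as download_dir:  # stands in for the shared directory
        payload = {"rows": [{"id": i, "value": "x" * 50} for i in range(1000)]}
        worker_save(payload, download_dir, mq)
        body = web_server_respond(download_dir, mq)
        print(json.loads(body) == payload)
```

In this simulation the queue message stays a few dozen bytes regardless of payload size, and the Web server side never parses the payload; it only streams file bytes back.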

Performance Comparison

Below is a performance comparison when returning a JSON of approximately 3.5MB in size:

  • When returning JSON directly via return data, it takes about 18 seconds.
Bash
$ time wget http://172.16.35.143:8089/api/v1/al/auln-Ljo3y8HMUl91
--2021-09-16 22:40:09--  http://172.16.35.143:8089/api/v1/al/auln-Ljo3y8HMUl91
Connecting to 172.16.35.143:8089... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3363192 (3.2M) [application/json]
Saving to: “auln-Ljo3y8HMUl91”

auln-Ljo3y8HMUl91            100%[=============================================>]   3.21M  --.-KB/s   in 0.06s

2021-09-16 22:40:27 (50.4 MB/s) - “auln-Ljo3y8HMUl91” saved [3363192/3363192])

wget http://172.16.35.143:8089/api/v1/al/auln-Ljo3y8HMUl91  0.00s user 0.02s system 0% cpu 18.321 total
  • When returning JSON via return DFF.RESP_LARGE_DATA(data), it takes less than 1 second.
Bash
$ time wget http://172.16.35.143:8089/api/v1/al/auln-HPrfGRKIhYET
--2021-09-16 22:40:50--  http://172.16.35.143:8089/api/v1/al/auln-HPrfGRKIhYET
Connecting to 172.16.35.143:8089... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3687382 (3.5M) [application/json]
Saving to: “auln-HPrfGRKIhYET”

auln-HPrfGRKIhYET            100%[=============================================>]   3.52M  --.-KB/s   in 0.02s

2021-09-16 22:40:50 (183 MB/s) - “auln-HPrfGRKIhYET” saved [3687382/3687382])

wget http://172.16.35.143:8089/api/v1/al/auln-HPrfGRKIhYET  0.00s user 0.02s system 12% cpu 0.174 total
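
For repeating this kind of measurement from Python rather than with `time wget`, a small timing helper is enough. This is a minimal sketch using only the standard library; `timed_fetch` is a hypothetical name, and the URL you pass would be one of your own API endpoints.

```python
import time
import urllib.request


def timed_fetch(url):
    """Download url fully and return (elapsed_seconds, body_size_in_bytes)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as resp:
        body = resp.read()
    return time.perf_counter() - start, len(body)


# Example usage (replace the placeholder with a real endpoint):
# elapsed, size = timed_fetch("http://<host>:<port>/api/v1/al/<api-id>")
# print(f"{size} bytes in {elapsed:.3f}s")
```

Calling it once against an endpoint that uses plain return and once against one that uses DFF.RESP_LARGE_DATA gives a comparison along the lines of the one above. Results will vary with network and load, so several runs are advisable.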