Script Development / Basic Concepts
DataFlux Func introduces several concepts of its own; this document explains them.
1. Script Collections, Scripts, and Functions
Script collections, scripts, and functions can be created in "Development / Script Library". These are core concepts of DataFlux Func, and the IDs are directly specified by users when creating/writing code.
- A "script collection" is a set of several scripts. The ID is directly specified by the user upon creation and can only contain scripts.
- A "script" refers to the Python script itself, which must belong to a certain script collection. The ID is directly specified by the user upon creation.
- A "function" in DataFlux Func specifically refers to the top-level function decorated with the
@DFF.API(...)
decorator, which can be called as an entry point by synchronous/asynchronous APIs, scheduled tasks, and other modules.
A script collection is not a folder
A script collection is similar to a folder, but this "folder" has nothing to do with ordinary Python source folders.
When coding in DataFlux Func, you will frequently encounter the IDs of script collections, scripts, and functions, and these IDs are closely related.
Relationship Between Script Collection, Script, and Function IDs
According to the hierarchy of script collections, scripts, and functions, the ID of a lower-level concept always includes the ID of its upper-level concept.
Assume there exists a script collection with the ID `demo`. Then the IDs of all scripts belonging to this collection must start with `demo__` (double underscore).

Furthermore, assume that under this script collection there is a script with the ID `demo__test`, which contains a function `def hello(...)`. The ID of this function is then `demo__test.hello`.
The following table shows example IDs:

Concept | Example ID |
---|---|
Script Collection | `demo` |
Script | `demo__test` |
Function | `demo__test.hello` |
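To make the composition rule concrete, the sketch below assembles the example IDs from the table (`collection_id`, `script_id`, and `func_id` are just illustrative variables, not DataFlux Func APIs):

```python
collection_id = 'demo'

# A script ID is the collection ID, a double underscore, then the script name.
script_id = collection_id + '__' + 'test'

# A function ID is the script ID, a dot, then the function name.
func_id = script_id + '.' + 'hello'

print(script_id)  # demo__test
print(func_id)    # demo__test.hello
```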
Mutual References in Coding
In DataFlux Func scripts, one script may reference another to reuse code.
Assume there exists a script `demo__script_a`, which contains a function `func_a()`. To reference this function in script `demo__script_b`, write:

`demo__script_a`:

```python
def func_a():
    return 'Hello from Script A'
```

`demo__script_b`:

```python
import demo__script_a

def func_b():
    return demo__script_a.func_a()
```
Python's `as` keyword can be used in the same way:

`demo__script_b`:

```python
import demo__script_a as script_a

def func_b():
    return script_a.func_a()
```
You can also use the `from ... import` statement to import only the functions you need:

`demo__script_b`:

```python
from demo__script_a import func_a

def func_b():
    return func_a()
```
For references between scripts in the same script collection, the script collection ID can be omitted, using the abbreviated form that starts with `__` (double underscore):

`demo__script_b`:

```python
import __script_a

def func_b():
    return __script_a.func_a()
```
Prefer Using Abbreviated Form
Mutual references within a script collection should prefer the abbreviated form (i.e., omit the script collection ID and start with `__`).
This way, even if the entire script collection is cloned and the script collection ID changes, the code within the new cloned script collection can still correctly reference the scripts within this collection.
2. Connectors
Connectors can be created in "Development / Connectors". They are tools provided by DataFlux Func for connecting external systems, and the ID is directly specified by the user upon creation.
Actually, writing Python code in DataFlux Func does not differ much from standard Python. Developers can completely ignore connectors and connect to external systems directly in their code.
However, connectors provide built-in connection pools: for external systems that support pooling, connections are kept alive across repeated function executions, avoiding repeatedly opening and closing connections to the external system.
Assume the user has already configured a connector with the ID `mysql`. The operation object for this connector is obtained as follows:

```python
helper = DFF.CONN('mysql')
```
Different connectors expose different operation methods and parameters; for details, see Script Development / Connector Object DFF.CONN.
3. Environment Variables
Environment variables can be created in "Development / Environment Variables". They are simple Key-Value configuration retrieval tools provided by DataFlux Func, and the ID is directly specified by the user upon creation.
Environment variables are particularly suitable for scenarios where the same set of code runs in different environments.
If the system accessed by the script distinguishes between testing and production environments, environment variables can be set to switch between testing and production environments without changing the code.
Assume the user has already configured an environment variable with the ID `api_endpoint`. Its value is retrieved as follows:

```python
value = DFF.ENV('api_endpoint')
```
4. Synchronous API (Old Version: Authorized Link)
Synchronous APIs can be created in "Manage / Synchronous API". They are a common way for external systems to call functions in DataFlux Func: the call executes synchronously, and once the function finishes, the result is returned directly to the caller.
After a synchronous API is created for a function, it can be called in several different ways.
Synchronous APIs support both `GET` and `POST` methods, and both methods support passing parameters in "simplified form" and "standard form". Additionally, the `POST` simplified form supports file uploads. The feature support for each calling method is listed below:
Calling Method | Passing `kwargs` Parameters | `kwargs` Parameter Type | Passing `options` | File Upload | Submitting Arbitrary Format Body (available since 1.6.9) |
---|---|---|---|---|---|
GET Simplified Form | Supported | Strings only | Not supported | Not supported | Not supported |
GET Standard Form | Supported | JSON data types | Supported | Not supported | Not supported |
POST Simplified Form | Supported | Strings only | Not supported | Supported | Supported |
POST Standard Form | Supported | JSON data types | Supported | Not supported | Not supported |
Different passing methods may impose restrictions on parameter types
For calling methods where parameters in kwargs can only be strings, type conversion of parameters is required within the function. In the synchronous API list, you can click "API Call Example" to view specific calling methods.
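For example, a function receiving simplified-form parameters can convert them itself. A hypothetical sketch (`add_one` is an illustrative function, not part of DataFlux Func):

```python
def add_one(x):
    # Simplified-form calls deliver every kwargs value as a string,
    # so convert explicitly before doing arithmetic.
    return int(x) + 1

print(add_one('100'))  # 101
```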
Assume the following function exists (an illustrative sketch; the function body is not significant):

```python
@DFF.API('Test Function')
def hello(x, y):
    return {'x': x, 'y': y}
```
For this function, assume the created synchronous API ID is `auln-xxxxx`, and the parameters to pass are `x=100` (integer) and `y="hello"` (string).
The various calling methods are then as follows:
GET Simplified Form Passing Parameters
If the function parameters are relatively simple, the `GET` simplified form can be used to pass parameters, making the interface more intuitive.
Since URL parameters cannot distinguish the string `"100"` from the integer `100`, all parameters received by the function are strings; the function must perform type conversion itself.
```text
GET /api/v1/al/auln-xxxxx?x=100&y=hello
```
For ease of reading, the example shows the content before URLEncode. Actual URL parameters need to be URLEncoded.
GET Standard Form Passing Parameters
In some cases, if a POST request cannot be sent, the GET method can also be used to call the interface.
When passing parameters in the `GET` standard form, the entire `kwargs` is serialized to JSON and passed as a single URL parameter.
Since the parameters are actually sent in JSON format, the original types of the parameters are retained, and the function does not need to perform type conversion.
In this example, the `x` parameter received by the function is an integer and requires no type conversion.
```text
GET /api/v1/al/auln-xxxxx?kwargs={"x":100,"y":"hello"}
```
For ease of reading, the example shows the content before URLEncode. Actual URL parameters need to be URLEncoded.
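The URL for a standard-form `GET` call can be assembled programmatically. A minimal sketch, assuming only what is described above (the whole `kwargs` dict is JSON-serialized into a single `kwargs` URL parameter):

```python
import json
from urllib.parse import urlencode

kwargs = {'x': 100, 'y': 'hello'}

# Serialize the whole kwargs dict to JSON, then URL-encode it
# as a single "kwargs" query parameter.
query = urlencode({'kwargs': json.dumps(kwargs)})
url = '/api/v1/al/auln-xxxxx?' + query
```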
POST Simplified Form Passing Parameters
In some cases, if an HTTP request with a JSON body cannot be sent, parameters can be passed in a manner similar to a Form submission, where each field name is the parameter name.
Since Form submissions cannot distinguish the string `"100"` from the integer `100`, all parameters received by the function are strings; the function must perform type conversion itself.
```text
POST /api/v1/al/auln-xxxxx
Content-Type: application/x-www-form-urlencoded

x=100&y=hello
```
Additionally, the `POST` simplified form supports file uploads (the parameter/field name must be `files`), which must be submitted as `multipart/form-data`.
Below is an HTML page code example:
```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Upload Example</title>
</head>
<body>
  <!-- A minimal sketch: the file field name must be "files",
       and the form must be submitted as multipart/form-data -->
  <form method="post"
        action="/api/v1/al/auln-xxxxx"
        enctype="multipart/form-data">
    <input type="text" name="x" value="100">
    <input type="text" name="y" value="hello">
    <input type="file" name="files">
    <button type="submit">Submit</button>
  </form>
</body>
</html>
```
POST Standard Form Passing Parameters
The `POST` standard form is the most common calling method.
Since parameters are sent in JSON format through the request body, the original types of the parameters are retained, and the function does not need to perform type conversion.
In this example, the `x` parameter received by the function is an integer and requires no type conversion.
```text
POST /api/v1/al/auln-xxxxx
Content-Type: application/json

{
    "kwargs": {
        "x": 100,
        "y": "hello"
    }
}
```
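The request body for a standard-form `POST` call can likewise be built in code. A small sketch, assuming only the top-level `kwargs` key described above:

```python
import json

# The function's parameters go under the top-level "kwargs" key.
body = json.dumps({'kwargs': {'x': 100, 'y': 'hello'}})
```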
5. Asynchronous API (Old Version: Batch Processing)
Asynchronous APIs can be created in "Manage / Asynchronous API". They are another way for external systems to call functions in DataFlux Func.
The difference from synchronous APIs is that an asynchronous API returns immediately after being called and the backend processes the task asynchronously; the caller does not wait for the result. Apart from this, the calling method is the same as for synchronous APIs.
Asynchronous APIs provide longer function execution times (not API response times), making them very suitable for executing time-consuming asynchronous processing.
6. Scheduled Tasks (Old Version: Automatic Trigger Configuration)
Scheduled tasks can be created in "Manage / Scheduled Tasks" to let DataFlux Func automatically call functions periodically.
After creating a scheduled task for a function, the function will execute according to the specified Crontab expression without requiring external calls.
Because of this, all parameters of the executed function must already be satisfied, i.e., one of the following must hold:
- The function takes no parameters.
- The function takes parameters, but all of them are optional.
- The function requires mandatory parameters, and specific values are configured for them in the scheduled task.
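For instance, a function shaped like the hypothetical sketch below fits the second case: it declares parameters, but all of them have defaults, so a scheduled task can invoke it without arguments:

```python
def cleanup(limit=100, dry_run=False):
    # All parameters are optional, so the function can run
    # with no arguments when triggered on a schedule.
    return {'limit': limit, 'dry_run': dry_run}

print(cleanup())  # {'limit': 100, 'dry_run': False}
```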
Distinguishing Function Execution Based on Execution Features
If a function is configured with both "Scheduled Tasks" and other execution channels, and you want to handle the different channels differently, you can check the built-in variable `_DFF_CRONTAB`:
```python
@DFF.API('Test Function')
def hello():
    if _DFF_CRONTAB:
        # _DFF_CRONTAB is truthy when the call was
        # triggered by a scheduled task
        return 'Called by a scheduled task'
    else:
        return 'Called some other way'
```
7. API Authentication
Added in version 1.3.2
For HTTP APIs generated by "Synchronous / Asynchronous APIs", additional interface authentication can be added.
Currently supported interface authentications include:
Authentication Type | Description |
---|---|
Fixed Field | Verifies that the Header, Query, or Body of the request must contain a field with a specific value |
HTTP Basic | Standard HTTP Basic authentication (pops up a login box in browsers) |
HTTP Digest | Standard HTTP Digest authentication (pops up a login box in browsers) |
Authentication Function | Specifies a custom-written function as the authentication function |
Users can add authentication configurations in "Manage / API Authentication" and then specify the added authentication configurations in "Synchronous / Asynchronous API Configuration".
If high security is required, make sure the API is accessed over HTTPS.
Fixed Field Authentication
Fixed field authentication is the simplest authentication method, where the client and DataFlux Func agree to include a specific field and field value somewhere in the request (Header, Query, or Body). This content is attached during each call to complete the authentication.
Assume every request must include the header `x-auth-token="my-auth-token"`. Calls can then be authenticated as follows:

```text
GET /api/v1/al/auln-xxxxx
x-auth-token: my-auth-token
```
When multiple fixed fields are configured, matching any one of them is sufficient to pass authentication.
Fields used for authentication in the Query or Body are automatically removed after authentication and are not passed to the function.
HTTP Basic / HTTP Digest
Authentication methods directly supported by browsers.
Interfaces authenticated using this method will prompt the user to enter a username/password when accessed directly in the browser address bar.
If you need to access programmatically, please refer to the following code:
```python
import requests
from requests.auth import HTTPBasicAuth, HTTPDigestAuth

url = 'https://example.com/api/v1/al/auln-xxxxx'

# HTTP Basic authentication
resp = requests.get(url, auth=HTTPBasicAuth('user', 'password'))

# HTTP Digest authentication
resp = requests.get(url, auth=HTTPDigestAuth('user', 'password'))
```
Authentication Function
If the interface authentication method is complex or special (such as needing to integrate with business systems), you can choose to write your own function for authentication.
The function used for authentication must take exactly one parameter, `req` (the request), and return `True` or `False` to indicate success or failure.
The parameter `req` is a `dict` with the following structure:
Field Name | Type | Description |
---|---|---|
method | str | Request method (uppercase), e.g. `"GET"`, `"POST"` |
originalUrl | str | Original request URL, including the part after `?`, e.g. `/api/v1/al/auln-xxxxx?q=1` |
url | str | Request URL, excluding the part after `?`, e.g. `/api/v1/al/auln-xxxxx` |
headers | dict | Request headers; field names are all lowercase |
query | dict | Request query; field names and values are all strings |
body | dict | Request body |
hostname | str | Hostname accessed by the request, without the port number, e.g. `example.com` |
ip | str | Client IP. Note: meaningful only with correct Nginx or Alibaba Cloud SLB configuration |
ips | list | Client IP plus all intermediate proxy server IPs. Note: meaningful only with correct Nginx or Alibaba Cloud SLB configuration |
ips[#] | str | An intermediate proxy server IP |
xhr | bool | Whether the request is an AJAX request |
Example (an illustrative sketch checking a header against a fixed token):

```python
@DFF.API('Auth Function')
def auth_func(req):
    return req['headers'].get('x-auth-token') == 'my-auth-token'
```