
Script Development / Basic Concepts

DataFlux Func introduces several concepts of its own. This document explains them.

1. Script Collections, Scripts, and Functions

Script collections, scripts, and functions can be created in "Development / Script Library". These are core concepts of DataFlux Func, and their IDs are specified directly by the user when creating them or writing code.

  • A "script collection" is a set of scripts. Its ID is specified by the user upon creation, and a collection can contain only scripts.
  • A "script" is a Python script, which must belong to a script collection. Its ID is specified by the user upon creation.
  • A "function" in DataFlux Func refers specifically to a top-level function decorated with the @DFF.API(...) decorator, which can serve as an entry point called by synchronous/asynchronous APIs, scheduled tasks, and other modules.

A script collection is not a folder

A script collection is similar to a folder, but this "folder" has nothing to do with ordinary folders in a Python project.

When coding in DataFlux Func, you will frequently encounter the IDs of script collections, scripts, and functions, and these IDs are closely related.

Relationship Between Script Collection, Script, and Function IDs

According to the hierarchy of script collections, scripts, and functions, the ID of a lower-level concept always includes the ID of its upper-level concept.

Assume there exists a script collection with the ID demo. Then all scripts belonging to this collection must start with demo__ (double underscores).

Furthermore, assume that under this script collection, there is a script with the ID demo__test, which contains a function def hello(...). The ID of this function would then be demo__test.hello.

The following table illustrates the ID examples:

Concept            Example ID
Script Collection  demo
Script             demo__test
Function           demo__test.hello
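The hierarchy can be expressed as simple string composition (a standalone sketch; the names demo, test, and hello come from the examples above):

```python
# How the three ID levels nest (names taken from the examples above)
collection_id = 'demo'
script_id = collection_id + '__' + 'test'  # script IDs start with the collection ID
function_id = script_id + '.' + 'hello'    # function IDs start with the script ID

print(function_id)
```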

Mutual References in Coding

In DataFlux Func scripts, it is allowed to reference another script to achieve code reuse.

Assume there exists a script demo__script_a, which contains a function func_a(). To reference this function in script demo__script_b, the following method can be used:

demo__script_a

def func_a():
    pass

demo__script_b

import demo__script_a

def test():
    return demo__script_a.func_a()

Python's import ... as syntax can also be used:

demo__script_b

import demo__script_a as a

def test():
    return a.func_a()

You can also use the from ... import statement to import only the required functions:

demo__script_b

from demo__script_a import func_a

def test():
    return func_a()

For references between scripts within the same script collection, the script collection ID can be ignored, and the abbreviated form starting with __ (double underscores) can be used:

demo__script_b

from __script_a import func_a

def test():
    return func_a()

Prefer Using Abbreviated Form

Mutual references within a script collection should preferentially use the abbreviated form (i.e., ignoring the script collection ID and starting with __).

This way, even if the entire script collection is cloned and its ID changes, code in the cloned collection still correctly references scripts within that collection.

2. Connectors

Connectors can be created in "Development / Connectors". They are tools provided by DataFlux Func for connecting external systems, and the ID is directly specified by the user upon creation.

Actually, writing Python code in DataFlux Func does not differ much from standard Python. Developers can completely ignore connectors and connect to external systems directly in their code.

However, for external systems that use connection pools, connectors provide built-in pools that keep connections alive across repeated function executions, avoiding repeatedly opening and closing connections to the external system.

Assume the user has already configured a connector with the ID mysql. The code to obtain the operation object of this connector is as follows:

Python
mysql = DFF.CONN('mysql')

Different connectors have different operation methods and parameters. For details, see "Script Development / Connector Object DFF.CONN".

3. Environment Variables

Environment variables can be created in "Development / Environment Variables". They are simple Key-Value configuration retrieval tools provided by DataFlux Func, and the ID is directly specified by the user upon creation.

Environment variables are particularly suitable for scenarios where the same set of code runs in different environments.

If the system accessed by the script distinguishes between testing and production environments, environment variables can be set to switch between testing and production environments without changing the code.

Assume the user has already configured an environment variable with the ID api_endpoint. The code to retrieve the value of this environment variable is as follows:

Python
api_endpoint = DFF.ENV('api_endpoint')
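As a standalone illustration of the environment-switching idea (a plain dict stands in for DFF.ENV here, and the endpoint values are assumptions):

```python
# Simulate DFF.ENV with a plain dict so the sketch runs outside DataFlux Func;
# in a test deployment this value would point at the test system
env = {'api_endpoint': 'https://test.example.com'}

def build_url(path, env=env):
    # The same code targets test or production depending on configuration
    return env['api_endpoint'].rstrip('/') + '/' + path.lstrip('/')

print(build_url('v1/status'))
```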

4. Synchronous API (Old Version: Authorized Link)

Synchronous APIs can be created in "Manage / Synchronous API". They are a common way for external systems to call DataFlux Func functions: the call executes synchronously, and once the function finishes, its result is returned directly to the caller.

After creating a synchronous API for a function, several calling methods are supported.

Synchronous APIs support both GET and POST methods. Both methods support parameter passing in "simplified form" and "standard form".

Additionally, the "simplified" form of POST supports file uploads. Below is the feature support list for various calling methods:

Calling Method        Passing kwargs  kwargs Parameter Type  Passing options  File Upload    Arbitrary Format Body (since 1.6.9)
GET Simplified Form   Supported       Strings only           Not supported    Not supported  Not supported
GET Standard Form     Supported       JSON data types        Supported        Not supported  Not supported
POST Simplified Form  Supported       Strings only           Not supported    Supported      Supported
POST Standard Form    Supported       JSON data types        Supported        Not supported  Not supported

Different passing methods may impose restrictions on parameter types

For calling methods where parameters in kwargs can only be strings, type conversion of parameters is required within the function. In the synchronous API list, you can click "API Call Example" to view specific calling methods.
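For instance, such a function can convert its parameters itself (a standalone sketch; my_sum and its parameters are hypothetical, and the @DFF.API decorator is omitted so the snippet runs outside DataFlux Func):

```python
# Hypothetical function: simplified-form calls deliver every kwarg as a string
def my_sum(a, b):
    # Convert before doing arithmetic; int() raises ValueError on bad input
    return int(a) + int(b)

print(my_sum('100', '23'))
```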

Assume the existence of the following function:

Python
@DFF.API('My Function')
def my_func(x, y):
    pass

Suppose a synchronous API with the ID auln-xxxxx has been created for this function, and the parameters to pass are x=100 (an integer) and y="hello" (a string).

Thus, the various calling methods are as follows:

GET Simplified Form Passing Parameters

If the function parameters are relatively simple, you can use the GET simplified form to pass parameters, making the interface more intuitive.

Since URL parameters cannot distinguish between the string "100" and the integer 100, the parameters received by the function when called are all strings. The function needs to perform type conversion on the parameters itself.

Text Only
GET /api/v1/al/auln-xxxxx/simplified?x=100&y=hello

For ease of reading, the example shows the content before URL encoding. Actual URL parameters must be URL-encoded.
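The encoding step can be sketched with Python's standard library (the path is the placeholder API ID from above):

```python
from urllib.parse import urlencode

# Build the simplified-form query string; urlencode handles URL encoding
query = urlencode({'x': 100, 'y': 'hello'})
url = '/api/v1/al/auln-xxxxx/simplified?' + query
print(url)
```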

GET Standard Form Passing Parameters

In some cases, if a POST request cannot be sent, the GET method can also be used to call the interface.

When passing parameters using the GET standard form, the entire kwargs is serialized into JSON and passed as a URL parameter.

Since the parameters are actually sent in JSON format, the original types of the parameters are retained, and the function does not need to perform type conversion.

In this example, the x parameter received by the function is an integer, requiring no type conversion.

Text Only
GET /api/v1/al/auln-xxxxx?kwargs={"x":100,"y":"hello"}

For ease of reading, the example shows the content before URL encoding. Actual URL parameters must be URL-encoded.
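Building such a URL in Python might look like this (a sketch; the API ID is the placeholder from above):

```python
import json
from urllib.parse import urlencode, parse_qs

kwargs = {'x': 100, 'y': 'hello'}
# Serialize the whole kwargs dict to JSON and URL-encode it as one parameter
query = urlencode({'kwargs': json.dumps(kwargs, separators=(',', ':'))})
url = '/api/v1/al/auln-xxxxx?' + query

# Round trip: the receiver decodes the parameter and the original types survive
decoded = json.loads(parse_qs(query)['kwargs'][0])
print(decoded)
```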

POST Simplified Form Passing Parameters

In some cases, if an HTTP request with a JSON body cannot be sent, parameters can be passed in a manner similar to a Form submission, where each field name is the parameter name.

Since Form submissions cannot distinguish between the string "100" and the integer 100, the parameters received by the function when called are all strings. The function needs to perform type conversion on the parameters itself.

Text Only
POST /api/v1/al/auln-xxxxx/simplified
Content-Type: application/x-www-form-urlencoded

x=100&y=hello

Additionally, the POST simplified form supports file uploads (the parameter/field name must be files), which must be sent as multipart/form-data.

Below is an HTML page code example:

HTML
<html>
    <body>
        <h1>File Upload</h1>
        <input id="file" type="file" name="files" required />
        <input id="submit" type="submit" value="Upload"/>
    </body>
    <script>
        // Synchronous API address (if this page and DataFlux Func are not on the same domain, write the full http://domain:port/api/v1/al/auln-xxxxx/simplified)
        // Note: Uploading files must use the simplified form synchronous API
        var AUTH_LINK_URL = '/api/v1/al/auln-xxxxx/simplified';

        document.querySelector('#submit').addEventListener('click', function(event) {
            // After clicking the upload button, generate a FormData object and send it as the request body
            var data = new FormData();
            data.append('x', '100');
            data.append('y', 'hello');
            data.append('files', document.querySelector('#file').files[0]);

            var xhr = new XMLHttpRequest();
            xhr.open('POST', AUTH_LINK_URL);
            xhr.send(data);
        });
    </script>
</html>

POST Standard Form Passing Parameters

The POST standard form parameter passing is the most common calling method.

Since parameters are sent in JSON format through the request body, the original types of the parameters are retained, and the function does not need to perform type conversion.

In this example, the x parameter received by the function is an integer, requiring no type conversion.

Text Only
POST /api/v1/al/auln-xxxxx
Content-Type: application/json

{
    "kwargs": {
        "x": 100,
        "y": "hello"
    }
}
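As a sketch of how a client might build this request body in Python (build-only; actually sending it, e.g. with requests.post, is left out so the snippet runs standalone):

```python
import json

# Standard-form body: the kwargs dict is serialized as JSON,
# so a client would send it with Content-Type: application/json
payload = {'kwargs': {'x': 100, 'y': 'hello'}}
body = json.dumps(payload)

# The receiving function gets the original types back unchanged
received = json.loads(body)['kwargs']
print(received)
```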

5. Asynchronous API (Old Version: Batch Processing)

Asynchronous APIs can be created in "Manage / Asynchronous API". It is another way for external systems to call functions in DataFlux Func.

The difference from a "Synchronous API" is that an asynchronous API returns immediately after being called and the backend processes the task asynchronously, so the caller does not need to wait for the result. Apart from this, it is called in the same way as a "Synchronous API".

Asynchronous APIs allow longer function execution times (not longer API response times), making them well suited to time-consuming processing.

6. Scheduled Tasks (Old Version: Automatic Trigger Configuration)

Scheduled tasks can be created in "Manage / Scheduled Tasks" to let DataFlux Func automatically call functions periodically.

After creating a scheduled task for a function, the function will execute according to the specified Crontab expression without requiring external calls.

Because of this, the function's parameters must already be satisfiable at execution time, i.e., one of the following must hold:

  1. The function requires no input parameters.
  2. The function takes parameters, but they are all optional.
  3. The function has mandatory parameters, and specific values for them are configured in the scheduled task.
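For example, a function suitable for a scheduled task might give every parameter a default (a standalone sketch; the function name and parameters are hypothetical, and the @DFF.API decorator is omitted so it runs outside DataFlux Func):

```python
# All parameters are optional, so a scheduled task can call this
# function without configuring any parameter values
def sync_data(table='my_table', limit=100):
    return f'synced up to {limit} rows from {table}'

print(sync_data())
```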

Distinguishing Function Execution Based on Execution Features

If a function is configured both as a "Scheduled Task" and with other execution features, and you want different behavior depending on how it was invoked, you can check the built-in variable _DFF_CRONTAB:

Python
@DFF.API('My Function')
def my_func(x, y):
    result = x + y

    if _DFF_CRONTAB:
        # Output logs only in scheduled tasks
        print(f'x + y = {result}')

    return

7. API Authentication

Added in version 1.3.2

For HTTP APIs generated by "Synchronous / Asynchronous APIs", additional interface authentication can be added.

Currently supported interface authentications include:

Authentication Type      Description
Fixed Field              Verifies that the request's Header, Query, or Body contains a specific field with a specific value
HTTP Basic               Standard HTTP Basic authentication (browsers pop up a login box)
HTTP Digest              Standard HTTP Digest authentication (browsers pop up a login box)
Authentication Function  Uses a custom-written function as the authentication function

Users can add authentication configurations in "Manage / API Authentication" and then specify the added authentication configurations in "Synchronous / Asynchronous API Configuration".

If high security is required, make sure to access the interface over HTTPS

Fixed Field Authentication

Fixed field authentication is the simplest method: the client and DataFlux Func agree on a specific field and value placed somewhere in the request (Header, Query, or Body), and the client attaches this content on every call to authenticate.

Assume that every request must carry the header x-auth-token: my-auth-token. The call is then authenticated as follows:

Text Only
GET /api/v1/al/auln-xxxxx
x-auth-token: my-auth-token

When multiple fixed field authentications are configured, matching any one of them is sufficient

Fields used for authentication in the Query or Body are automatically removed after authentication and are not passed to the function
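Conceptually, the server-side check amounts to a simple comparison (a standalone sketch, not DataFlux Func's internal implementation; the field name and token are the example values above):

```python
def check_fixed_field(headers, field='x-auth-token', expected='my-auth-token'):
    # Header names arrive lowercased; compare the agreed field's value
    return headers.get(field.lower()) == expected

print(check_fixed_field({'x-auth-token': 'my-auth-token'}))
```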

HTTP Basic / HTTP Digest

Authentication methods directly supported by browsers.

Interfaces authenticated using this method will prompt the user to enter a username/password when accessed directly in the browser address bar.

If you need to access programmatically, please refer to the following code:

Python
import requests
from requests.auth import HTTPBasicAuth, HTTPDigestAuth

# HTTP Basic Authentication
resp = requests.get(url_1, auth=HTTPBasicAuth('user', 'password'))

# HTTP Digest Authentication
resp = requests.get(url_2, auth=HTTPDigestAuth('user', 'password'))

Authentication Function

If the interface authentication method is complex or special (such as needing to integrate with business systems), you can choose to write your own function for authentication.

The function used for authentication must take exactly one parameter, req (the request), and return True or False to indicate success or failure.

The parameter req is a dict with the following structure:

Field Name   Type  Description
method       str   Request method, uppercase (e.g. "GET", "POST")
originalUrl  str   Original request URL, including the part after "?" (e.g. /api/v1/al/auln-xxxxx?q=1)
url          str   Request URL, excluding the part after "?" (e.g. /api/v1/al/auln-xxxxx)
headers      dict  Request headers; field names are all lowercase
query        dict  Request query; field names and values are all strings
body         dict  Request body
hostname     str   Hostname accessed by the request, without the port number (e.g. example.com)
ip           str   Client IP (meaningful only with correct Nginx / Alibaba Cloud SLB configuration)
ips          list  Client IP plus all intermediate proxy server IPs (meaningful only with correct Nginx / Alibaba Cloud SLB configuration)
ips[#]       str   An intermediate proxy server IP
xhr          bool  Whether the request is an AJAX request
Example
@DFF.API('Authentication Function')
def my_auth_func(req):
    # Use .get() so a missing header fails authentication instead of raising
    return req['headers'].get('x-auth-token') == 'my-auth-token'
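A slightly fuller authentication function could also use other req fields, such as the client IP (a standalone sketch: the decorator is omitted so it runs outside DataFlux Func, and the token and network range are assumptions):

```python
import ipaddress

ALLOWED_NET = ipaddress.ip_network('10.0.0.0/8')  # hypothetical internal range

def my_auth_func(req):
    # Reject unless the agreed token header is present and correct
    if req['headers'].get('x-auth-token') != 'my-auth-token':
        return False
    # Additionally require the client IP to come from the internal network
    return ipaddress.ip_address(req['ip']) in ALLOWED_NET

print(my_auth_func({'headers': {'x-auth-token': 'my-auth-token'}, 'ip': '10.1.2.3'}))
```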