Script Development / Basic Concepts
DataFlux Func introduces several concepts of its own; this document explains them.
1. Script Collections, Scripts, and Functions
Script collections, scripts, and functions can be created in "Development / Script Library". These are core concepts of DataFlux Func, and the IDs are directly specified by users when creating/writing code.
- A "script collection" is a set of several scripts. The ID is directly specified by the user upon creation and can only contain scripts.
- A "script" refers to the Python script itself, which must belong to a certain script collection. The ID is directly specified by the user upon creation.
- A "function" in DataFlux Func specifically refers to the top-level function decorated with the
@DFF.API(...)
decorator, which can be called as an entry point by synchronous/asynchronous APIs, scheduled tasks, and other modules.
A script collection is not a folder
A script collection is similar to a folder, but this "folder" has nothing to do with ordinary Python source folders.
When coding in DataFlux Func, you will frequently encounter the IDs of script collections, scripts, and functions, and these IDs are closely related.
Relationship Between Script Collection, Script, and Function IDs
According to the hierarchy of script collections, scripts, and functions, the ID of a lower-level concept always includes the ID of its upper-level concept.
Assume there exists a script collection with the ID `demo`. Then the IDs of all scripts belonging to this collection must start with `demo__` (double underscore).

Furthermore, assume that under this script collection there is a script with the ID `demo__test`, which contains a function `def hello(...)`. The ID of this function is then `demo__test.hello`.
The following table shows example IDs:

Concept | Example ID |
---|---|
Script Collection | `demo` |
Script | `demo__test` |
Function | `demo__test.hello` |
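To make the composition rule concrete, the sketch below assembles the example IDs from the table (`collection_id`, `script_id`, and `func_id` are just illustrative variables, not DataFlux Func APIs):

```python
collection_id = 'demo'

# A script ID is the collection ID, a double underscore, then the script name.
script_id = collection_id + '__' + 'test'

# A function ID is the script ID, a dot, then the function name.
func_id = script_id + '.' + 'hello'

print(script_id)  # demo__test
print(func_id)    # demo__test.hello
```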
Mutual References in Coding
In DataFlux Func scripts, one script may reference another to reuse code.
Assume there exists a script `demo__script_a`, which contains a function `func_a()`. To reference this function in script `demo__script_b`, write:

`demo__script_a`:

```python
def func_a():
    return 'Hello from Script A'
```

`demo__script_b`:

```python
import demo__script_a

def func_b():
    return demo__script_a.func_a()
```
Python's `as` keyword can be used in the same way:

`demo__script_b`:

```python
import demo__script_a as script_a

def func_b():
    return script_a.func_a()
```
You can also use the `from ... import` statement to import only the functions you need:

`demo__script_b`:

```python
from demo__script_a import func_a

def func_b():
    return func_a()
```
For references between scripts in the same script collection, the script collection ID can be omitted, using the abbreviated form that starts with `__` (double underscore):

`demo__script_b`:

```python
import __script_a

def func_b():
    return __script_a.func_a()
```
Prefer Using Abbreviated Form
Mutual references within a script collection should prefer the abbreviated form (i.e., omit the script collection ID and start with `__`).
This way, even if the entire script collection is cloned and the script collection ID changes, the code within the new cloned script collection can still correctly reference the scripts within this collection.
2. Connectors
Connectors can be created in "Development / Connectors". They are tools provided by DataFlux Func for connecting external systems, and the ID is directly specified by the user upon creation.
Actually, writing Python code in DataFlux Func does not differ much from standard Python. Developers can completely ignore connectors and connect to external systems directly in their code.
However, connectors provide built-in connection pools: for external systems that support pooling, connections are kept alive across repeated function executions, avoiding repeatedly opening and closing connections to the external system.
Assume the user has already configured a connector with the ID `mysql`. The operation object for this connector is obtained as follows:

```python
helper = DFF.CONN('mysql')
```
Different connectors expose different operation methods and parameters; for details, see Script Development / Connector Object DFF.CONN.
3. Environment Variables
Environment variables can be created in "Development / Environment Variables". They are simple Key-Value configuration retrieval tools provided by DataFlux Func, and the ID is directly specified by the user upon creation.
Environment variables are particularly suitable for scenarios where the same set of code runs in different environments.
If the system accessed by the script distinguishes between testing and production environments, environment variables can be set to switch between testing and production environments without changing the code.
Assume the user has already configured an environment variable with the ID `api_endpoint`. Its value is retrieved as follows:

```python
value = DFF.ENV('api_endpoint')
```
4. Synchronous API (Old Version: Authorized Link)
Synchronous APIs can be created in "Manage / Synchronous API". They are a common way for external systems to call functions in DataFlux Func: the call executes synchronously, and once the function finishes, the result is returned directly to the caller.
After a synchronous API is created for a function, it can be called in several different ways.
Synchronous APIs support both `GET` and `POST` methods, and both methods support passing parameters in "simplified form" and "standard form". Additionally, the `POST` simplified form supports file uploads. The feature support for each calling method is listed below:
Calling Method | Passing `kwargs` Parameters | `kwargs` Parameter Type | Passing `options` | File Upload | Submitting Arbitrary Format Body (available since 1.6.9) |
---|---|---|---|---|---|
GET Simplified Form | Supported | Strings only | Not supported | Not supported | Not supported |
GET Standard Form | Supported | JSON data types | Supported | Not supported | Not supported |
POST Simplified Form | Supported | Strings only | Not supported | Supported | Supported |
POST Standard Form | Supported | JSON data types | Supported | Not supported | Not supported |
Different passing methods may impose restrictions on parameter types
For calling methods where parameters in kwargs can only be strings, type conversion of parameters is required within the function. In the synchronous API list, you can click "API Call Example" to view specific calling methods.
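For example, a function receiving simplified-form parameters can convert them itself. A hypothetical sketch (`add_one` is an illustrative function, not part of DataFlux Func):

```python
def add_one(x):
    # Simplified-form calls deliver every kwargs value as a string,
    # so convert explicitly before doing arithmetic.
    return int(x) + 1

print(add_one('100'))  # 101
```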
Assume the following function exists (an illustrative sketch; the function body is not significant):

```python
@DFF.API('Test Function')
def hello(x, y):
    return {'x': x, 'y': y}
```
For this function, assume the created synchronous API ID is `auln-xxxxx`, and the parameters to pass are `x=100` (integer) and `y="hello"` (string).
The various calling methods are then as follows:
GET Simplified Form Passing Parameters
If the function parameters are relatively simple, the `GET` simplified form can be used to pass parameters, making the interface more intuitive.
Since URL parameters cannot distinguish the string `"100"` from the integer `100`, all parameters received by the function are strings; the function must perform type conversion itself.
```text
GET /api/v1/al/auln-xxxxx?x=100&y=hello
```
For ease of reading, the example shows the content before URLEncode. Actual URL parameters need to be URLEncoded.
GET Standard Form Passing Parameters
In some cases, if a POST request cannot be sent, the GET method can also be used to call the interface.
When passing parameters in the `GET` standard form, the entire `kwargs` is serialized to JSON and passed as a single URL parameter.
Since the parameters are actually sent in JSON format, the original types of the parameters are retained, and the function does not need to perform type conversion.
In this example, the `x` parameter received by the function is an integer and requires no type conversion.
```text
GET /api/v1/al/auln-xxxxx?kwargs={"x":100,"y":"hello"}
```
For ease of reading, the example shows the content before URLEncode. Actual URL parameters need to be URLEncoded.
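The URL for a standard-form `GET` call can be assembled programmatically. A minimal sketch, assuming only what is described above (the whole `kwargs` dict is JSON-serialized into a single `kwargs` URL parameter):

```python
import json
from urllib.parse import urlencode

kwargs = {'x': 100, 'y': 'hello'}

# Serialize the whole kwargs dict to JSON, then URL-encode it
# as a single "kwargs" query parameter.
query = urlencode({'kwargs': json.dumps(kwargs)})
url = '/api/v1/al/auln-xxxxx?' + query
```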
POST Simplified Form Passing Parameters
In some cases, if an HTTP request with a JSON body cannot be sent, parameters can be passed in a manner similar to a Form submission, where each field name is the parameter name.
Since Form submissions cannot distinguish the string `"100"` from the integer `100`, all parameters received by the function are strings; the function must perform type conversion itself.
```text
POST /api/v1/al/auln-xxxxx
Content-Type: application/x-www-form-urlencoded

x=100&y=hello
```
Additionally, the `POST` simplified form supports file uploads (the parameter/field name must be `files`), which must be submitted as `multipart/form-data`.
Below is an HTML page code example:
```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Upload Example</title>
</head>
<body>
  <!-- A minimal sketch: the file field name must be "files",
       and the form must be submitted as multipart/form-data -->
  <form method="post"
        action="/api/v1/al/auln-xxxxx"
        enctype="multipart/form-data">
    <input type="text" name="x" value="100">
    <input type="text" name="y" value="hello">
    <input type="file" name="files">
    <button type="submit">Submit</button>
  </form>
</body>
</html>
```
POST Standard Form Passing Parameters
The `POST` standard form is the most common calling method.
Since parameters are sent in JSON format through the request body, the original types of the parameters are retained, and the function does not need to perform type conversion.
In this example, the `x` parameter received by the function is an integer and requires no type conversion.
```text
POST /api/v1/al/auln-xxxxx
Content-Type: application/json

{
    "kwargs": {
        "x": 100,
        "y": "hello"
    }
}
```
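The request body for a standard-form `POST` call can likewise be built in code. A small sketch, assuming only the top-level `kwargs` key described above:

```python
import json

# The function's parameters go under the top-level "kwargs" key.
body = json.dumps({'kwargs': {'x': 100, 'y': 'hello'}})
```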
5. Asynchronous API (Old Version: Batch Processing)
Asynchronous APIs can be created in "Manage / Asynchronous API". They are another way for external systems to call functions in DataFlux Func.
The difference from synchronous APIs is that an asynchronous API returns immediately after being called and the backend processes the task asynchronously; the caller does not wait for the result. Apart from this, the calling method is the same as for synchronous APIs.
Asynchronous APIs provide longer function execution times (not API response times), making them very suitable for executing time-consuming asynchronous processing.
6. Scheduled Tasks (Old Version: Automatic Trigger Configuration)
Scheduled tasks can be created in "Manage / Scheduled Tasks" to let DataFlux Func automatically call functions periodically.
After creating a scheduled task for a function, the function will execute according to the specified Crontab expression without requiring external calls.
Because of this, all parameters of the executed function must already be satisfied, i.e., one of the following must hold:
- The function takes no parameters.
- The function takes parameters, but all of them are optional.
- The function requires mandatory parameters, and specific values are configured for them in the scheduled task.
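For instance, a function shaped like the hypothetical sketch below fits the second case: it declares parameters, but all of them have defaults, so a scheduled task can invoke it without arguments:

```python
def cleanup(limit=100, dry_run=False):
    # All parameters are optional, so the function can run
    # with no arguments when triggered on a schedule.
    return {'limit': limit, 'dry_run': dry_run}

print(cleanup())  # {'limit': 100, 'dry_run': False}
```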
Distinguishing Function Execution Based on Execution Features
If a function is configured with both "Scheduled Tasks" and other execution channels, and you want to handle the different channels differently, you can check the built-in variable `_DFF_CRONTAB`:
```python
@DFF.API('Test Function')
def hello():
    if _DFF_CRONTAB:
        # _DFF_CRONTAB is truthy when the call was
        # triggered by a scheduled task
        return 'Called by a scheduled task'
    else:
        return 'Called some other way'
```
7. API Authentication
Added in version 1.3.2
For HTTP APIs generated by "Synchronous / Asynchronous APIs", additional interface authentication can be added.
Currently supported interface authentications include:
Authentication Type | Description |
---|---|
Fixed Field | Verifies that the Header, Query, or Body of the request must contain a field with a specific value |
HTTP Basic | Standard HTTP Basic authentication (pops up a login box in browsers) |
HTTP Digest | Standard HTTP Digest authentication (pops up a login box in browsers) |
Authentication Function | Specifies a custom-written function as the authentication function |
Users can add authentication configurations in "Manage / API Authentication" and then specify the added authentication configurations in "Synchronous / Asynchronous API Configuration".
If high security is required, make sure the API is accessed over HTTPS.
Fixed Field Authentication
Fixed field authentication is the simplest authentication method, where the client and DataFlux Func agree to include a specific field and field value somewhere in the request (Header, Query, or Body). This content is attached during each call to complete the authentication.
Assume every request must include the header `x-auth-token="my-auth-token"`. Calls can then be authenticated as follows:

```text
GET /api/v1/al/auln-xxxxx
x-auth-token: my-auth-token
```
When multiple fixed fields are configured, matching any one of them is sufficient to pass authentication.
Fields used for authentication in the Query or Body are automatically removed after authentication and are not passed to the function.
HTTP Basic / HTTP Digest
Authentication methods directly supported by browsers.
Interfaces authenticated using this method will prompt the user to enter a username/password when accessed directly in the browser address bar.
If you need to access programmatically, please refer to the following code:
```python
import requests
from requests.auth import HTTPBasicAuth, HTTPDigestAuth

url = 'https://example.com/api/v1/al/auln-xxxxx'

# HTTP Basic authentication
resp = requests.get(url, auth=HTTPBasicAuth('user', 'password'))

# HTTP Digest authentication
resp = requests.get(url, auth=HTTPDigestAuth('user', 'password'))
```
Authentication Function
If the interface authentication method is complex or special (such as needing to integrate with business systems), you can choose to write your own function for authentication.
The function used for authentication must take exactly one parameter, `req` (the request), and return `True` or `False` to indicate success or failure.
The parameter `req` is a `dict` with the following structure:
Field Name | Type | Description |
---|---|---|
method | str | Request method (uppercase), e.g. `"GET"`, `"POST"` |
originalUrl | str | Original request URL, including the part after `?`, e.g. `/api/v1/al/auln-xxxxx?q=1` |
url | str | Request URL, excluding the part after `?`, e.g. `/api/v1/al/auln-xxxxx` |
headers | dict | Request headers; field names are all lowercase |
query | dict | Request query; field names and values are all strings |
body | dict | Request body |
hostname | str | Hostname accessed by the request, without the port number, e.g. `example.com` |
ip | str | Client IP. Note: meaningful only with correct Nginx or Alibaba Cloud SLB configuration |
ips | list | Client IP plus all intermediate proxy server IPs. Note: meaningful only with correct Nginx or Alibaba Cloud SLB configuration |
ips[#] | str | An intermediate proxy server IP |
xhr | bool | Whether the request is an AJAX request |
Example (an illustrative sketch checking a header against a fixed token):

```python
@DFF.API('Auth Function')
def auth_func(req):
    return req['headers'].get('x-auth-token') == 'my-auth-token'
```