Script Development / Pre-execution Scripts
Added in version 2.6.18
Although DataFlux Func provides a PIP tool for installing third-party Python packages, an installed package may still fail to work because the system libraries it depends on are missing.
For example, to use OpenCV in DataFlux Func, installing opencv-python is not enough; its dependency libraries must also be installed via apt or other methods.
Otherwise, the following issues may occur:
1. Using Pre-execution Scripts to Resolve Dependencies
To address this issue, pre-execution scripts can be provided to DataFlux Func. These scripts will be executed before DataFlux Func starts, allowing the installation of necessary dependencies.
The specific steps are as follows:
1.1 Preparing the Script
The DataFlux Func image is based on Ubuntu:22.04.
To solve the OpenCV dependency issue mentioned above, the following Bash script can be prepared.
Install OpenCV Dependencies
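A minimal sketch of such a script follows. The package names (libgl1, libglib2.0-0) are an assumption based on what opencv-python commonly needs on Ubuntu 22.04, so adjust them to whatever the import error actually reports. The APT_GET variable and the install_opencv_deps function are hypothetical names introduced for illustration. The script installs only when its first argument is worker, because Python code runs only in the Worker.

```shell
#!/bin/bash
# prepare-for-opencv.sh: sketch of a pre-execution script for OpenCV.
# ASSUMPTION: the package names below are libraries opencv-python commonly
# needs on Ubuntu 22.04; they are not taken from the DataFlux Func docs.
set -eu

OPENCV_APT_PACKAGES="libgl1 libglib2.0-0"
# Overridable so the script can be dry-run, e.g. APT_GET=echo.
APT_GET="${APT_GET:-apt-get}"

install_opencv_deps() {
    # Avoid interactive prompts while the container is starting.
    export DEBIAN_FRONTEND=noninteractive
    $APT_GET update
    # Unquoted on purpose: each package name becomes its own argument.
    $APT_GET install -y --no-install-recommends $OPENCV_APT_PACKAGES
}

# DataFlux Func passes the role as $1; Python code runs only in the Worker.
if [ "${1:-}" = "worker" ]; then
    install_opencv_deps
fi
```

Because of `set -e`, a failed apt-get call makes the script exit with a non-zero status instead of continuing silently.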
Save this as a file named prepare-for-opencv.sh.
The file name can be arbitrary, but it must end with .sh. To avoid unnecessary issues, do not use non-ASCII characters or special symbols in the file name.
Differentiating Execution Environments
DataFlux Func consists of a Server and a Worker.
- The Server is primarily an HTTP server, providing web pages and HTTP APIs, and does not participate in Python code execution.
- The Worker is the actual Python code execution service.
Therefore, in most cases, pre-execution scripts only need to be executed in the Worker.
To differentiate the environment, read $1 and check whether its value is server or worker.
Beat services, MySQL services, and Redis services do not execute pre-execution scripts.
Reference Bash code for differentiation:
Execute Only in Worker Container
Execute Only in Server Container
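Both cases can be sketched in a single script that branches on $1. The function names (run_worker_setup, run_server_setup, dispatch_by_role) are hypothetical, and the echo lines stand in for real setup commands.

```shell
#!/bin/bash
# Sketch: branch on the role that DataFlux Func passes as $1.
# ASSUMPTION: the function names here are illustrative only.
set -eu

run_worker_setup() {
    # Replace with commands that should run only in the Worker container.
    echo "worker-only setup"
}

run_server_setup() {
    # Replace with commands that should run only in the Server container.
    echo "server-only setup"
}

dispatch_by_role() {
    case "$1" in
        worker) run_worker_setup ;;
        server) run_server_setup ;;
        *)      echo "skipping: unrecognized role '$1'" ;;
    esac
}

# "${1:-}" keeps `set -u` from aborting when no argument is given.
dispatch_by_role "${1:-}"
```

A `case` statement keeps the two branches side by side; a script that needs only one of them can reduce this to a single `if [ "${1:-}" = "worker" ]` test.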
1.2 Uploading the Script
The pre-execution script storage directory is as follows:
Environment | Location |
---|---|
Inside the container | /data/resources/pre-run-scripts/ |
On the host machine | {installation directory}/data/resources/pre-run-scripts/ |
Users can place pre-execution scripts in the host machine directory.
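As a rough illustration, copying a script into the host directory could look like the sketch below. The helper install_pre_run_script and its two arguments are hypothetical; the installation directory must be supplied by the caller, since it depends on how DataFlux Func was deployed. The helper also applies the file name rules from section 1.1.

```shell
#!/bin/bash
# Sketch: copy a pre-execution script into the host-side directory.
# ASSUMPTION: install_pre_run_script is a hypothetical helper, not part of
# DataFlux Func.

install_pre_run_script() {
    src="$1"          # path of the script to install
    install_dir="$2"  # DataFlux Func installation directory on the host
    name="$(basename "$src")"

    # The file name must end with .sh.
    case "$name" in
        *.sh) ;;
        *) echo "error: file name must end with .sh: $name" >&2; return 1 ;;
    esac
    # Reject anything outside a conservative ASCII set.
    case "$name" in
        *[!A-Za-z0-9._-]*)
            echo "error: use only ASCII letters, digits, '.', '_', '-': $name" >&2
            return 1 ;;
    esac

    dest="$install_dir/data/resources/pre-run-scripts"
    mkdir -p "$dest"
    cp "$src" "$dest/$name"
}
```

Usage would be along the lines of `install_pre_run_script ./prepare-for-opencv.sh "/path/to/installation-directory"`.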
Alternatively, users can upload their own pre-execution scripts through the file management feature in DataFlux Func.
1.3 Restarting Func and Verifying
After completing all preparations, restart DataFlux Func.
Run the earlier script that imports OpenCV again; the opencv-python library can now be imported correctly.
2. Pre-execution Script Execution Details
Every time DataFlux Func starts, it first checks if there are any pre-execution scripts.
When pre-execution scripts exist, DataFlux Func executes them in order of their file names. DataFlux Func starts normally only if every script executes successfully.
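The behaviour described above can be pictured with the following sketch. It is an illustration only, not DataFlux Func's actual startup code; run_pre_run_scripts is a hypothetical function, and the exact sort order Func applies is not specified here, so numeric prefixes such as 01- and 02- are a safe way to make the intended order explicit.

```shell
#!/bin/bash
# Sketch of the startup behaviour: run every script in name order and stop
# at the first failure. Not DataFlux Func's actual implementation.

run_pre_run_scripts() {
    dir="$1"   # directory holding the pre-execution scripts
    role="$2"  # "server" or "worker", forwarded to each script as $1

    # The shell expands the glob in sorted order, so scripts run by file name.
    for script in "$dir"/*.sh; do
        [ -e "$script" ] || continue   # no scripts: the pattern stays literal
        echo "running $(basename "$script")"
        if ! bash "$script" "$role"; then
            echo "failed: $(basename "$script")" >&2
            return 1                   # abort so startup does not continue
        fi
    done
}
```

The practical consequence is that every pre-execution script should exit with status 0 on success and a non-zero status on failure.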
To observe the execution of pre-execution scripts, follow the container logs with the following command:
View Container Logs
Using the pre-execution script from above as an example, the output will be as follows:
Execution Logs
As you can see, the pre-execution script executed correctly and installed the required dependency packages.
3. Replacing Python Packages in the Image
In some cases, the package version required by the user's script is incompatible with the package already included in the Func image.
In such cases, the "pre-execution script" can also be used to replace the package, for example:
Install the Latest Version of simplejson Package
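A minimal sketch of such a script is shown below. It assumes that the pip command on PATH belongs to the Python environment DataFlux Func runs on; PIP_CMD and upgrade_python_package are hypothetical names introduced for illustration.

```shell
#!/bin/bash
# Sketch: replace a bundled Python package before DataFlux Func starts.
# ASSUMPTION: `pip` on PATH belongs to the Python environment Func uses.
set -eu

# Overridable so the script can be dry-run, e.g. PIP_CMD=echo.
PIP_CMD="${PIP_CMD:-pip}"

upgrade_python_package() {
    # --upgrade installs the newest release even if a version is present.
    $PIP_CMD install --upgrade "$1"
}

# DataFlux Func passes "server" or "worker" as $1; skip when no role is given.
if [ "$#" -ge 1 ]; then
    upgrade_python_package simplejson
fi
```

Pinning an exact version (for example `simplejson==<version>`) instead of always taking the latest release makes restarts more predictable.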
Since the "pre-execution script" is executed before Func starts, a package installed via PIP affects not only the user's scripts but also DataFlux Func itself.
Therefore, when using this method, make sure that the replaced packages do not break DataFlux Func's own operation.