Script Development / Installing Third-party Packages

DataFlux Func itself depends on a number of third-party packages. Once DataFlux Func is installed, these packages are already present and can be used directly in scripts without additional installation.

The specific list of pre-installed third-party packages can be viewed in the "Manage / PIP Tools" section, where those marked as Yes in the "Built-in" column are the pre-installed third-party packages.
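The same check can also be made from inside a script. The following is a minimal sketch, assuming the Python runtime is version 3.8 or newer (so that `importlib.metadata` is available); the package names are placeholders, not a statement of what DataFlux Func ships with.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "requests" is only an example name; substitute the package you need.
for name in ("requests", "some-package-that-is-not-installed"):
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```

A result of `None` suggests that the package needs to be installed through "Manage / PIP Tools" before the script can import it.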

If users need to install additional third-party packages, they can also do so in the "Manage / PIP Tools" section.

1. Installing Python Packages via the Public Network

If the DataFlux Func instance that needs additional third-party packages can access the public network directly, use the built-in "PIP Tool" to install them.

1.1 Enable "PIP Tool"

Enable "PIP Tool" in the "Manage / Experimental Features" section.

[Screenshot: enable-pip-tool.png]

1.2 Install Third-party Packages

In the "PIP Tool", select a PyPI mirror, enter the package name, and install it.

To install a specific version, enter the package name in the format package==1.2.3.

[Screenshot: install-package.png]

Selecting a Mirror

If your server is located in mainland China, direct access to the official PyPI may be slow or even impossible. In that case, choose one of the following PIP mirrors instead.

Mirror | Measured Download Speed | Notes
Douban Mirror | 10.4 MB/s | Recommended when installing large packages
Tsinghua Mirror | 1.17 MB/s | Default mirror used by PIP Tool; generally recommended due to reliability
Alibaba Cloud Mirror | 187 KB/s | Obvious throttling; previously interrupted for months; generally not recommended

The speeds above were measured on 2023-02-20 with DataFlux Func running on Alibaba Cloud ECS.
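For readers who want to see how a mirror and a pinned version fit together, here is a rough sketch of an equivalent pip invocation. It is not how "PIP Tool" is implemented: the helper `build_pip_command` is hypothetical, and the index URL is an assumption (the commonly published address of the Tsinghua mirror), so confirm it against your own installation.

```python
import sys

# Assumption: the public index URL of the Tsinghua mirror. The URL that
# PIP Tool actually uses may differ.
TSINGHUA_INDEX = "https://pypi.tuna.tsinghua.edu.cn/simple"


def build_pip_command(package, version=None, index_url=TSINGHUA_INDEX):
    """Build (but do not run) a pip install command as an argument list."""
    # A pinned version uses the package==1.2.3 form described above.
    requirement = f"{package}=={version}" if version else package
    return [sys.executable, "-m", "pip", "install",
            "--index-url", index_url, requirement]


# Only prints the command; nothing is installed.
print(" ".join(build_pip_command("package", "1.2.3")))
```

The command is only built and printed here, so the sketch can be tried without changing the environment.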

2. Installing Python Packages Without Public Network Access

If the DataFlux Func server cannot access the external network for various reasons, the usual PIP method cannot install third-party packages.

In this case, another DataFlux Func instance with the same architecture and public network access is needed to install the third-party packages. After installation, copy the entire extra-python-packages directory to the DataFlux Func that has no public network access.

In the following sections, the DataFlux Func with public network access will be referred to as "Online Func," while the one without public network access will be referred to as "Offline Func."

2.1 Install Third-party Packages on Online Func

Refer to section 1 above, "Installing Python Packages via the Public Network."

2.2 Enable "File Manager Module"

Enable the "File Manager Module" in the "Manage / Experimental Features" section.

[Screenshot: enable-file-manager.png]

2.3 Download Third-party Package Directory from Online Func

Find the extra-python-packages directory (all additionally installed Python packages are stored here), select "More / Compress," and download the zip file.

[Screenshot: zip-pkg.png]

[Screenshot: download-pkg.png]
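Before uploading the archive to "Offline Func," it can be worth confirming that the download contains what you expect. The sketch below uses only the standard library; the file name `extra-python-packages.zip` is an assumption about what the downloaded file is called, and the helper `top_level_entries` is hypothetical.

```python
import os
import zipfile


def top_level_entries(zip_path):
    """Return the sorted set of first path components found in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    return sorted({name.split("/", 1)[0] for name in names if name})


if __name__ == "__main__":
    # Hypothetical file name; use the name of the zip file you downloaded.
    zip_path = "extra-python-packages.zip"
    if os.path.exists(zip_path):
        print(top_level_entries(zip_path))
    else:
        print("archive not found:", zip_path)
```

If the expected package directories are missing from the listing, compress and download the directory again rather than uploading an incomplete archive.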

2.4 Upload Third-party Package Directory to Offline Func

On "Offline Func," delete the existing third-party package directory (extra-python-packages), upload the zip file downloaded in the previous step, and extract it.

[Screenshot: delete-pkg.png]

[Screenshot: upload-pkg.png]

[Screenshot: unzip-pkg.png]

[Screenshot: unzip-pkg-done.png]

2.5 Restart Offline Func

Refer to Deployment and Maintenance / Upgrade and Restart

Notes

DataFlux Func runs inside Docker, and the Python runtime environment depends on the container's environment.

Therefore, do not copy Python installations from your local machine or from other, non-DataFlux Func containers into DataFlux Func. Doing so often causes problems that are hard to diagnose.

Additionally, "Online Func" and "Offline Func" must have the same hardware architecture, for example both x86_64 or both aarch64.
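One way to compare the two instances is to run the same small script on each and check that the output matches. This is a minimal sketch: `runtime_fingerprint` is a hypothetical helper, and comparing the Python version is an extra precaution added here (compiled packages are generally built for a specific Python version), not a requirement stated above.

```python
import platform
import sys


def runtime_fingerprint():
    """Collect the values that should match between the two instances."""
    return {
        "machine": platform.machine(),             # e.g. x86_64 or aarch64
        "python": "%d.%d" % sys.version_info[:2],  # major.minor version
    }


print(runtime_fingerprint())
```

If the "machine" values differ between "Online Func" and "Offline Func," the copied extra-python-packages directory should not be expected to work.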

Uploading Wheel packages to the server and installing them directly via PIP works in theory, but in practice it usually fails.

The reason is that an uploaded Wheel package may itself depend on other packages. Uploading only the Wheel package that the code uses directly does not satisfy those dependencies. The more a package does, the more likely it is to depend on other packages, which in turn have dependencies of their own. This can ultimately trap the operator in dependency hell.

Therefore, unless the package to be installed does not depend on any other packages, it is generally not recommended to install Python third-party packages by uploading Wheel packages.
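To get a sense of how many dependencies a package brings along, its declared requirements can be inspected on an instance where it is already installed (for example, "Online Func"). The sketch below assumes Python 3.8 or newer; `declared_requirements` is a hypothetical helper and "requests" is only an example name. It lists direct dependencies only, and each of those may have further dependencies.

```python
from importlib import metadata


def declared_requirements(package_name):
    """Return the requirement strings a distribution declares, or [] if none."""
    try:
        # metadata.requires() returns None when nothing is declared.
        return list(metadata.requires(package_name) or [])
    except metadata.PackageNotFoundError:
        return []


# "requests" is only an example; the list is empty if it is not installed.
for requirement in declared_requirements("requests"):
    print(requirement)
```

An empty list for an installed package suggests that it has no declared dependencies, which is the one case where uploading a single Wheel package is likely to be enough.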