Script Development / Installing Third-party Packages
DataFlux Func itself depends on a number of third-party packages. These packages are installed together with DataFlux Func, so scripts can use them directly without any additional installation.
The list of pre-installed third-party packages can be viewed in the "Manage / PIP Tools" section, where those marked as Yes in the "Built-in" column are the pre-installed third-party packages.
If users need to install additional third-party packages, they can also do so in the "Manage / PIP Tools" section.
1. Installing Python Packages via the Public Network
If the DataFlux Func instance that needs the additional third-party packages can access the public network directly, the packages can be installed with the built-in "PIP Tool".
1.1 Enable "PIP Tool"
Enable "PIP Tool" in the "Manage / Experimental Features" section.
1.2 Install Third-party Packages
In the "PIP Tool", you can select a PyPI mirror, enter the package name, and install it.
If you need to specify a version number, use the format package==1.2.3 for installation.
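After installation, a script can confirm that the package is really available at the expected version. The following is a minimal sketch, assuming the Python runtime inside DataFlux Func is 3.8 or newer (so that importlib.metadata exists); the helper names split_spec and check_installed are hypothetical and not part of DataFlux Func.

```python
from importlib import metadata


def split_spec(spec):
    """Split 'package==1.2.3' into ('package', '1.2.3'); version is None if unpinned."""
    name, sep, version = spec.partition("==")
    return name.strip(), (version.strip() if sep else None)


def check_installed(spec):
    """Return True if the package in spec is installed (at the pinned version, if any)."""
    name, wanted = split_spec(spec)
    try:
        found = metadata.version(name)
    except metadata.PackageNotFoundError:
        # No metadata for this package is visible to the current interpreter.
        return False
    return wanted is None or found == wanted
```

Calling check_installed("package==1.2.3") from a script returns False when the package is missing or installed at a different version.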
Selecting a Mirror
If your server is located in mainland China, direct access to the official PyPI may be slow or even impossible. In such cases, you can choose a different PIP mirror.
Mirror | Measured Download Speed | Notes |
---|---|---|
Douban Mirror | 10.4 MB/s | Recommended when installing large packages |
Tsinghua Mirror | 1.17 MB/s | Default mirror used by PIP Tool; generally recommended due to reliability |
Alibaba Cloud Mirror | 187 KB/s | Obvious throttling; previously interrupted for months; generally not recommended |
The speeds in the table above were measured on 2023-02-20 with DataFlux Func running on Alibaba Cloud ECS.
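For readers who know plain pip, choosing a mirror corresponds roughly to passing a different --index-url. The sketch below only builds the command line and does not run it. It is an illustration, not a description of how the "PIP Tool" works internally. The helper name pip_install_command is hypothetical, and the mirror URLs are assumptions that should be checked against each mirror's own documentation.

```python
import sys

# Assumed index URLs; verify them against each mirror's documentation.
MIRRORS = {
    "official": "https://pypi.org/simple",
    "tsinghua": "https://pypi.tuna.tsinghua.edu.cn/simple",
}


def pip_install_command(spec, mirror="tsinghua"):
    """Build (but do not execute) a pip command that installs spec from the chosen mirror."""
    return [
        sys.executable, "-m", "pip", "install",
        "--index-url", MIRRORS[mirror],
        spec,
    ]
```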
2. Installing Python Packages Without Public Network Access
If the DataFlux Func server cannot access the public network, third-party packages cannot be installed with the usual PIP method.
In this case, another DataFlux Func instance with the same hardware architecture and public network access is needed to install the third-party packages. After installation, copy the entire extra-python-packages directory to the DataFlux Func instance that has no public network access.
In the following sections, the DataFlux Func with public network access will be referred to as "Online Func," while the one without public network access will be referred to as "Offline Func."
2.1 Install Third-party Packages on Online Func
Refer to section 1 above, "Installing Python Packages via the Public Network."
2.2 Enable "File Manager Module"
Enable the "File Manager Module" in the "Manage / Experimental Features" section.
2.3 Download Third-party Package Directory from Online Func
Find the extra-python-packages directory (all additionally installed Python packages are stored here), select "More / Compress," and download the zip file.
2.4 Upload Third-party Package Directory to Offline Func
On "Offline Func," delete the existing extra-python-packages directory, upload the zip file downloaded in the previous step, and extract it.
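To check that the copy is complete, the package list of the extracted directory can be compared between "Online Func" and "Offline Func". The sketch below assumes that extra-python-packages contains the usual *.dist-info metadata folders that pip writes, and that the Python runtime is 3.8 or newer. The helper name list_packages is hypothetical, and the directory path is left as a parameter because its location inside the container is not stated here.

```python
from importlib import metadata


def list_packages(directory):
    """Return {name: version} for every package whose metadata is found in directory."""
    found = {}
    for dist in metadata.distributions(path=[directory]):
        found[dist.metadata["Name"]] = dist.version
    return found
```

Running list_packages on both instances and comparing the two dictionaries shows any package that was lost during upload or extraction.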
2.5 Restart Offline Func
Refer to Deployment and Maintenance / Upgrade and Restart
Notes
DataFlux Func runs inside Docker, and the Python runtime environment depends on the container's environment.
Therefore, do not copy Python installations from your local machine or other non-DataFlux Func containers into DataFlux Func. This often leads to strange issues.
Additionally, "Online Func" and "Offline Func" must run on the same hardware architecture, for example both x86_64 or both aarch64.
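One way to compare the two instances is to run the same small script on each and compare the results. This is a minimal sketch with hypothetical helper names. Comparing the Python major.minor version as well is an extra precaution assumed here, since compiled packages are generally built for a specific Python version; the text above only requires the architecture to match.

```python
import platform
import sys


def runtime_fingerprint():
    """Describe the current runtime: CPU architecture and Python major.minor version."""
    return {
        "machine": platform.machine(),  # e.g. x86_64 or aarch64
        "python": "{}.{}".format(*sys.version_info[:2]),
    }


def same_runtime(a, b):
    """Return True if two fingerprints agree on both architecture and Python version."""
    return a["machine"] == b["machine"] and a["python"] == b["python"]
```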
3. Why Is It Not Recommended to Install Python Packages Directly by Uploading Wheel Packages?
Uploading Wheel packages to the server and installing them directly via PIP works in theory, but in practice it usually fails.
The reason is that the uploaded Wheel package may depend on other packages. Uploading only the Wheel package that the code uses directly does not satisfy those dependencies. The more powerful a package is, the more likely it is to depend on other packages, which in turn have dependencies of their own. This can ultimately trap the operator in dependency hell.
Therefore, unless the package to be installed does not depend on any other packages, it is generally not recommended to install Python third-party packages by uploading Wheel packages.
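Whether a given Wheel package has dependencies can be checked before uploading it. A wheel is a zip archive whose *.dist-info/METADATA file lists direct dependencies as Requires-Dist headers. The sketch below reads them using only the standard library; the helper name wheel_requirements is hypothetical.

```python
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_path):
    """Return the Requires-Dist entries declared in a wheel's METADATA file."""
    with zipfile.ZipFile(wheel_path) as wheel:
        # Every wheel carries exactly one <name>-<version>.dist-info/METADATA file.
        meta_name = next(
            n for n in wheel.namelist() if n.endswith(".dist-info/METADATA")
        )
        text = wheel.read(meta_name).decode("utf-8")
    # METADATA uses an email-style header format; repeated headers come back as a list.
    return Parser().parsestr(text).get_all("Requires-Dist") or []
```

An empty list means the wheel declares no dependencies. A non-empty list shows only the direct dependencies; each of those packages may declare further dependencies of its own.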