Script Development / Install Third-Party Packages

DataFlux Func itself depends on a number of third-party packages. Once DataFlux Func is installed, these packages are already available and can be used directly in scripts without any additional installation.

To see which third-party packages are pre-installed, open 'Manage / PIP Tool'. Packages whose 'Built-in' column shows Yes are pre-installed.
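A script can also check for a package itself before relying on it. The following is a minimal sketch, assuming the script runtime is Python 3.8 or later so that the standard-library importlib.metadata module is available. The helper name installed_version and the package names are illustrative only and are not part of DataFlux Func.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "requests" is only an example name; substitute the package you need.
for name in ("requests", "surely-not-a-real-package-name"):
    version = installed_version(name)
    print(name, "->", version if version else "not installed")
```

If the helper returns None, the package has to be installed through the 'PIP Tool' before the script can import it.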

Additional third-party packages can also be installed from 'Manage / PIP Tool'.

1. Install Python Packages from the Internet

If the DataFlux Func instance that needs the additional third-party packages can access the internet directly, use the built-in 'PIP Tool' to install them.

1.1 Enable 'PIP Tool'

Enable the 'PIP Tool' in 'Manage / Experimental Features'.

enable-pip-tool.png

1.2 Install Third-Party Packages

In the 'PIP Tool', select a PyPI mirror, enter the package name, and install it.

To install a specific version, enter the package in the form package==1.2.3.
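To make the name==version format concrete, here is a small sketch that splits such a string into its two parts. It only illustrates the format. The function split_pinned_spec is a hypothetical helper and says nothing about how the 'PIP Tool' parses its input.

```python
def split_pinned_spec(spec):
    """Split 'name==version' into (name, version); version is None if unpinned."""
    name, sep, version = (part.strip() for part in spec.partition("=="))
    if not name:
        raise ValueError("missing package name")
    if sep and not version:
        raise ValueError("'==' must be followed by a version")
    return name, (version if sep else None)


print(split_pinned_spec("package==1.2.3"))
print(split_pinned_spec("package"))
```

A bare package name carries no version, in which case PIP chooses the version itself.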

install-package.png

Select Mirror

If your server is located in mainland China, access to the official PyPI may be slow or may fail to connect. In this case, choose a different PIP mirror.

Mirror | Actual Download Speed | Description
Douban Mirror | 10.4 MB/s | Recommended for installing large packages
Tsinghua Mirror | 1.17 MB/s | The default mirror used by PIP Tool. Given its reliability, it is recommended for general use
Alibaba Cloud Mirror | 187 KB/s | Heavily rate-limited. It was also unavailable for several months, so it is not recommended

The speeds in the table above were measured on 2023-02-20 from a DataFlux Func instance running on Alibaba Cloud ECS.
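To see what these speeds mean in practice, the sketch below estimates how long a download would take on each mirror. The 50 MB package size is a made-up example, the conversion assumes 1 MB = 1000 KB, and the speeds are the single measurement from the table above, so treat the results as rough orders of magnitude only.

```python
# Speeds from the table above, converted to KB/s (assuming 1 MB = 1000 KB).
MIRROR_SPEED_KB_S = {
    "Douban": 10.4 * 1000,
    "Tsinghua": 1.17 * 1000,
    "Alibaba Cloud": 187,
}


def estimated_seconds(package_size_mb, speed_kb_s):
    """Estimate download time in seconds for a package of the given size."""
    return package_size_mb * 1000 / speed_kb_s


# 50 MB is a hypothetical package size chosen for illustration.
for mirror, speed in MIRROR_SPEED_KB_S.items():
    seconds = estimated_seconds(50, speed)
    print(f"{mirror}: about {seconds:.0f} s for a 50 MB package")
```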

2. Install Python Packages Without Internet Access

If the server hosting DataFlux Func cannot access the internet, the usual PIP method cannot be used to install third-party packages.

In this case, you need a second DataFlux Func instance that has the same hardware architecture and can access the internet. Install the third-party packages there, then copy the entire extra-python-packages directory to the DataFlux Func that cannot access the internet.

In the following text, the DataFlux Func that can access the internet is referred to as 'Online Func', and the DataFlux Func that cannot access the internet is referred to as 'Offline Func'.

2.1 Install Third-Party Packages on Online Func

Follow the steps in section 1, 'Install Python Packages from the Internet', above.

2.2 Enable 'File Service Module'

Enable the 'File Service Module' in 'Manage / Experimental Features'.

enable-file-manager.png

2.3 Download Third-Party Packages Directory from Online Func

On 'Online Func', find the extra-python-packages directory (all additionally installed Python packages are stored here), select 'More / Compress', and download the resulting zip file.

zip-pkg.png

download-pkg.png

2.4 Upload Third-Party Packages Directory to Offline Func

On 'Offline Func', delete the existing extra-python-packages directory, upload the zip file downloaded in the previous step, and extract it.

delete-pkg.png

upload-pkg.png

unzip-pkg.png

unzip-pkg-done.png

2.5 Restart Offline Func

For how to restart, see 'Deployment and Maintenance Guide / Upgrade and Restart'.

Notes

DataFlux Func runs in Docker, so its Python runtime environment depends on the container environment.

Therefore, do not copy Python installations from your local machine or from containers other than DataFlux Func into DataFlux Func. Doing so often causes problems that are hard to diagnose.

Additionally, 'Online Func' and 'Offline Func' must have the same hardware architecture, for example both x86_64 or both aarch64.
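One way to compare the two instances is to run the same short script on each and compare the output. This is a sketch that uses only the Python standard library. The function name runtime_fingerprint is hypothetical, and reporting the Python version is an extra precaution that the text above does not require.

```python
import platform


def runtime_fingerprint():
    """Return the machine architecture and Python version of this runtime."""
    return {
        "machine": platform.machine(),        # for example x86_64 or aarch64
        "python": platform.python_version(),  # extra check, not required above
    }


print(runtime_fingerprint())
```

If the "machine" values differ between 'Online Func' and 'Offline Func', the copied extra-python-packages directory should not be expected to work.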

Uploading Wheel packages to the server and installing them directly with PIP is feasible in theory, but in most cases it fails.

The reason is that an uploaded Wheel package may itself depend on other packages, so uploading only the Wheel package that your code uses directly does not satisfy its dependencies. The more capable a package is, the more dependencies it generally has and the deeper its dependency tree. This can eventually trap the operator in dependency hell.

Therefore, unless the package has no dependencies at all, installing third-party Python packages by uploading Wheel packages is generally not recommended.
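To get a feel for how many dependencies a package declares, you can inspect an already installed package on a machine that has it. The sketch below uses the standard-library importlib.metadata module (Python 3.8 or later is assumed). The helper name declared_dependencies is illustrative, and "pip" is used only because it is commonly present.

```python
from importlib import metadata


def declared_dependencies(package_name):
    """Return the requirement strings a package declares, or [] if none or unknown."""
    try:
        # requires() returns None when the package declares no requirements.
        return list(metadata.requires(package_name) or [])
    except metadata.PackageNotFoundError:
        return []


# "pip" is only an example; substitute the package you plan to upload.
for requirement in declared_dependencies("pip"):
    print(requirement)
```

This lists direct dependencies only. Each of them may declare further dependencies of its own, which is why a single uploaded Wheel package is rarely enough.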