Script Development / Install Third-Party Packages
DataFlux Func itself depends on a number of third-party packages. Once DataFlux Func is installed, these packages can be used directly in scripts without additional installation.
The list of pre-installed third-party packages can be viewed in 'Manage / PIP Tool', where packages whose 'Built-in' column shows Yes are pre-installed.
If you need additional third-party packages, you can also install them in 'Manage / PIP Tool'.
1. Install Python Packages from the Internet
If the DataFlux Func instance that needs the additional packages can access the internet directly, you can use the built-in 'PIP Tool' to install them.
1.1 Enable 'PIP Tool'
Enable the 'PIP Tool' in 'Manage / Experimental Features'.
1.2 Install Third-Party Packages
In the 'PIP Tool', you can select a PyPI mirror, enter the package name, and install it.
If you need a specific version, enter it in the format package==1.2.3.
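After installing, it can be useful to confirm from a script that the expected version is actually present. The following is a minimal sketch using only the Python standard library; the helper names (`parse_pinned`, `installed_version`, `is_satisfied`) are hypothetical and not part of DataFlux Func, and the sketch assumes the installed package is visible on the script's import path.

```python
from importlib import metadata


def parse_pinned(spec):
    """Split a 'package==1.2.3' string into (name, version).

    The version is None when the spec carries no '==' pin.
    """
    name, sep, version = spec.partition("==")
    return name.strip(), (version.strip() if sep else None)


def installed_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def is_satisfied(spec):
    """Check whether the environment satisfies a spec such as 'package==1.2.3'."""
    name, wanted = parse_pinned(spec)
    found = installed_version(name)
    if found is None:
        return False
    # Without a pin, any installed version counts as satisfied.
    return wanted is None or found == wanted
```

Calling `is_satisfied("package==1.2.3")` returns `True` only when that exact version is installed. The sketch handles the `==` form only and does not interpret other version specifiers.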
Select Mirror
If your server is located in mainland China, accessing the official PyPI directly may be slow or may fail to connect. In this case, you can choose a different PIP mirror.
Mirror | Actual Download Speed | Description |
---|---|---|
Douban Mirror | 10.4 MB/s | Recommended for installing large packages |
Tsinghua Mirror | 1.17 MB/s | The default mirror used by PIP Tool. Considering its reliability, it is recommended for general use |
Alibaba Cloud Mirror | 187 KB/s | Significant speed limit. Additionally, Alibaba Cloud Mirror was interrupted for several months, so it is not recommended |
The speeds in the table above were measured on 2023-02-20, using a DataFlux Func instance located on Alibaba Cloud ECS.
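To get a feel for what these speeds mean in practice, the following sketch estimates how long a download of a given size would take on each mirror. It uses only the figures from the table; the function names are hypothetical, the conversion assumes 1 MB = 1000 KB, and real download times will vary with network conditions.

```python
# Measured speeds from the table above, expressed in MB/s.
MIRROR_SPEEDS_MB_PER_S = {
    "Douban Mirror": 10.4,
    "Tsinghua Mirror": 1.17,
    "Alibaba Cloud Mirror": 187 / 1000,  # 187 KB/s, assuming 1 MB = 1000 KB
}


def estimated_seconds(size_mb, speed_mb_per_s):
    """Estimate download time as size divided by sustained speed."""
    return size_mb / speed_mb_per_s


def estimate_all(size_mb):
    """Return the estimated download time in seconds for every mirror."""
    return {
        name: round(estimated_seconds(size_mb, speed), 1)
        for name, speed in MIRROR_SPEEDS_MB_PER_S.items()
    }
```

For a 100 MB package, `estimate_all(100)` gives roughly 10 seconds for the Douban Mirror, about 85 seconds for the Tsinghua Mirror, and close to 9 minutes for the Alibaba Cloud Mirror, which is why the fastest mirror matters mostly for large packages.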
2. Install Python Packages Without Internet Access
If the server where DataFlux Func is located cannot access the internet for various reasons, the usual PIP method cannot be used to install third-party packages.
In this case, you need a second DataFlux Func instance that has the same architecture and can access the internet. Install the third-party packages there, then copy the entire extra-python-packages directory to the DataFlux Func that cannot access the internet.
In the following text, the DataFlux Func that can access the internet is referred to as 'Online Func', and the DataFlux Func that cannot access the internet is referred to as 'Offline Func'.
2.1 Install Third-Party Packages on Online Func
For this step, follow section 1 above, 'Install Python Packages from the Internet'.
2.2 Enable 'File Service Module'
Enable the 'File Service Module' in 'Manage / Experimental Features'.
2.3 Download Third-Party Packages Directory from Online Func
Find the extra-python-packages directory (all additional Python packages are stored here), select 'More / Compress', and download the zip file.
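Before uploading the zip file anywhere, you may want to check which packages it actually contains. The sketch below lists them by looking for `*.dist-info` directories inside the archive. It assumes the packages were installed by pip, which writes a `name-version.dist-info` directory per package; the function name and the zip file path are hypothetical.

```python
import zipfile


def packages_in_archive(zip_path):
    """Return sorted 'name-version' strings, one per *.dist-info directory in the zip."""
    found = set()
    with zipfile.ZipFile(zip_path) as archive:
        for entry in archive.namelist():
            # Zip entries always use '/' as the path separator.
            for part in entry.split("/"):
                if part.endswith(".dist-info"):
                    found.add(part[: -len(".dist-info")])
    return sorted(found)
```

For example, `packages_in_archive("extra-python-packages.zip")` would return one entry per installed package, which makes it easy to spot a missing dependency before the archive reaches the Offline Func.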
2.4 Upload Third-Party Packages Directory to Offline Func
Go to 'Offline Func', delete the existing third-party packages directory (extra-python-packages), upload the zip file downloaded in the previous step, and extract it.
2.5 Restart Offline Func
For this step, see 'Deployment and Maintenance Guide / Upgrade and Restart'.
Notes
DataFlux Func runs in Docker, and the Python runtime environment also depends on the container environment.
Therefore, do not copy Python installations from your local machine or from other non-DataFlux Func containers into DataFlux Func. Doing so often leads to problems that are hard to diagnose.
Additionally, 'Online Func' and 'Offline Func' must have the same hardware architecture, for example both x86_64 or both aarch64.
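One way to compare the two environments is to run the same small script on both and compare the results. The following is a minimal sketch using the standard library; the function names are hypothetical. Comparing the Python version is an extra precaution added here and is not a requirement stated above.

```python
import platform
import sys


def runtime_fingerprint():
    """Collect the properties that should match between Online Func and Offline Func."""
    return {
        # Hardware architecture as reported by the OS, e.g. 'x86_64' or 'aarch64'.
        "machine": platform.machine(),
        # Major.minor Python version (an additional check, not required by the text).
        "python": "{}.{}".format(*sys.version_info[:2]),
    }


def mismatches(online, offline):
    """Return the sorted keys whose values differ between two fingerprints."""
    keys = set(online) | set(offline)
    return sorted(k for k in keys if online.get(k) != offline.get(k))
```

Run `runtime_fingerprint()` on each instance, then pass both results to `mismatches`. An empty list means the two environments agree on the checked properties.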
3. Why is it Not Recommended to Install Python Packages Directly by Uploading Wheel Packages?
Uploading Wheel packages to the server and installing them directly via PIP is feasible in theory, but in practice it usually fails.
The reason is that the uploaded Wheel package itself may depend on other packages. Simply uploading the Wheel package used directly in the code cannot satisfy the dependency relationships. The more powerful the package, the more dependent packages it generally has, and the deeper the dependency hierarchy. This can eventually trap the operator in dependency hell.
Therefore, unless the package to be installed does not depend on any other packages, it is generally not recommended to use the method of uploading Wheel packages to install third-party Python packages.
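To see how deep the dependencies of a package go, you can inspect an environment where it is already installed. The sketch below walks the declared requirements using the standard library's `importlib.metadata`; the function names are hypothetical, optional extras are skipped, and the requirement-name parsing is a simplification rather than a full requirement parser.

```python
import re
from importlib import metadata

# Matches the leading distribution name of a requirement string.
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")
# Requirements guarded by an 'extra == ...' marker are optional.
_EXTRA = re.compile(r"extra\s*==")


def direct_dependencies(package):
    """Return the names of the packages that `package` declares as required."""
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []
    names = []
    for requirement in requirements:
        if _EXTRA.search(requirement):
            continue
        match = _NAME.match(requirement)
        if match:
            names.append(match.group(1))
    return names


def dependency_tree(package, lookup=direct_dependencies):
    """Collect every direct and transitive dependency of `package` as a set of lowercase names."""
    seen = set()
    stack = [package]
    while stack:
        current = stack.pop()
        for dependency in lookup(current):
            key = dependency.lower()
            if key not in seen:
                seen.add(key)
                stack.append(dependency)
    return seen
```

Each name in the resulting set corresponds to at least one more Wheel file that would have to be uploaded and installed in the right order, which is the effort the copy-the-directory approach in section 2 avoids.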