Troubleshooting / Containers Not Running Properly
1. Containers Restarting Repeatedly in Docker Stack Environment
This issue is generally caused by incorrect configuration, firewall settings, or various whitelist configurations.
Specific manifestations:
- Unable to open the page using a browser.
- Running `sudo docker ps -a` to view the container list shows that the containers are restarting repeatedly.
- Running `curl http://localhost:8088` on the deployment server returns the error `curl: (7) Failed to connect to localhost port 8088: Connection refused`.
- Error stack information is continuously output in the log files.
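The second symptom can be checked quickly by filtering the container list for a restarting status. The following is a minimal sketch, not part of the official tooling: the helper name `list_restarting` is hypothetical, and it assumes the status column begins with `Restarting` for affected containers.

```bash
# Print the names of containers whose status begins with "Restarting".
# Input: one "NAME<TAB>STATUS" pair per line on stdin.
list_restarting() {
  awk -F '\t' '$2 ~ /^Restarting/ { print $1 }'
}

# Typical use on the deployment server (requires Docker):
#   sudo docker ps -a --format '{{.Names}}\t{{.Status}}' | list_restarting
```

Keeping the filter separate from the `docker` call makes it possible to try it on saved output as well as on a live server.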
Possible causes and solutions:
| Possible Cause | Solution |
|---|---|
| Manual configuration changes contain errors | Check the modified configuration files for valid YAML syntax and correct database connection details. |
| External servers are specified in the configuration but the network is unreachable | Check firewall settings, Alibaba Cloud security group configurations, database connection whitelist configurations, etc. |
| Compatibility issues with the operating system | See "Compatibility Issue with OS" below |
| Redis does not support the current system's page size | See "Unsupported System Page Size" below |
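For the "network is unreachable" case, a plain TCP connection test from the deployment server can help separate firewall or whitelist problems from configuration mistakes. This is a rough sketch that assumes `bash` and the coreutils `timeout` command are available; the function name `check_tcp` and the host in the usage comment are hypothetical.

```bash
# Return 0 if a TCP connection to HOST:PORT succeeds within 3 seconds.
# Relies on bash's /dev/tcp pseudo-device and the coreutils `timeout` command.
check_tcp() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# Example (hypothetical database host and port):
#   check_tcp db.example.internal 3306 && echo reachable || echo unreachable
```

A successful connection only shows that the port is open. Database-level whitelists can still reject the login afterwards.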
Compatibility Issue with OS
If `docker logs {Server Container ID}` shows the following or a similar error:
[Text-only code block, 11 lines: error log content not preserved]
This may be due to incompatibility between the current operating system / components and Docker (e.g., DataFlux Func 2.x comes with Docker 20.10.8, which may have issues on the latest OS versions).
You can try the following solutions:
Upgrade DataFlux Func to the latest version. During the upgrade process, allow the installation script to also upgrade the Docker version. For details, see Deployment and Maintenance / Upgrade and Restart / Upgrade System.
If the latest version of DataFlux Func still shows this issue, you can download the official Docker binary package yourself and upgrade to a newer Docker version.
Visit the official Docker download page: https://download.docker.com/linux/static/stable/
Or the Alibaba Cloud mirror site: https://mirrors.aliyun.com/docker-ce/linux/static/stable/
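Packages on the official download page are grouped by CPU architecture and named after the Docker version. As an illustration only, the sketch below builds a download URL; the function name `docker_static_url` and the version `24.0.7` are example assumptions, and the actual installation steps depend on how Docker was installed on the host.

```bash
# Build the download URL for a static Docker package.
# $1: Docker version (example value only: 24.0.7)
# $2: architecture directory; defaults to `uname -m` (e.g. x86_64, aarch64)
docker_static_url() {
  local version="$1" arch="${2:-$(uname -m)}"
  printf 'https://download.docker.com/linux/static/stable/%s/docker-%s.tgz\n' "$arch" "$version"
}

# Example:
#   curl -fLO "$(docker_static_url 24.0.7)"
```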
For example, on Ubuntu, use the following commands to upgrade OS components:
[Bash code block, 3 lines: commands not preserved]
Unsupported System Page Size
The official Redis image may throw the error `<jemalloc>: Unsupported system page size` when starting on some ARM-based operating systems. Refer to:
- CSDN: Handling errors when starting the official Redis container on arm64
- GitHub issue: docker-library/redis/issues/208
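To see whether a host is likely to be affected, check the kernel page size. The sketch below assumes the error occurs when the page size is larger than the 4 KiB that jemalloc in the image was built for. This is an assumption based on the linked issue, so treat the result as a hint rather than a diagnosis. The function name `report_page_size` is hypothetical.

```bash
# Print the kernel page size in bytes and flag values above 4 KiB.
report_page_size() {
  local size
  size=$(getconf PAGESIZE)
  if [ "$size" -gt 4096 ]; then
    echo "page size ${size} bytes: larger than 4 KiB, the Redis image may reject it"
  else
    echo "page size ${size} bytes"
  fi
}
```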
2. Containers Missing in Docker Stack Environment
This issue is generally caused by an incorrect runtime environment.
Specific manifestations:
- Executing `sudo docker stack ls` shows `dataflux-func`.
- Executing `sudo docker ps -a` does not show the corresponding containers.
- Executing `sudo docker stack ps dataflux-func --no-trunc` reveals abnormal container statuses.
Possible causes and solutions:
| Possible Cause | Solution |
|---|---|
| Snap version of Docker installed on the system | Uninstall the Snap version of Docker, reinstall Docker from official sources, or use the Docker that comes with the script. |
| Other | Investigate based on the ERROR column in the output of `sudo docker stack ps dataflux-func --no-trunc`. |
A typical example is `no space left on device`, indicating insufficient disk space.
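Both causes can be checked roughly from the shell. The helpers below are a sketch with hypothetical names. They assume that a Snap-installed `docker` binary resolves to a path under `/snap/` and that Docker stores its data under the default `/var/lib/docker`.

```bash
# Return 0 if the given path looks like a Snap-installed binary.
is_snap_path() {
  case "$1" in
    /snap/*|/var/lib/snapd/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the used-space percentage (integer) of the filesystem holding PATH.
disk_use_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Examples:
#   is_snap_path "$(command -v docker)" && echo "Docker appears to come from Snap"
#   disk_use_percent /var/lib/docker
```

If Docker uses a non-default data directory, `sudo docker info --format '{{.DockerRootDir}}'` prints the path to check instead.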
3. Containers Failing to Start in k8s Environment
This issue is generally caused by host / k8s cluster problems.
The following error might appear in k8s:
[Text-only code block, 10 lines: error output not preserved]
The following error might appear in a specific Func service:
[Text-only code block, 6 lines: error output not preserved]
This is not a DataFlux Func issue. Please check the host / k8s cluster. If NAS is involved, also check for any issues with the NAS.