Troubleshooting / Containers Not Running Properly
1. Containers Continuously Restarting in Docker Stack Environment
This issue is generally caused by incorrect configurations, firewalls, or various whitelist settings.
Specific manifestations include:
- Unable to open the page using a browser
- Running sudo docker ps -a to view the container list shows that the container keeps restarting
- Running curl http://localhost:8088 on the deployment server returns the error curl: (7) Failed to connect to localhost port 8088: Connection refused
- Error stack information is continuously output in the log files
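The restart symptom above can be checked with a small filter. This is a minimal sketch, not part of DataFlux Func: the helper name list_restarting is hypothetical, and it only assumes that Docker reports such containers with a status beginning with "Restarting".

```shell
# Print the names of containers whose status starts with "Restarting".
# Expects "<name> <status...>" lines on stdin (container names contain no spaces).
list_restarting() {
  awk '$2 == "Restarting" { print $1 }'
}

# Hypothetical usage on the deployment server:
#   sudo docker ps -a --format '{{.Names}} {{.Status}}' | list_restarting
```

An empty result means no container is currently in the restarting state; the log files are then the next place to look.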
Possible causes and solutions:
Possible Causes | Solutions |
---|---|
Manual configuration changes with errors | Check the modified configuration files, verify YAML syntax, database connection information, etc. |
External server specified in configuration but network is unreachable | Check firewall, Alibaba Cloud security group settings, database whitelist configurations, etc. |
Compatibility issues with the operating system | See the section Compatibility Issues with the Operating System below |
Redis does not support the current system's page size | See the section Redis Does Not Support the Current System's Page Size below |
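For the "network is unreachable" row, a quick TCP probe from the deployment server can show whether a firewall, security group, or whitelist is blocking the connection. The sketch below is illustrative only: check_tcp is a hypothetical helper, it relies on bash's /dev/tcp feature and the coreutils timeout command, and the host and port in the usage line are placeholders.

```shell
# Return 0 if a TCP connection to host ($1) and port ($2) succeeds within 3 seconds.
# Uses bash's built-in /dev/tcp pseudo-device, so no extra tools are required.
check_tcp() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Hypothetical usage (replace host and port with the values from your configuration):
#   check_tcp db.example.internal 3306 && echo reachable || echo "blocked or unreachable"
```

A failure here only shows that the port cannot be reached from this host; it does not say which of the firewall, security group, or whitelist is responsible.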
Compatibility Issues with the Operating System
If running docker logs {Server Container ID} shows the following or similar errors:
(Text Only block: 11 lines of error output, not preserved in this copy)
This may be due to incompatibility between the current operating system / components and Docker (e.g., DataFlux Func 2.x comes with Docker 20.10.8, which may have issues on the latest OS versions).
Possible solutions:
Upgrade DataFlux Func to the latest version. During the upgrade, allow the installation script to upgrade Docker as well. For details, refer to Deployment and Maintenance / Upgrade and Restart / Upgrade System
If the latest version of DataFlux Func still has the above issues, users can manually download the official Docker binary package to upgrade Docker.
Download from the official Docker site: https://download.docker.com/linux/static/stable/
Or from Alibaba Cloud mirror site: https://mirrors.aliyun.com/docker-ce/linux/static/stable/
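As a rough illustration of the manual download, the static packages on the official site are laid out as <architecture>/docker-<version>.tgz. The helper below only builds that URL; docker_static_url is a hypothetical name, the version must be chosen from the directory listing, and where the extracted binaries need to be copied depends on how Docker was installed on the host, which this document does not specify.

```shell
# Build the download URL for a static Docker package.
# $1: architecture directory (for example the output of `uname -m`)
# $2: Docker version, taken from the directory listing on the download site
docker_static_url() {
  printf 'https://download.docker.com/linux/static/stable/%s/docker-%s.tgz\n' "$1" "$2"
}

# Hypothetical usage:
#   url=$(docker_static_url "$(uname -m)" "<version>")
#   curl -fLO "$url" && tar xzf "${url##*/}"
#   # Then stop Docker, replace the installed binaries with the extracted
#   # docker/* files, and start Docker again.
```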
For example, on Ubuntu, use the following commands to upgrade OS components:
(Bash block: 3 lines of commands, not preserved in this copy)
Redis Does Not Support the Current System's Page Size
The official Redis image may fail to start on some ARM-based operating systems with the error <jemalloc>: Unsupported system page size. See:
- CSDN: Handling Redis Official Container Startup Errors on arm64
- GitHub issue: docker-library/redis/issues/208
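To see whether the host is likely affected, the kernel page size can be read with getconf. The sketch below assumes that the jemalloc in the image was built for 4 KiB pages, which is the usual explanation for this error but is not confirmed by this document; both helper names are hypothetical.

```shell
# Print the kernel page size of the current host in bytes.
page_size_bytes() {
  getconf PAGESIZE
}

# Describe whether a given page size (in bytes) makes the jemalloc error plausible.
# Assumption: the Redis image's jemalloc expects 4096-byte pages.
describe_page_size() {
  if [ "$1" -gt 4096 ]; then
    echo "page size $1 B: larger than 4 KiB, the jemalloc error is plausible"
  else
    echo "page size $1 B: 4 KiB or smaller, look for another cause"
  fi
}

# Hypothetical usage:
#   describe_page_size "$(page_size_bytes)"
```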
2. Containers Missing in Docker Stack Environment
This issue is generally caused by incorrect runtime environments.
Specific manifestations include:
- Executing sudo docker stack ls shows dataflux-func
- Executing sudo docker ps -a does not show the corresponding container
- Executing sudo docker stack ps dataflux-func --no-trunc reveals abnormal container status
Possible causes and solutions:
Possible Causes | Solutions |
---|---|
Snap version of Docker installed on the system | Uninstall Snap Docker, reinstall Docker from official sources, or use the Docker included in the script |
Others | Investigate based on the ERROR column in sudo docker stack ps dataflux-func --no-trunc |
A typical example is no space left on device, indicating insufficient disk space.
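Both rows of the table can be checked from the command line. The following is a sketch under stated assumptions: tasks_with_errors and is_snap_path are hypothetical helpers, the "|" separator is chosen only because it does not occur in task names, and /var/lib/docker is Docker's default data directory, which may differ on your host.

```shell
# Keep only the tasks that report an error.
# Expects "<task name>|<error text>" lines on stdin.
tasks_with_errors() {
  awk -F '|' '$2 != "" { print }'
}

# Return 0 if the given executable path points into a Snap installation.
is_snap_path() {
  case "$1" in
    /snap/*|/var/lib/snapd/snap/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical usage:
#   sudo docker stack ps dataflux-func --no-trunc --format '{{.Name}}|{{.Error}}' | tasks_with_errors
#   is_snap_path "$(command -v docker)" && echo "Docker appears to come from Snap"
#   df -h /var/lib/docker   # free space on the default Docker data directory
```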
3. Containers Failing to Start in k8s Environment
This issue is generally caused by host / k8s cluster problems.
Possible errors in k8s include:
(Text Only block: 10 lines of k8s error output, not preserved in this copy)
A Func service may have the following error:
(Text Only block: 6 lines of Func service error output, not preserved in this copy)
This is not a DataFlux Func issue. Please check the host / k8s cluster. If NAS is involved, also check if there are any issues with the NAS.
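When checking the cluster, a first pass is to list the pods that are not in a healthy phase and then inspect them. The filter below is a minimal sketch: unhealthy_pods is a hypothetical helper, it assumes the default column order of kubectl get pods (NAME READY STATUS RESTARTS AGE), and the namespace and pod names in the usage lines are placeholders.

```shell
# Print "<pod name> <status>" for pods that are neither Running nor Completed.
# Expects the output of `kubectl get pods --no-headers` on stdin.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Hypothetical usage:
#   kubectl get pods -n <namespace> --no-headers | unhealthy_pods
#   kubectl describe pod <pod> -n <namespace>
#   kubectl get events -n <namespace> --sort-by=.lastTimestamp
```

A pod can be Running and still not ready, so the READY column and the events are worth checking even when this filter prints nothing.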