Troubleshooting / Container Fails to Run Properly
1. Container Restarts Repeatedly in Docker Stack Environment
This issue is generally caused by incorrect configuration, firewall settings, or whitelist restrictions.
Specific manifestations:
- Unable to open the page using a browser.
- Checking the container list with `sudo docker ps -a` shows that the container is restarting repeatedly.
- Running `curl http://localhost:8088` on the deployment server returns the error `curl: (7) Failed to connect to localhost port 8088: Connection refused`.
- Error stack information is continuously output in the log files.
Possible causes and solutions:
| Possible Cause | Solution |
|---|---|
| Manual configuration changes contain errors | Check the modified configuration files for mistakes, such as invalid YAML syntax or incorrect database connection information. |
| Configuration specifies an external server, but the network is unreachable | Check firewall settings, Alibaba Cloud security group configurations, database connection whitelist, etc. |
| Compatibility issues with the operating system | See "Compatibility issues with the operating system" below. |
| Redis does not support the current system's page size | See "Redis does not support the current system's page size" below. |
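
The first two causes in the table often come down to whether a TCP port can be reached at all. The following is a minimal sketch, not part of DataFlux Func: `check_port` and `report_port` are hypothetical helper names, and the sketch assumes `bash` and the coreutils `timeout` command are available on the deployment server.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Relies on bash's /dev/tcp redirection, invoked through an explicit `bash -c`.
check_port() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print a one-line, human-readable result for host:port.
report_port() {
  if check_port "$1" "$2"; then
    echo "reachable: $1:$2"
  else
    echo "unreachable: $1:$2"
  fi
}

# Check the local service port mentioned above.
report_port localhost 8088
```

The same helper can be pointed at an external database or Redis server named in the configuration (host and port are whatever your configuration specifies). An `unreachable` result for such a server suggests checking the firewall, security group, or whitelist before looking at the application itself.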
Compatibility issues with the operating system
If `docker logs {Server container ID}` reveals the following or a similar error:
[Error output not preserved in the source.]
This may be due to incompatibility between the current operating system / components and Docker (e.g., DataFlux Func 2.x comes with Docker 20.10.8, which may have issues on the latest OS versions).
You can try the following methods to resolve this:
- Upgrade DataFlux Func to the latest version. During the upgrade, allow the installation script to upgrade Docker as well. For details, see Deployment and Maintenance / Upgrade and Restart / Upgrade System.
- If the latest version of DataFlux Func still has the above issue, download the official Docker binary package yourself and upgrade to a newer Docker version:
    - Official Docker site: https://download.docker.com/linux/static/stable/
    - Alibaba Cloud mirror site: https://mirrors.aliyun.com/docker-ce/linux/static/stable/
For example, on Ubuntu, you can use the following commands to upgrade OS components:
[Commands not preserved in the source.]
Redis does not support the current system's page size
The official Redis image may encounter the `<jemalloc>: Unsupported system page size` error when starting on some ARM-based operating systems. See:
- CSDN: Handling errors when starting the official Redis container on arm64
- GitHub issue: docker-library/redis/issues/208
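
A quick way to see whether this may apply is to look at the kernel page size. The sketch below assumes that the error appears when the system page size is larger than the one the image's jemalloc was built for. The 4096-byte threshold is an assumption (a common build-time value), not something stated in this document, and `describe_page_size` is a hypothetical helper name.

```shell
# Classify a page size (in bytes) against the assumed 4 KiB threshold.
describe_page_size() {
  local size="$1"
  if [ "$size" -gt 4096 ]; then
    echo "page size ${size} bytes: larger than 4 KiB"
  else
    echo "page size ${size} bytes: not larger than 4 KiB"
  fi
}

# Query the running system's page size and classify it.
describe_page_size "$(getconf PAGESIZE)"
```

If the reported page size is larger than 4 KiB on an ARM host, the references above are the place to look for workarounds.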
2. Container Does Not Exist in Docker Stack Environment
This issue is generally caused by an incorrect runtime environment.
Specific manifestations:
- Executing `sudo docker stack ls` shows `dataflux-func`.
- Executing `sudo docker ps -a` does not show the corresponding container.
- Executing `sudo docker stack ps dataflux-func --no-trunc` reveals that the container status is abnormal.
Possible causes and solutions:
| Possible Cause | Solution |
|---|---|
| Snap version of Docker is installed on the system | Uninstall the Snap version of Docker, then either reinstall Docker from official channels or use the Docker bundled with the installation script. |
| Other | Troubleshoot based on the ERROR column in the output of `sudo docker stack ps dataflux-func --no-trunc`. |
A typical example is `no space left on device`, indicating insufficient disk space.
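
To confirm a disk space problem, the usage of the filesystem holding Docker's data can be checked. This is a minimal sketch with several assumptions: Docker uses its default data root `/var/lib/docker` (the script falls back to `/` if that directory is absent), the 90% warning threshold is arbitrary, and `disk_use_percent` is a hypothetical helper name.

```shell
# Print the used-space percentage (integer, without "%") of the filesystem
# that contains the given path, using POSIX-format df output.
disk_use_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Assumed default Docker data root; fall back to the root filesystem.
data_root="/var/lib/docker"
[ -d "$data_root" ] || data_root="/"

used="$(disk_use_percent "$data_root")"
echo "${data_root}: ${used}% used"

# Arbitrary threshold chosen for illustration only.
if [ "$used" -ge 90 ]; then
  echo "warning: low disk space on the filesystem containing ${data_root}"
fi
```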
3. Container Fails to Start in k8s Environment
This issue is generally caused by host / k8s cluster problems.
The following error may exist in k8s:
[Error output not preserved in the source.]
The following error may exist in a specific Func service:
[Error output not preserved in the source.]
This is not a DataFlux Func issue. Please check the host / k8s cluster. If NAS is involved, also check the NAS for problems.