
Troubleshooting / Container Fails to Run Properly

1. Container Restarts Repeatedly in Docker Stack Environment

This issue is generally caused by incorrect configuration, firewall settings, or whitelist restrictions (for example, a database connection whitelist).

Specific manifestations:

  1. Unable to open the page using a browser.
  2. Running sudo docker ps -a shows that the container is restarting repeatedly.
  3. Using curl http://localhost:8088 on the deployment server returns the error curl: (7) Failed to connect to localhost port 8088: Connection refused.
  4. Error stack traces are continuously written to the log files.
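The symptoms above can be gathered in one pass with a few small helpers. This is a minimal sketch, not part of DataFlux Func: the function names and the `DOCKER` override variable are hypothetical, and port 8088 is taken from the example above.

```shell
#!/bin/sh
# DOCKER is a hypothetical override so the helpers can target another binary.
DOCKER="${DOCKER:-sudo docker}"

# List containers that Docker currently reports as restarting.
list_restarting() {
  $DOCKER ps -a --filter "status=restarting" --format '{{.ID}} {{.Names}} {{.Status}}'
}

# Print the last lines of a container's log (default: 50 lines).
tail_container_log() {
  $DOCKER logs --tail "${2:-50}" "$1" 2>&1
}

# Report whether anything accepts a connection on the given URL.
check_http() {
  if curl -sS -o /dev/null --max-time 5 "$1"; then
    echo "reachable: $1"
  else
    echo "unreachable: $1"
  fi
}

# Example:
#   list_restarting
#   tail_container_log <container-id>
#   check_http http://localhost:8088
```

If `list_restarting` prints a container, its log tail is usually the quickest way to tell which of the causes below applies.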

Possible causes and solutions:

| Possible Cause | Solution |
| --- | --- |
| Manual configuration changes contain errors | Check the modified configuration files for correctness, for example YAML syntax and database connection information. |
| Configuration specifies an external server, but the network is unreachable | Check firewall settings, Alibaba Cloud security group configurations, database connection whitelists, etc. |
| Compatibility issues with the operating system | See "Compatibility issues with the operating system" below. |
| Redis does not support the current system's page size | See "Redis does not support the current system's page size" below. |

Compatibility issues with the operating system

If docker logs {Server container ID} shows the following error or a similar one:

Text Only
node[9]: ../src/node_platform.cc:61:std::unique_ptr<long unsigned int> node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Start(): Assertion `(0) == (uv_thread_create(t.get(), start_thread, this))' failed.
 1: 0xb57f90 node::Abort() [node]
 2: 0xb5800e  [node]
 3: 0xbc915e  [node]
 4: 0xbc9230 node::NodePlatform::NodePlatform(int, v8::TracingController*, v8::PageAllocator*) [node]
 5: 0xb1b3d1 node::InitializeOncePerProcess(int, char**, node::InitializationSettingsFlags, node::ProcessFlags::Flags) [node]
 6: 0xb1bc89 node::Start(int, char**) [node]
 7: 0x7f2ca389fd90  [/lib/x86_64-linux-gnu/libc.so.6]
 8: 0x7f2ca389fe40 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
 9: 0xa93f0e _start [node]
Aborted (core dumped)

This may be due to incompatibility between the current operating system / components and Docker (e.g., DataFlux Func 2.x comes with Docker 20.10.8, which may have issues on the latest OS versions).

Try the following methods to resolve the issue:

Upgrade DataFlux Func to the latest version. During the upgrade process, please allow the installation script to also upgrade the Docker version. For details, see Deployment and Maintenance / Upgrade and Restart / Upgrade System.

If the latest version of DataFlux Func still has this issue, you can also download the official Docker binary package yourself and upgrade to a newer Docker version.

Download from the official Docker site: https://download.docker.com/linux/static/stable/

Or from the Alibaba Cloud mirror site: https://mirrors.aliyun.com/docker-ce/linux/static/stable/

For example, on Ubuntu, you can use the following commands to upgrade OS components:

Bash
sudo apt update
sudo apt upgrade
sudo apt dist-upgrade
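Before and after upgrading, it can help to confirm which Docker Engine version is actually running. The sketch below compares it with 20.10.8, the version mentioned above as bundled with DataFlux Func 2.x. The helper names are hypothetical, and the comparison assumes a `sort` that supports `-V` (version sort), as GNU coreutils does.

```shell
#!/bin/sh
# Succeeds when version $1 is lower than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lte() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Print the Docker Engine (server) version.
docker_server_version() {
  sudo docker version --format '{{.Server.Version}}'
}

# Example:
#   v="$(docker_server_version)"
#   if version_lte "$v" "20.10.8"; then
#     echo "Docker $v is 20.10.8 or older; an upgrade may be worth trying"
#   fi
```

A plain string comparison would sort 20.10.21 before 20.10.8, which is why version sort is used here.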

Redis does not support the current system's page size

The official Redis image may fail to start with the error <jemalloc>: Unsupported system page size on some ARM-based operating systems.
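To see whether the page size is a plausible cause, check the kernel's page size on the host. jemalloc reports this error when the system page size is larger than the one it was built for. The 4096-byte threshold below is an assumption for illustration, not a value taken from the Redis image, and `check_page_size` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the page size and flag values above 4 KiB.
# With no argument, the size is read from the running kernel via getconf.
# The 4096-byte threshold is an illustrative assumption.
check_page_size() {
  size="${1:-$(getconf PAGESIZE)}"
  echo "page size: ${size} bytes"
  if [ "$size" -gt 4096 ]; then
    echo "larger than 4 KiB: a jemalloc build targeting 4 KiB pages would reject this"
  fi
}

check_page_size
```

Some ARM64 distributions ship kernels with 16 KiB or 64 KiB pages, which is where this error tends to appear.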

2. Container Does Not Exist in Docker Stack Environment

This issue is generally caused by an incorrect runtime environment.

Specific manifestations:

  1. Executing sudo docker stack ls shows dataflux-func.
  2. Executing sudo docker ps -a does not show the corresponding container.
  3. Executing sudo docker stack ps dataflux-func --no-trunc reveals the container status is abnormal.

Possible causes and solutions:

| Possible Cause | Solution |
| --- | --- |
| Snap version of Docker is installed on the system | Uninstall the Snap version of Docker, then either reinstall Docker from official channels or use the Docker that comes with the installation script. |
| Other | Troubleshoot based on the ERROR column in the output of sudo docker stack ps dataflux-func --no-trunc. |

A typical example is no space left on device, indicating insufficient disk space.
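The two causes above can be checked from the shell. This is a rough sketch under stated assumptions: the helper names are hypothetical, the Snap check is a path-based heuristic only, and `/var/lib/docker` is Docker's default data directory, which may have been relocated on your host.

```shell
#!/bin/sh
# Succeeds when the given path looks like a Snap-installed binary.
is_snap_path() {
  case "$1" in
    /snap/*|/var/lib/snapd/snap/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report whether the docker binary found on PATH appears to come from Snap.
check_docker_origin() {
  path="$(command -v docker)" || { echo "docker not found on PATH"; return 1; }
  if is_snap_path "$path"; then
    echo "snap install: $path"
  else
    echo "non-snap install: $path"
  fi
}

# Example:
#   check_docker_origin
#   sudo docker stack ps dataflux-func --no-trunc --format '{{.Name}}: {{.CurrentState}}: {{.Error}}'
#   df -h /var/lib/docker    # free space on the default Docker data directory
```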

3. Container Fails to Start in k8s Environment

This issue is generally caused by host / k8s cluster problems.

The Pod events in k8s may show errors like the following:

Text Only
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Warning  Failed   36m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Warning  Failed   36m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Warning  Failed   36m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Normal   Created  35m (x5 over 118d)    kubelet  Created container func-server
  Warning  Failed   35m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Normal   Pulled   33m (x6 over 118d)    kubelet  Container image "dataflux-func.com/dataflux-func:2.7.0" already present on machine
  Warning  BackOff  2m8s (x157 over 36m)  kubelet  Back-off restarting failed container

The logs of a specific Func service may contain the following error:

Text Only
Traceback (most recent call last):
  File "get-config.py", line 11, in <module>
    CONFIG = yaml_resource.load_config(os.path.join(BASE_PATH, './config.yaml'))
  File "/usr/src/app/worker/utils/yaml_resources.py", line 83, in load_config
    user_config_content = _f.read()
OSError: [Errno 5] Input/output error

This is not a DataFlux Func issue. Please check the host / k8s cluster. If NAS storage is involved, also check the NAS itself for problems.
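When checking the cluster, the Pod events and the logs of the previous container instance are usually the first things to look at. The helpers below are a minimal sketch: the function names and the `NAMESPACE` variable are hypothetical, and the Pod name and file path are placeholders for the values in your cluster.

```shell
#!/bin/sh
# NAMESPACE is a placeholder; set it to the namespace DataFlux Func runs in.
NAMESPACE="${NAMESPACE:-default}"

# Show a Pod's description, including the Events section with mount failures.
pod_events() {
  kubectl -n "$NAMESPACE" describe pod "$1"
}

# Show the logs of the previous (crashed) container instance of a Pod.
previous_logs() {
  kubectl -n "$NAMESPACE" logs "$1" --previous
}

# Try to read a file and report failures such as Input/output errors.
can_read() {
  if cat "$1" > /dev/null 2>&1; then
    echo "readable: $1"
  else
    echo "read failed: $1"
  fi
}

# Example:
#   pod_events <pod-name>
#   previous_logs <pod-name>
#   can_read <path-to-mounted-config-file-on-the-host>
```

If `can_read` fails on the host for a file backed by NAS, the problem lies in the storage layer rather than in the container.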