Troubleshooting / Containers Not Running Properly

1. Containers Restarting Repeatedly in Docker Stack Environment

This issue is usually caused by an incorrect configuration, firewall rules, or whitelist settings that block access to external services.

Specific manifestations:

  1. The page cannot be opened in a browser.
  2. Running sudo docker ps -a shows that the containers are restarting repeatedly.
  3. Running curl http://localhost:8088 on the deployment server returns the error curl: (7) Failed to connect to localhost port 8088: Connection refused.
  4. The log files continuously record error stack traces.
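
The second symptom can also be checked from a script. The sketch below filters a container list down to the containers whose status begins with Restarting. The helper name list_restarting is hypothetical, and the input format assumes docker ps is called with the --format template shown in the comment.

```bash
# Print the names of containers that are currently restarting.
# Expects "<name><TAB><status>" lines on stdin, for example from:
#   sudo docker ps -a --format '{{.Names}}\t{{.Status}}'
# list_restarting is an illustrative helper, not part of DataFlux Func.
list_restarting() {
  awk -F '\t' '$2 ~ /^Restarting/ { print $1 }'
}
```

For a one-off check, sudo docker ps -a --filter status=restarting gives a similar view without any post-processing.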

Possible causes and solutions:

| Possible Cause | Solution |
| --- | --- |
| Manual configuration changes contain errors | Check the modified configuration files for problems such as invalid YAML syntax or incorrect database connection information. |
| External servers are specified in the configuration but the network is unreachable | Check firewall settings, Alibaba Cloud security group configurations, database connection whitelist configurations, etc. |
| Compatibility issues with the operating system | See "Compatibility Issue with OS" below. |
| Redis does not support the current system's page size | See "Unsupported System Page Size" below. |
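
For the unreachable-network cause, a quick TCP probe from the deployment server can show whether an external service can be reached at all. This is a minimal sketch that relies on Bash's /dev/tcp redirection and the coreutils timeout command. check_tcp is a hypothetical helper, and the host and port in the usage comment are placeholders.

```bash
# Report whether a TCP connection to host:port can be opened within 3 seconds.
# check_tcp is an illustrative helper; it relies on Bash's /dev/tcp feature.
check_tcp() {
  local host="$1" port="$2"
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "reachable: ${host}:${port}"
  else
    echo "unreachable: ${host}:${port}"
    return 1
  fi
}

# Example (placeholder address of an external database):
#   check_tcp db.example.internal 3306
```

An unreachable result for a host that is known to be up usually points to a firewall, security group, or whitelist rule rather than to DataFlux Func itself.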

Compatibility Issue with OS

If docker logs {Server Container ID} shows the following or a similar error:

```text
node[9]: ../src/node_platform.cc:61:std::unique_ptr<long unsigned int> node::WorkerThreadsTaskRunner::DelayedTaskScheduler::Start(): Assertion `(0) == (uv_thread_create(t.get(), start_thread, this))' failed.
 1: 0xb57f90 node::Abort() [node]
 2: 0xb5800e  [node]
 3: 0xbc915e  [node]
 4: 0xbc9230 node::NodePlatform::NodePlatform(int, v8::TracingController*, v8::PageAllocator*) [node]
 5: 0xb1b3d1 node::InitializeOncePerProcess(int, char**, node::InitializationSettingsFlags, node::ProcessFlags::Flags) [node]
 6: 0xb1bc89 node::Start(int, char**) [node]
 7: 0x7f2ca389fd90  [/lib/x86_64-linux-gnu/libc.so.6]
 8: 0x7f2ca389fe40 __libc_start_main [/lib/x86_64-linux-gnu/libc.so.6]
 9: 0xa93f0e _start [node]
Aborted (core dumped)
```

This may be caused by an incompatibility between Docker and the current operating system or its components. For example, DataFlux Func 2.x ships with Docker 20.10.8, which may have issues on the latest OS versions.
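
To see whether the host is still running the bundled Docker version, the server version can be compared against 20.10.8. The sketch below assumes a sort command that supports the -V (version sort) option, as GNU coreutils does. version_le is a hypothetical helper.

```bash
# Succeed when version $1 is lower than or equal to version $2.
# version_le is an illustrative helper that relies on `sort -V` (version sort).
version_le() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Example: warn when the Docker server is not newer than 20.10.8.
#   current="$(sudo docker version --format '{{.Server.Version}}')"
#   version_le "$current" 20.10.8 && echo "Docker $current may need an upgrade"
```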

You can try the following solutions:

Upgrade DataFlux Func to the latest version. During the upgrade process, allow the installation script to also upgrade the Docker version. For details, see Deployment and Maintenance / Upgrade and Restart / Upgrade System.

If the latest version of DataFlux Func still shows the issue above, you can download the official Docker binary package yourself and upgrade to a newer Docker version.

Visit the official Docker download page: https://download.docker.com/linux/static/stable/

Or the Alibaba Cloud mirror site: https://mirrors.aliyun.com/docker-ce/linux/static/stable/
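
The static packages on both sites are organized by CPU architecture. The sketch below builds a download URL from the output of uname -m. It assumes the <arch>/docker-<version>.tgz layout used on the official download page. The helper name docker_tgz_url is hypothetical, and the version number in the example is a placeholder to replace with a release actually listed on the page.

```bash
# Build the URL of a static Docker package for the given version.
# docker_tgz_url is an illustrative helper; the path layout
# <base>/<arch>/docker-<version>.tgz is assumed from the download page.
docker_tgz_url() {
  local version="$1"
  local arch="${2:-$(uname -m)}"
  local base="${3:-https://download.docker.com/linux/static/stable}"
  echo "${base}/${arch}/docker-${version}.tgz"
}

# Example (the version is a placeholder; pick one listed on the page):
#   curl -fLO "$(docker_tgz_url 24.0.7)"
```

Passing the mirror address as the third argument produces the equivalent URL on the Alibaba Cloud mirror site.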

Upgrading the operating system components may also help. For example, on Ubuntu:

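```bash
# Refresh the package index
sudo apt update
# Upgrade installed packages to their latest versions
sudo apt upgrade
# Also apply upgrades that need to install or remove other packages
sudo apt dist-upgrade
```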

Unsupported System Page Size

On some ARM-based operating systems, the official Redis image may fail to start with the error <jemalloc>: Unsupported system page size.
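
This error generally appears when the kernel uses memory pages larger than the ones the jemalloc allocator inside the image was built for. As a quick check, the page size of the host can be read with getconf PAGESIZE. The sketch below uses 4096 bytes (4 KiB) as the threshold, which is an assumption about how the image was built rather than a documented value. describe_page_size is a hypothetical helper.

```bash
# Describe whether a page size (in bytes) exceeds 4 KiB.
# describe_page_size is an illustrative helper; the 4096-byte threshold
# assumes the image's jemalloc was built for 4 KiB pages.
describe_page_size() {
  local bytes="$1"
  if [ "$bytes" -gt 4096 ]; then
    echo "${bytes} bytes: larger than 4 KiB, may trigger the jemalloc error"
  else
    echo "${bytes} bytes: 4 KiB or smaller"
  fi
}

# Example: check the host the containers run on.
#   describe_page_size "$(getconf PAGESIZE)"
```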

2. Containers Missing in Docker Stack Environment

This issue is generally caused by an incorrect runtime environment.

Specific manifestations:

  1. Executing sudo docker stack ls shows dataflux-func.
  2. Executing sudo docker ps -a does not show the corresponding containers.
  3. Executing sudo docker stack ps dataflux-func --no-trunc reveals abnormal container statuses.

Possible causes and solutions:

| Possible Cause | Solution |
| --- | --- |
| Snap version of Docker installed on the system | Uninstall the Snap version of Docker, then either reinstall Docker from official sources or use the Docker bundled with the installation script. |
| Other | Investigate based on the ERROR column in the output of sudo docker stack ps dataflux-func --no-trunc. |

A typical example is "no space left on device", which indicates insufficient disk space.
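
Both the Snap case and the disk-space case can be checked from the shell. The sketch below assumes that a Snap-installed docker command resolves to a path under /snap/ and that disk usage is read from the POSIX output format of df. is_snap_path and df_used_percent are hypothetical helpers, and /var/lib/docker in the example is Docker's default data directory, which may differ on a given host.

```bash
# Succeed when the given executable path lives under /snap/.
is_snap_path() {
  case "$1" in
    /snap/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the used-space percentage (without the % sign) from the output of
# `df -P <path>` read on stdin; line 2 holds the file system's figures.
df_used_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Examples:
#   is_snap_path "$(command -v docker)" && echo "Docker was installed via Snap"
#   df -P /var/lib/docker | df_used_percent
```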

3. Containers Failing to Start in k8s Environment

This issue is generally caused by host / k8s cluster problems.

The following error might appear in k8s:

```text
Events:
  Type     Reason   Age                   From     Message
  ----     ------   ----                  ----     -------
  Warning  Failed   36m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Warning  Failed   36m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Warning  Failed   36m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Normal   Created  35m (x5 over 118d)    kubelet  Created container func-server
  Warning  Failed   35m                   kubelet  Error: failed to start container "func-server": Error response from daemon: OCI runtime create failed: container_linux.go:367: starting container process caused: process_linux.go:495: container init caused: rootfs_linux.go:60: mounting "/home/cce/kubelet/pods/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/volume-subpaths/user-config/func-server/1" to rootfs at "/home/cce/docker/overlay2/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/merged/data/user-config-template.yaml" caused: no such file or directory: unknown
  Normal   Pulled   33m (x6 over 118d)    kubelet  Container image "dataflux-func.com/dataflux-func:2.7.0" already present on machine
  Warning  BackOff  2m8s (x157 over 36m)  kubelet  Back-off restarting failed container
```
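
Events in this format are typically read from the output of kubectl describe pod. The sketch below counts the container start failures caused by a missing mount source in such output, which helps confirm that the pod is failing on storage rather than on the application. count_mount_failures is a hypothetical helper, and the pod and namespace names in the example are placeholders.

```bash
# Count event lines that report a failed container start caused by a
# missing mount source. Reads `kubectl describe pod` output on stdin.
# grep exits non-zero when nothing matches, so `|| true` keeps the
# helper usable in scripts that run with `set -e`.
count_mount_failures() {
  grep -c 'failed to start container.*no such file or directory' || true
}

# Example (placeholder pod and namespace):
#   kubectl describe pod <pod-name> -n <namespace> | count_mount_failures
```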

The following error might appear in a specific Func service:

```text
Traceback (most recent call last):
  File "get-config.py", line 11, in <module>
    CONFIG = yaml_resource.load_config(os.path.join(BASE_PATH, './config.yaml'))
  File "/usr/src/app/worker/utils/yaml_resources.py", line 83, in load_config
    user_config_content = _f.read()
OSError: [Errno 5] Input/output error
```

This is not a DataFlux Func issue: in both cases the container cannot mount or read its configuration file from the underlying storage. Check the host and the k8s cluster. If a NAS is involved, check the NAS for problems as well.
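
One way to confirm a storage problem outside of DataFlux Func is to read a file directly from the suspect mount, either on the host or inside the pod. The sketch below reports whether a file can be read from start to end. check_readable is a hypothetical helper, and the path in the example is a placeholder for the actual mounted configuration file.

```bash
# Report whether the file at the given path can be read completely.
# A failing read on a NAS-backed path points at the storage layer.
check_readable() {
  local path="$1"
  if cat -- "$path" > /dev/null 2>&1; then
    echo "ok: ${path}"
  else
    echo "read failed: ${path}"
    return 1
  fi
}

# Example (placeholder path of the mounted configuration file):
#   check_readable /path/to/mounted/user-config.yaml
```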