Configuration Manual for the "Volcengine - Cloud Monitoring" Collector

Before reading this, please read the following first:

Before using this collector, you must install the 'Integration Core Package' and its associated third-party dependency packages.

This collector supports multi-threading by default (with five threads enabled). If you need to change the thread pool size, set the environment variable COLLECTOR_THREAD_POOL_SIZE.
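
As a rough illustration of how such a setting is usually resolved, the sketch below reads COLLECTOR_THREAD_POOL_SIZE and falls back to the default of five threads stated above. The function name `resolve_thread_pool_size` and the fallback rules for invalid values are assumptions made for this example, not the collector's actual implementation.

```python
import os

DEFAULT_THREAD_POOL_SIZE = 5  # default stated in this manual


def resolve_thread_pool_size(environ=None):
    """Return the pool size from COLLECTOR_THREAD_POOL_SIZE, or the default.

    Hypothetical helper: the real collector may parse the variable differently.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get('COLLECTOR_THREAD_POOL_SIZE', '')
    try:
        size = int(raw)
    except (TypeError, ValueError):
        # Unset or non-numeric value: use the default.
        return DEFAULT_THREAD_POOL_SIZE
    # Assumption: zero or negative values are treated as invalid.
    return size if size > 0 else DEFAULT_THREAD_POOL_SIZE
```

For instance, `resolve_thread_pool_size({'COLLECTOR_THREAD_POOL_SIZE': '2'})` yields 2, while an empty environment yields 5.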

1. Configuration Structure

The configuration structure of this collector is as follows:

| Field | Type | Required | Description |
|---|---|---|---|
| targets | list | Required | List of cloud monitoring collection target configurations. Multiple configurations with the same namespace are combined with a logical "AND". |
| targets[#].namespace | str | Required | Namespace of the cloud monitoring data to be collected, for example 'VCM_ECS'. Refer to the appendix for a complete list. |
| targets[#].subnamespace | str | Required | Subnamespace of the cloud monitoring data to be collected, for example 'Instance'. Refer to the appendix for a complete list. |
| targets[#].metrics | list | Required | List of names of the cloud monitoring metrics to be collected. Refer to the appendix for a complete list. |
| targets[#].metrics[#] | str | Required | Metric name |
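
The structure above can be checked before a task is deployed. The following is a minimal sketch of such a check; `validate_collector_configs` is a hypothetical helper written for this example (it is not part of the collector), and treating empty lists or empty strings as errors is an assumption derived from the "Required" column.

```python
def validate_collector_configs(configs):
    """Return a list of problems found in a collector_configs dict (empty if none)."""
    targets = configs.get('targets') if isinstance(configs, dict) else None
    if not isinstance(targets, list) or not targets:
        return ["'targets' must be a non-empty list"]

    problems = []
    for i, target in enumerate(targets):
        if not isinstance(target, dict):
            problems.append(f'targets[{i}] must be a dict')
            continue
        # namespace and subnamespace are both required strings.
        for key in ('namespace', 'subnamespace'):
            value = target.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f'targets[{i}].{key} must be a non-empty str')
        # metrics is a required list of metric names (str).
        metrics = target.get('metrics')
        if not isinstance(metrics, list) or not metrics:
            problems.append(f'targets[{i}].metrics must be a non-empty list')
        elif not all(isinstance(m, str) for m in metrics):
            problems.append(f'targets[{i}].metrics must contain only str values')
    return problems
```

An empty return value means the dictionary matches the table; each returned string names the offending field using the same `targets[#]` notation.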

2. Configuration Examples

Specifying Specific Metrics

Collect the three metrics CpuSystem, CpuTotal, and MemoryUsedSpace from ECS.

Python
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['CpuSystem', 'CpuTotal', 'MemoryUsedSpace'],
        },
    ]
}

Matching Metrics with Wildcards

Metric names can use the * wildcard for matching.

In this example, the following metrics will be collected:

  • Metrics named MemoryUsedSpace
  • Metrics starting with Cpu
  • Metrics ending with Connection
  • Metrics containing Used
Python
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['MemoryUsedSpace', 'Cpu*', '*Connection', '*Used*'],
        },
    ],
}
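
To see which names a given set of patterns would select, the wildcard rules can be approximated with Python's standard `fnmatch` module. This is only a sketch: the collector's own matching code is not shown in this manual, `fnmatch` also understands `?` and `[...]` (which the collector may not), and the metric names in `available` are made up for illustration.

```python
from fnmatch import fnmatchcase


def match_metrics(available, patterns):
    """Keep the metric names that match at least one pattern (case-sensitive)."""
    return [
        name for name in available
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical metric names, used only to exercise the patterns.
available = [
    'MemoryUsedSpace', 'CpuTotal', 'CpuSystem',
    'NetTcpConnection', 'DiskUsedRate', 'LoadAverage',
]
selected = match_metrics(
    available, ['MemoryUsedSpace', 'Cpu*', '*Connection', '*Used*'])
```

Here `selected` contains every name except `LoadAverage`, which matches none of the four patterns.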

Excluding Certain Metrics

Adding "NOT" as the first element of the metrics list excludes all metrics matched by the entries that follow it.

In this example, the following metrics will not be collected:

  • Metrics named MemoryUsedSpace
  • Metrics starting with Cpu
  • Metrics ending with Connection
  • Metrics containing Used
Python
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['NOT', 'MemoryUsedSpace', 'Cpu*', '*Connection', '*Used*'],
        },
    ],
}
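
The inclusion and exclusion forms can be expressed as one rule: a leading "NOT" inverts the match. The sketch below assumes exactly that reading; `apply_metric_rule` is a hypothetical name, the matching is approximated with `fnmatch`, and the sample metric names are invented.

```python
from fnmatch import fnmatchcase


def apply_metric_rule(names, rule):
    """Apply one 'metrics' list to a list of metric names.

    If the rule starts with 'NOT', names matching the remaining patterns are
    dropped; otherwise only names matching the patterns are kept.
    """
    exclude = bool(rule) and rule[0] == 'NOT'
    patterns = rule[1:] if exclude else rule

    def matched(name):
        return any(fnmatchcase(name, pattern) for pattern in patterns)

    # Keep matched names for an include rule, unmatched names for a NOT rule.
    return [name for name in names if matched(name) != exclude]


# Hypothetical metric names for illustration.
names = ['MemoryUsedSpace', 'CpuTotal', 'NetTcpConnection', 'DiskUsedRate', 'LoadAverage']
remaining = apply_metric_rule(
    names, ['NOT', 'MemoryUsedSpace', 'Cpu*', '*Connection', '*Used*'])
```

With this input, only `LoadAverage` survives the exclusion rule.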

Multi-level Filtering for Desired Metrics

The same namespace can be specified multiple times, filtering metrics sequentially from top to bottom based on metric names.

This example is equivalent to applying the following filtering steps to the metric names:

  1. Select all metrics whose names contain Cpu
  2. Exclude metrics named CpuTotal from the results of the previous step
Python
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['*Cpu*'],
        },
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['NOT', 'CpuTotal'],
        },
    ],
}
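
The top-to-bottom behaviour can be sketched as a loop that feeds the result of each target into the next one. This is an illustrative model under the assumption that targets with the same namespace and subnamespace are chained in list order; `filter_metrics` is a hypothetical helper, not the collector's code.

```python
from fnmatch import fnmatchcase


def filter_metrics(names, targets, namespace, subnamespace):
    """Apply every matching target's 'metrics' rule in order (logical AND)."""
    selected = list(names)
    for target in targets:
        if (target['namespace'], target['subnamespace']) != (namespace, subnamespace):
            continue  # rule belongs to another namespace
        rule = target['metrics']
        exclude = bool(rule) and rule[0] == 'NOT'
        patterns = rule[1:] if exclude else rule
        # Each step only sees what the previous step left over.
        selected = [
            name for name in selected
            if any(fnmatchcase(name, p) for p in patterns) != exclude
        ]
    return selected


targets = [
    {'namespace': 'VCM_ECS', 'subnamespace': 'Instance', 'metrics': ['*Cpu*']},
    {'namespace': 'VCM_ECS', 'subnamespace': 'Instance', 'metrics': ['NOT', 'CpuTotal']},
]
result = filter_metrics(
    ['CpuTotal', 'CpuSystem', 'MemoryUsedSpace'], targets, 'VCM_ECS', 'Instance')
```

The first target keeps `CpuTotal` and `CpuSystem`; the second removes `CpuTotal`, leaving `CpuSystem`.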

Configuring Filters (Optional)

This collector script supports user-defined filters, which let users select target resources based on object attributes. A filter function receives the target object and returns True or False:

  • True: The target resource needs to be collected.
  • False: The target resource does not need to be collected.

When custom object collection is enabled, more object attributes will be supported for filtering. Refer to the corresponding product's custom object collector documentation for details (supported...).

Python
# Example: enable a filter that selects objects by InstanceId and Status.

from guance_integration__runner import Runner
import guance_volcengine_monitor__main as main

def filter_instance(instance, namespace='VCM_ECS'):
    '''
    Collect metrics only for instances whose InstanceId is i-xxxxxa or i-xxxxxb
    and whose Status is RUNNING.
    '''
    instance_id = instance['tags'].get('InstanceId')
    status = instance['tags'].get('Status')
    return instance_id in ['i-xxxxxa', 'i-xxxxxb'] and status in ['RUNNING']

# `account` and `collector_configs` must be defined earlier in the script.
@DFF.API('Volcengine-monitor', timeout=3600, fixed_crontab="*/5 * * * *")
def run():
    Runner(main.DataCollector(account, collector_configs, filters=[filter_instance])).run()

When configuring multiple filters under the same namespace, only resources that meet all filter conditions will be reported.
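
The "all filters must pass" rule amounts to a logical AND over the filter functions. The sketch below models it with `all()`; `should_collect` and the two sample filters are hypothetical and only mirror the filter signature used in this manual.

```python
def filter_by_id(instance, namespace='VCM_ECS'):
    """Pass only the two listed instance IDs."""
    return instance['tags'].get('InstanceId') in ['i-xxxxxa', 'i-xxxxxb']


def filter_by_status(instance, namespace='VCM_ECS'):
    """Pass only running instances."""
    return instance['tags'].get('Status') == 'RUNNING'


def should_collect(instance, filters):
    """A resource is reported only if every filter returns True."""
    return all(f(instance) for f in filters)


filters = [filter_by_id, filter_by_status]
running = {'tags': {'InstanceId': 'i-xxxxxa', 'Status': 'RUNNING'}}
stopped = {'tags': {'InstanceId': 'i-xxxxxa', 'Status': 'STOPPED'}}
```

`should_collect(running, filters)` is True, while `should_collect(stopped, filters)` is False because the status filter rejects it even though the ID filter passes.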

3. Data Reporting Format

After data synchronization, the data can be viewed in the 'Metrics' section of TrueWatch.

For example, consider the following collector configuration:

Python
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['CpuTotal', 'MemoryUsedSpace'],
        },
    ],
}

Example of reported data:

JSON
{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "ResourceID": "i-xxxx"
  },
  "fields": {
    "CpuTotal"       : 1.23,
    "MemoryUsedSpace": 1.23
  }
}

All metric values are reported as float types.
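
As a rough model of the reported shape, the sketch below assembles one data point and casts every value to float. `build_point` is a hypothetical helper; the "volcengine_" plus namespace naming is inferred from the example above rather than from the collector's source.

```python
def build_point(namespace, resource_id, raw_values):
    """Assemble one data point in the reported format, with float field values."""
    return {
        # Assumption: measurement name is 'volcengine_' + namespace, as in the example.
        'measurement': f'volcengine_{namespace}',
        'tags': {'ResourceID': resource_id},
        # Integers and numeric strings are normalized to float.
        'fields': {name: float(value) for name, value in raw_values.items()},
    }


point = build_point('VCM_ECS', 'i-xxxx', {'CpuTotal': 1, 'MemoryUsedSpace': '1.23'})
```

In `point`, `CpuTotal` becomes `1.0` and `MemoryUsedSpace` becomes `1.23`, so both fields are floats regardless of the input type.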

4. Coordination with Custom Object Collectors

When other custom object collectors (such as ECS or MySQL) run within the same DataFlux Func, this collector automatically attempts to match fields such as tags.ResourceID against the tags.name field of the custom objects.

Because the cloud monitoring collector needs the custom object information before it can perform this matching, it is generally recommended to place the cloud monitoring collector at the end of the collector list, for example:

Python
# Create collectors: custom object collector first, cloud monitoring collector last
collectors = [
  main.DataCollector(account, collector_configs),
  monitor_main.DataCollector(account, monitor_configs),
]

Upon successful matching, all fields except name from the matched custom object tags will be added to the tags of the monitoring data, thereby achieving effects such as filtering cloud monitoring metric data using instance names. Specific effects are as follows:

Assume the original data collected by cloud monitoring is as follows:

JSON
{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "ResourceID": "i-xxxx",
    "{other fields}": "{omitted}"
  },
  "fields": {
    "{metric}": "{metric value}"
  }
}

At the same time, the custom object data collected by the Volcengine ECS collector is as follows:

JSON
{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "name"      : "i-xxxx",
    "InstanceId": "i-xxxx",
    "RegionId"  : "cn-shanghai",
    "{other fields}": "{omitted}"
  },
  "fields": {
    "{other fields}": "{omitted}"
  }
}

Thus, the final cloud monitoring data reported is as follows:

JSON
{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "InstanceId": "i-xxxx",
    "RegionId"  : "cn-shanghai",
    "{other fields}": "{omitted}"
  },
  "fields": {
    "{metric}": "{metric value}"
  }
}
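
The matching and merging described in this section can be sketched as a lookup by tags.name followed by a tag merge that skips name. `enrich_point` is a hypothetical function written for this example, and letting the monitoring data's own tags win on a key conflict is an assumption; the collector may resolve conflicts differently.

```python
def enrich_point(point, custom_objects):
    """Copy tags (except 'name') from the custom object whose tags.name
    equals the point's tags.ResourceID. Unmatched points are returned as-is."""
    tags_by_name = {obj['tags']['name']: obj['tags'] for obj in custom_objects}
    matched = tags_by_name.get(point['tags'].get('ResourceID'))
    if matched is None:
        return point
    extra = {key: value for key, value in matched.items() if key != 'name'}
    # Assumption: existing monitoring tags take precedence over copied ones.
    return {**point, 'tags': {**extra, **point['tags']}}


point = {
    'measurement': 'volcengine_VCM_ECS',
    'tags': {'ResourceID': 'i-xxxx'},
    'fields': {'CpuTotal': 1.23},
}
custom_objects = [
    {'tags': {'name': 'i-xxxx', 'InstanceId': 'i-xxxx', 'RegionId': 'cn-shanghai'}},
]
enriched = enrich_point(point, custom_objects)
```

After the call, `enriched['tags']` holds `ResourceID`, `InstanceId`, and `RegionId`, which is what makes it possible to filter metric data by attributes that only the custom object collector knows.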

5. Cloud Monitoring API Call Rate Limitations

  1. Volcengine rate-limits GetMetricData API calls: a main account and its IAM sub-accounts can call the GetMetricData interface no more than 20 times per second; exceeding this triggers rate limiting. (Currently, exceeding the limit only results in throttling and incurs no charges.)

  2. Avoiding rate limiting when multiple auto-triggered tasks run simultaneously: because the collector uses multi-threaded collection by default and several auto-triggered tasks may run concurrently, it is easy to hit the API rate limit. Two recommendations help mitigate this:

     • Set a smaller value for the environment variable COLLECTOR_THREAD_POOL_SIZE.

     • Delay auto-triggered tasks so that their API calls are staggered rather than issued at the same moment. Add the delayed_crontab parameter to the function decorator DFF.API(xxx); it specifies the delay in seconds. For example, the following configuration runs the task every minute, at the 5th second:
    Python
    @DFF.API('Volcengine-ECS Collection', timeout=3600, fixed_crontab='* * * * *', delayed_crontab=5)
    def run():
        collectors = [
            main.DataCollector(account, collector_configs),
            monitor_main.DataCollector(account, monitor_configs),
        ]
        Runner(collectors,guance_id='observer').run()
    

Note: The above suggestions should be adjusted based on the task execution duration to find suitable parameters.
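
One simple way to choose delay values for several tasks is to spread them evenly across the scheduling interval. The sketch below computes such offsets; `stagger_delays` and the task names are hypothetical, and the resulting numbers are only candidate values to pass as delayed_crontab, to be tuned against the actual task durations.

```python
def stagger_delays(task_names, window_seconds=60):
    """Spread tasks evenly over the window; returns {task name: delay in seconds}."""
    step = window_seconds // max(len(task_names), 1)
    return {name: index * step for index, name in enumerate(task_names)}


# Three hypothetical tasks triggered by the same every-minute crontab.
delays = stagger_delays(['ecs', 'rds', 'redis'])
```

With three tasks and a 60-second window, the tasks start 0, 20, and 40 seconds into each minute, so their bursts of GetMetricData calls are less likely to overlap.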

X. Appendix

Please refer to the official Volcengine documentation: