Collector Configuration Manual for «Tencent Cloud - Cloud Monitor»

Before reading this article, please read the following first:

Before using this collector, you must install the «Integration Core Package» and its accompanying third-party dependency packages

To collect Tencent Cloud Cloud Monitor data, you must first configure custom object collectors for corresponding products

This collector supports multi-threading by default (five threads are enabled by default). If you need to change the thread pool size, set the environment variable COLLECTOR_THREAD_POOL_SIZE
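
As a rough illustration of this setting, the sketch below reads COLLECTOR_THREAD_POOL_SIZE with a fallback of five and sizes a standard-library thread pool accordingly. The helper names `resolve_pool_size` and `run_in_pool`, and the fallback behavior for invalid values, are assumptions made for illustration; this manual does not show how the collector itself reads the variable.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_POOL_SIZE = 5  # default stated in this manual

def resolve_pool_size(environ=os.environ):
    """Return the pool size from COLLECTOR_THREAD_POOL_SIZE (hypothetical helper)."""
    raw = environ.get('COLLECTOR_THREAD_POOL_SIZE', '')
    try:
        size = int(raw)
    except ValueError:
        return DEFAULT_POOL_SIZE  # unset or not a number: keep the default
    return size if size > 0 else DEFAULT_POOL_SIZE

def run_in_pool(tasks, environ=os.environ):
    """Run zero-argument callables in a thread pool of the configured size."""
    with ThreadPoolExecutor(max_workers=resolve_pool_size(environ)) as pool:
        return list(pool.map(lambda task: task(), tasks))
```

In practice the variable would be set in the environment of the process that runs the collector, before the collector starts.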

1. Configuration Structure

The configuration structure of this collector is as follows:

Field | Type | Required | Description
regions | list | Required | List of cloud monitoring regions to be collected
regions[#] | str | Required | Region ID, such as ap-shanghai. Refer to the appendix for the complete list
targets | list | Required | List of cloud monitoring target configurations. Multiple configurations with the same namespace are combined with a logical «AND»
targets[#].namespace | str | Required | Namespace of the cloud monitoring data to be collected, for example QCE/CVM. Refer to the appendix for the complete list
targets[#].metrics | list | Required | List of cloud monitoring metric names to be collected. Refer to the appendix for the complete list
targets[#].metrics[#] | str | Required | Metric name pattern; supports "NOT" and wildcard matching. Normally, multiple patterns are combined with a logical «OR»; when a "NOT" marker is included, they are combined with a logical «AND». See details below

2. Configuration Example

Specifying Specific Metrics

Collect two metrics named WanOuttraffic and WanOutpkg from QCE/CVM

Python
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['WanOuttraffic', 'WanOutpkg'],
        }
    ]
}

Wildcard Matching Metrics

Metric names can use * wildcards for matching.

In this example, the following metrics will be collected:

  • Metrics named WanOutpkg
  • Metrics whose names start with Wan
  • Metrics whose names end with Outpkg
  • Metrics whose names contain Out
Python
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['WanOutpkg', 'Wan*', '*Outpkg', '*Out*']
        }
    ]
}

Excluding Certain Metrics

Placing "NOT" as the first element of the list excludes every metric matched by the patterns that follow it.

In this example, the following metrics will NOT be collected:

  • Metrics named WanOutpkg
  • Metrics whose names start with Wan
  • Metrics whose names end with Outpkg
  • Metrics whose names contain Out
Python
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['NOT', 'WanOutpkg', 'Wan*', '*Outpkg', '*Out*']
        }
    ]
}
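
The selection rules for a single metrics list can be sketched as follows. This is a minimal illustration, assuming case-sensitive shell-style matching as provided by Python's `fnmatch.fnmatchcase` (which also accepts `?` and `[...]`, not mentioned in this manual); `select_metrics` is a hypothetical helper, not the collector's actual matcher.

```python
from fnmatch import fnmatchcase

def select_metrics(available, patterns):
    """Apply one 'metrics' list to the available metric names (illustrative only)."""
    if patterns and patterns[0] == 'NOT':
        excluded = patterns[1:]
        # With "NOT", a name is kept only if it matches none of the patterns («AND»).
        return [n for n in available
                if not any(fnmatchcase(n, p) for p in excluded)]
    # Without "NOT", a name is kept if it matches any pattern («OR»).
    return [n for n in available
            if any(fnmatchcase(n, p) for p in patterns)]

cvm_metrics = ['WanOutpkg', 'WanIntraffic', 'LanOutpkg', 'AccOuttraffic', 'MemUsage']
included = select_metrics(cvm_metrics, ['WanOutpkg', 'Wan*', '*Outpkg', '*Out*'])
remaining = select_metrics(cvm_metrics, ['NOT', 'WanOutpkg', 'Wan*', '*Outpkg', '*Out*'])
print(included)
print(remaining)
```

With this sample list, only MemUsage survives the "NOT" configuration, because every other name matches at least one of the excluded patterns.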

Multiple Filters for Specified Metrics

The same namespace can be specified multiple times, filtering metrics sequentially from top to bottom.

In this example, it is equivalent to applying the following filtering steps to the metric names:

  1. Select all metrics whose names contain Out

  2. In the results of the previous step, exclude metrics named WanOutpkg

Python
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['*Out*']
        },
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['NOT', 'WanOutpkg']
        }
    ]
}
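
The top-to-bottom filtering described above can be sketched as a chain of filters, one per target entry. This is only an illustration under the assumption of `fnmatch`-style matching; `matches` and `metrics_for_namespace` are hypothetical helpers.

```python
from fnmatch import fnmatchcase

def matches(name, patterns):
    """True if the name passes one 'metrics' list (a leading 'NOT' inverts the match)."""
    negate = bool(patterns) and patterns[0] == 'NOT'
    hit = any(fnmatchcase(name, p) for p in (patterns[1:] if negate else patterns))
    return not hit if negate else hit

def metrics_for_namespace(available, targets, namespace):
    """Keep names that pass every target entry declared for the namespace, in order."""
    selected = list(available)
    for target in targets:
        if target['namespace'] == namespace:
            selected = [n for n in selected if matches(n, target['metrics'])]
    return selected

targets = [
    {'namespace': 'QCE/CVM', 'metrics': ['*Out*']},
    {'namespace': 'QCE/CVM', 'metrics': ['NOT', 'WanOutpkg']},
]
cvm_metrics = ['WanOutpkg', 'WanOuttraffic', 'LanOutpkg', 'MemUsage']
result = metrics_for_namespace(cvm_metrics, targets, 'QCE/CVM')
print(result)
```

Because each entry narrows the result of the previous one, the order of entries matters only for readability here; the final set is the intersection of all entries for that namespace.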

Configuring Filters (Optional)

This collector script supports user-defined filters, allowing users to screen target resources based on object attributes. The filter function returns True or False.

  • True: Target resources need to be collected.
  • False: Target resources do not need to be collected.

The attributes available for filtering are the same as the attribute data of the corresponding custom objects, such as cloud servers (CVM), cloud databases (CDB, Redis, MongoDB), load balancers (CLB), and object storage (COS). For more details, refer to the Tencent Cloud custom object collector documentation.

Python
# Example: enable a filter based on the object's name and RegionId

from guance_integration__runner import Runner
import guance_tencentcloud_monitor__main as main

def filter_instance(instance, namespace='QCE/COS'):
    '''
    Collect metrics only for objects named smart-xxxxa or smart-xxxxb in region ap-nanjing
    '''
    instance_name = instance['name']
    region_id = instance['RegionId']
    # Return True to collect the resource, False to skip it
    return instance_name in ['smart-xxxxa', 'smart-xxxxb'] and region_id in ['ap-nanjing']

def run():
    # account and collector_configs are expected to be defined elsewhere in the script
    Runner(main.DataCollector(account, collector_configs, filters=[filter_instance])).run()

When configuring multiple filters under the same namespace, all filters must be satisfied simultaneously for data to be reported
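
The "all filters must be satisfied" rule amounts to a logical AND over the filter results. The sketch below illustrates it with two hypothetical filter functions; how the collector associates filters with a namespace is not shown in this manual, so the `namespace` argument is carried along but not used for dispatch here.

```python
def name_filter(instance, namespace='QCE/COS'):
    """Hypothetical filter: keep objects whose name starts with 'smart-'."""
    return instance['name'].startswith('smart-')

def region_filter(instance, namespace='QCE/COS'):
    """Hypothetical filter: keep objects located in ap-nanjing."""
    return instance['RegionId'] == 'ap-nanjing'

def passes_filters(instance, filters):
    """An object is collected only if every filter returns True."""
    return all(f(instance) for f in filters)

instances = [
    {'name': 'smart-xxxxa', 'RegionId': 'ap-nanjing'},
    {'name': 'smart-xxxxb', 'RegionId': 'ap-shanghai'},
    {'name': 'other',       'RegionId': 'ap-nanjing'},
]
kept = [i['name'] for i in instances if passes_filters(i, [name_filter, region_filter])]
print(kept)
```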

3. Data Collection Instructions

Cloud Product Configuration Information

Product Name | Namespace | Dimension | Description
Cloud Servers | QCE/CVM | InstanceId | vm_uuid, vmUuid, uuid, and InstanceId are all recognized as the object data's InstanceId
Cloud Database MySQL | QCE/CDB | InstanceId, InstanceType |
Object Storage Monitoring | QCE/COS | BucketName |
Public Load Balancer Monitoring | QCE/LB_PUBLIC | vip | The Address field in the object data is recognized as vip
Private Load Balancer Monitoring | QCE/LB_PRIVATE | vip, vpcId |
Cloud Database Redis | QCE/REDIS_MEM | InstanceId | Only Redis instance-level monitoring is currently supported; node monitoring is not
Cloud Database MongoDB | QCE/CMONGO | InstanceId | Only MongoDB instance-level monitoring is currently supported; replica set and node monitoring are not

Monitoring Metrics Configuration Information

Currently, the collector only supports collecting instance-level metrics. Users are advised to configure metrics according to the corresponding namespaces.

QCE/CVM

Metric Name (MetricName) Description
WanInpkg External network inbound packet count
WanIntraffic External network inbound bandwidth
WanOutpkg External network outbound packet count
WanOuttraffic External network outbound bandwidth
AccOuttraffic External network outbound traffic
BaseCpuUsage Basic CPU usage
CpuLoadavg CPU one-minute average load
CPUUsage CPU utilization
Cpuloadavg5m CPU five-minute average load
Cpuloadavg15m CPU fifteen-minute average load
CvmDiskUsage Disk utilization
LanInpkg Internal network inbound packet count
LanOutpkg Internal network outbound packet count
LanIntraffic Internal network inbound bandwidth
LanOuttraffic Internal network outbound bandwidth
MemUsage Memory utilization
MemUsed Memory used
TcpCurrEstab TCP connection count
TimeOffset Offset between the instance's UTC time and NTP time
GpuMemTotal Total GPU memory
GpuMemUsage GPU memory usage rate
GpuMemUsed GPU memory used quantity
GpuPowDraw GPU power consumption quantity
GpuPowLimit Total GPU power capacity
GpuPowUsage GPU power usage rate
GpuTemp GPU temperature
GpuUtil GPU usage rate

QCE/CDB

Metric Name (MetricName) Description
BytesReceived Internal network inbound traffic
BytesSent Internal network outbound traffic
Capacity Disk space occupied
ComCommit Commit count
ComDelete Delete count
ComInsert Insert count
ComReplace Replace count
ComRollback Rollback count
ComUpdate Update count
ConnectionUseRate Connection utilization rate
CpuUseRate CPU utilization rate
CreatedTmpDiskTables Number of temporary disk tables
CreatedTmpFiles Number of temporary files
CreatedTmpTables Number of temporary memory tables
HandlerCommit Internal commit count
HandlerReadRndNext Next row read request count
HandlerRollback Internal rollback count
InnodbBufferPoolPagesFree Number of free InnoDB pages
InnodbBufferPoolPagesTotal Total number of InnoDB pages
InnodbBufferPoolReadRequests InnoDB buffer pool pre-read page count
InnodbBufferPoolReads InnoDB disk read page count
InnodbCacheHitRate InnoDB cache hit rate
InnodbCacheUseRate InnoDB cache usage rate
InnodbDataReads Total InnoDB read volume
InnodbDataWrites Total InnoDB write volume
InnodbDataWritten InnoDB write volume
InnodbNumOpenFiles Current number of InnoDB open tables
InnodbOsFileReads InnoDB disk read count
InnodbOsFileWrites InnoDB disk write count
InnodbOsFsyncs InnoDB fsync count
InnodbRowLockTimeAvg Average InnoDB row lock time (milliseconds)
InnodbRowLockWaits InnoDB row lock wait count
InnodbRowsDeleted InnoDB rows deleted count
InnodbRowsInserted InnoDB rows inserted count
InnodbRowsRead InnoDB rows read count
InnodbRowsUpdated InnoDB rows updated count
IOPS Input/output operations per second (read/write count)
KeyBlocksUnused Number of unused blocks in key cache
KeyBlocksUsed Number of used blocks in key cache
KeyCacheHitRate MyISAM cache hit rate
KeyCacheUseRate MyISAM cache usage rate
KeyReadRequests Number of times data blocks are read from key cache
KeyReads Number of times data blocks are read from disk
KeyWriteRequests Number of times data blocks are written to key buffer
KeyWrites Number of times data blocks are written to disk
LogCapacity Log usage volume
MasterSlaveSyncDistance Master-slave delay distance
MaxConnections Maximum connections
MemoryUseRate Memory utilization rate
MemoryUse Memory occupied
OpenFiles Open file count
OpenedTables Number of already opened tables
Qps Operations executed per second
Queries Total access volume
QueryRate Access volume percentage
RealCapacity Disk usage space
SecondsBehindMaster Master-slave delay time
SelectCount Query count
SelectScan Full table scan count
SlaveIoRunning IO thread status
SlaveSqlRunning SQL thread status
SlowQueries Slow query count
TableLocksImmediate Number of immediately released table locks
TableLocksWaited Number of table lock waits
ThreadsConnected Current connection count
ThreadsCreated Number of created threads
ThreadsRunning Number of running threads
Tps Transactions executed per second
VolumeRate Disk utilization rate
InnodbDataRead InnoDB read volume

QCE/COS

Metric Name (MetricName) Description
StdReadRequests Standard storage read requests
StdRetrieval Standard data read volume
StdWriteRequests Standard storage write requests
IaRetrieval Low-frequency data read volume
IaWriteRequests Low-frequency storage write requests
IaReadRequests Low-frequency storage read requests
NlWriteRequests Nl write requests
NlRetrieval Nl retrieval volume
CdnOriginTraffic CDN origin traffic
InternetTraffic External network downstream traffic
InternalTraffic Internal network downstream traffic
InboundTraffic Total external and internal upload traffic

QCE/LB_PRIVATE

Metric Name (MetricName) Description
ClientConnum Active client-to-LB connections
ClientInactiveConn Inactive client-to-LB connections
ClientConcurConn Concurrent client-to-LB connections
ClientNewConn New client-to-LB connections
ClientInpkg Client-to-LB inbound packet count
ClientOutpkg Client-to-LB outbound packet count
ClientAccIntraffic Client-to-LB inbound traffic
ClientAccOuttraffic Client-to-LB outbound traffic
ClientOuttraffic Client-to-LB outbound bandwidth
ClientIntraffic Client-to-LB inbound bandwidth
DropTotalConns Dropped connections
InDropBits Dropped inbound bandwidth
OutDropBits Dropped outbound bandwidth
InDropPkts Dropped inbound packets
OutDropPkts Dropped outbound packets
IntrafficVipRatio Inbound bandwidth utilization rate
OuttrafficVipRatio Outbound bandwidth utilization rate
UnhealthRsCount Health check anomaly count

QCE/LB_PUBLIC

Metric Name (MetricName) Description
ClientConnum Active client-to-LB connections
ClientInactiveConn Inactive client-to-LB connections
ClientConcurConn Concurrent client-to-LB connections
ClientNewConn New client-to-LB connections
ClientInpkg Client-to-LB inbound packet count
ClientOutpkg Client-to-LB outbound packet count
ClientAccIntraffic Client-to-LB inbound traffic
ClientAccOuttraffic Client-to-LB outbound traffic
ClientIntraffic Client-to-LB inbound bandwidth
ClientOuttraffic Client-to-LB outbound bandwidth
DropTotalConns Dropped connections
IntrafficVipRatio Public network inbound bandwidth utilization rate (may not have this metric)
InDropBits Dropped inbound bandwidth
InDropPkts Dropped inbound packets
OuttrafficVipRatio Public network outbound bandwidth utilization rate (may not have this metric)
OutDropBits Dropped outbound bandwidth
OutDropPkts Dropped outbound packets
UnhealthRsCount Health check anomaly count

QCE/REDIS_MEM

Metric Name (MetricName) Description
CpuUtil CPU utilization rate
CpuMaxUtil Node maximum CPU utilization rate
MemUsed Memory used amount
MemUtil Memory utilization rate
MemMaxUtil Node maximum memory utilization rate
Keys Total number of keys
Expired Expired keys count
Evicted Evicted keys count
Connections Connection count
ConnectionsUtil Connection utilization rate
InFlow Inbound flow
InBandwidthUtil Inbound flow utilization rate
InFlowLimit Inbound flow throttling triggered
OutFlow Outbound flow
OutBandwidthUtil Outbound flow utilization rate
OutFlowLimit Outbound flow throttling triggered
LatencyAvg Average execution latency
LatencyMax Maximum execution latency
LatencyRead Read average latency
LatencyWrite Write average latency
LatencyOther Other commands average latency
Commands Total requests
CmdRead Read requests
CmdWrite Write requests
CmdOther Other requests
CmdBigValue Large Value requests
CmdKeyCount Key request count
CmdMget Mget request count
CmdSlow Slow queries
CmdHits Read request hits
CmdMiss Read request misses
CmdErr Execution errors
CmdHitsRatio Read request hit rate

QCE/CMONGO

Metric Name (MetricName) Description
Reads Number of read requests
Updates Number of update requests
Deletes Number of delete requests
Counts Number of count requests
Success Number of successful requests
Commands Number of command requests
Qps Requests per second
Delay10 Number of requests with latency between 10 - 50 milliseconds
Delay50 Number of requests with latency between 50 - 100 milliseconds
Delay100 Number of requests with latency over 100 milliseconds
ClusterConn Cluster connection count
Connper Connection utilization rate
ClusterDiskusage Disk utilization rate

4. Data Reporting Format

After the data is synchronized normally, you can view the data in the «Metrics» section of TrueWatch.

For example, consider the following collector configuration:

Python
tencentcloud_monitor_configs = {
    'regions': ['ap-shanghai'],
    'targets': [
        {
            'namespace': 'QCE/CVM',
            'metrics'  : ['WanOutpkg']
        }
    ]
}

An example of the reported data is as follows:

JSON
{
  "measurement": "tencentcloud_QCE/CVM",
  "tags": {
    "InstanceId": "i-xxx"
  },
  "fields": {
    "WanOutpkg_max": 0.005
  }
}

All metric values are reported as float type

This collector collects data for the WanOutpkg metric under the QCE/CVM namespace (Namespace). See Data Collection Instructions table for details.
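
To make the shape of a reported point concrete, the sketch below assembles one from its parts. It mirrors only the example above: the `tencentcloud_` prefix and the `_max` suffix are taken from that example, while the function `build_point` and its `statistic` parameter are assumptions for illustration, not the collector's implementation.

```python
def build_point(namespace, dimensions, metric, statistic, value):
    """Assemble a point shaped like the reported-data example (illustrative only)."""
    return {
        'measurement': 'tencentcloud_' + namespace,
        'tags': dict(dimensions),
        # Metric values are reported as float, so integers are converted too.
        'fields': {f'{metric}_{statistic}': float(value)},
    }

point = build_point('QCE/CVM', {'InstanceId': 'i-xxx'}, 'WanOutpkg', 'max', 0.005)
print(point)
```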

5. Linkage with Custom Object Collectors

When other custom object collectors (such as CVM) run in the same DataFlux Func, this collector supplements its data with fields from those objects, based on the dimension information described in Data Collection Instructions. For example, the collector tries to match the InstanceId value returned in the cloud monitoring data against the tags.name field of the custom objects.

Since custom object information needs to be known first to enable linkage in the cloud monitoring collector, it is generally recommended to place the cloud monitoring collector at the end of the list, such as:

Python
# Create collectors
collectors = [
    tencentcloud_cvm.DataCollector(account, common_tencentcloud_configs),
    tencentcloud_monitor.DataCollector(account, tencentcloud_monitor_configs) # Cloud monitoring collector usually placed at the end
]

Upon successful matching, additional fields from the matched custom object tags are added to the cloud monitoring data tags, enabling effects such as filtering cloud monitoring metric data using instance names. The specific effect is as follows:

Assume the original cloud monitoring data collected is as follows:

JSON
{
  "measurement": "tencentcloud_QCE/CVM",
  "tags": {
    "InstanceId": "i-xxx"
  },
  "fields": { "content omitted" }
}

At the same time, the custom object data collected by the Tencent Cloud CVM collector is as follows:

JSON
{
  "measurement": "tencentcloud_cvm",
  "tags": {
    "name"           : "i-xxx",
    "InstanceType"   : "c6g.xxx",
    "PlatformDetails": "xxx",
    "{other fields omitted}"
  },
  "fields": { "content omitted" }
}

Then, the final cloud monitoring data reported is as follows:

JSON
{
  "measurement": "tencentcloud_QCE/CVM",
  "tags": {
    "name"            : "i-xxx",
    "InstanceId"      : "i-xxx",   // Original field from cloud monitoring
    "InstanceType"    : "c6g.xxx", // Field from custom object CVM
    "PlatformDetails" : "xxx",     // Field from custom object CVM
    "{other fields omitted}"
  },
  "fields": { "content omitted" }
}
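
The matching and merging step can be sketched as below. This is a minimal illustration of the effect shown in the three JSON examples: `enrich_tags` is a hypothetical helper, and the choice to let the original cloud monitoring tags win when a key exists on both sides is an assumption, since this manual does not describe conflict handling.

```python
def enrich_tags(monitor_point, custom_objects, dimension='InstanceId'):
    """Copy tags from the custom object whose tags.name equals the dimension value."""
    key = monitor_point['tags'].get(dimension)
    for obj in custom_objects:
        if key is not None and obj['tags'].get('name') == key:
            merged = dict(obj['tags'])
            merged.update(monitor_point['tags'])  # assumption: original tags win on conflict
            return {**monitor_point, 'tags': merged}
    return monitor_point  # no match: the data is left unchanged

monitor_point = {
    'measurement': 'tencentcloud_QCE/CVM',
    'tags': {'InstanceId': 'i-xxx'},
    'fields': {},
}
cvm_object = {
    'measurement': 'tencentcloud_cvm',
    'tags': {'name': 'i-xxx', 'InstanceType': 'c6g.xxx', 'PlatformDetails': 'xxx'},
    'fields': {},
}
enriched = enrich_tags(monitor_point, [cvm_object])
print(enriched['tags'])
```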

6. Explanation of Cloud Monitor API Call Limits

Tencent Cloud's Cloud Monitor applies a free quota to some API calls. This collector requests monitoring data through the GetMonitorData API, which is subject to that quota: each main account has a free quota of 1 million requests per month, and calls beyond it are charged at 0.25 RMB per 10,000 calls. Once the free quota is exhausted, the API cannot be used further unless "API Pay-as-you-go" is manually activated. The collector's call counts are explained in detail below:

1. Determining whether the free quota will be exceeded when an account has multiple resources and collects multiple monitoring items:

Each GetMonitorData request (which queries the latest monitoring data for a specified monitoring item) returns data for a single monitoring item across up to 10 resources; additional resources require further paginated requests. Examples of request counts:

  • An account with 10 CVM resources collecting CpuUsage requires 1 request.
  • An account with 10 CVM resources collecting CpuUsage and BaseCpuUsage requires 2 requests (one request per monitoring item).
  • An account with 11 CVM resources collecting CpuUsage requires 2 requests (pagination for resources beyond 10).
  • An account with 11 CVM resources collecting CpuUsage and BaseCpuUsage requires 4 requests.
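
The examples above follow a simple rule: the number of pages of up to 10 resources, multiplied by the number of monitoring items. A minimal sketch, with `get_monitor_data_requests` as a hypothetical helper name:

```python
import math

RESOURCES_PER_REQUEST = 10  # up to 10 resources per GetMonitorData request

def get_monitor_data_requests(resource_count, metric_count):
    """Requests per collection run: one page of up to 10 resources per monitoring item."""
    pages = math.ceil(resource_count / RESOURCES_PER_REQUEST)
    return pages * metric_count

for resources, metrics in [(10, 1), (10, 2), (11, 1), (11, 2)]:
    print(resources, metrics, get_monitor_data_requests(resources, metrics))
```

The four printed cases correspond to the four examples in the list, giving 1, 2, 2, and 4 requests respectively.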

2. Finding the actual call count by viewing task execution logs:

The collector records the number of API calls made during each task execution, which can be viewed in the logs, for example:

Bash
[2023-04-24 19:02:02.359] [+1156ms] Completed collection for the 【1】st account, total execution time【1155 milliseconds】, during which APIs were called【2 times】
[2023-04-24 19:02:02.360] [+0ms] Detailed calls are as follows
[2023-04-24 19:02:02.360] [+0ms] -> monitor.tencentcloudapi.com/?Action=DescribeBaseMetrics: 1 time
[2023-04-24 19:02:02.565] [+0ms] -> monitor.tencentcloudapi.com/?Action=GetMonitorData: 1 time

Because cloud monitoring calls count against a free quota, configure only the monitoring items you actually need, and avoid broad wildcard patterns that generate unnecessary calls and cost.
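
To judge whether a configuration stays within the free quota, the per-run call count from the logs can be scaled up to a month. The sketch below uses the quota and price quoted in this section; the 30-day month, the once-per-minute schedule in the example, and linear billing without rounding are assumptions, and the helper names are hypothetical.

```python
FREE_CALLS_PER_MONTH = 1_000_000   # free quota per main account
PRICE_PER_10K_CALLS_RMB = 0.25     # price for calls beyond the quota

def monthly_calls(calls_per_run, runs_per_day, days=30):
    """Estimated GetMonitorData calls in a month of `days` days."""
    return calls_per_run * runs_per_day * days

def overage_cost_rmb(calls):
    """Cost of calls beyond the free quota, assuming linear (unrounded) billing."""
    excess = max(0, calls - FREE_CALLS_PER_MONTH)
    return excess / 10_000 * PRICE_PER_10K_CALLS_RMB

# Example: 4 GetMonitorData calls per run, one run per minute
calls = monthly_calls(calls_per_run=4, runs_per_day=24 * 60)
print(calls, overage_cost_rmb(calls))
```

Note that a cost only arises if "API Pay-as-you-go" has been activated; otherwise calls beyond the quota are rejected rather than billed.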

Precautions

Troubleshooting Errors During Task Triggering and Solutions

  1. HTTPClientError: An HTTP Client raised an unhandled exception: SoftTimeLimitExceeded()

Cause: Task execution timeout due to excessive execution time.

Solution:

  • Appropriately increase the task's timeout setting (e.g., @DFF.API('Perform Collection', timeout=120, fixed_crontab="* * * * *"), indicating setting the task's timeout to 120 seconds).

  2. [TencentCloudSDKException] code:InvalidParameterValue message:cannot find metricName=xxx configure

Cause: Tencent Cloud does not support the collection of this metric (there may be cases where the metric exists in Tencent Cloud documentation but is actually unsupported).

Solution:

  • It is recommended to refer to the Monitoring Metrics Configuration Information in this article and configure valid metric names.

  3. [TencentCloudSDKException] code:InvalidParameterValue message: xxxxx does not belong to the developer ....

Cause: While cloud monitoring data was being collected for a product under the account, a resource of that product had already been released, so the interface returned an error. This error can be ignored.
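
For quick triage of task logs, the three errors above can be recognized by a distinctive substring of their message. The sketch below is only an illustration: `diagnose` and the `KNOWN_ERRORS` table are hypothetical, and the substrings are copied from the error messages listed in this section.

```python
KNOWN_ERRORS = [
    # (substring of the error text, cause, suggested action)
    ('SoftTimeLimitExceeded', 'task timed out', 'increase the task timeout'),
    ('cannot find metricName', 'metric not supported', 'use a metric name listed in this manual'),
    ('does not belong to the developer', 'resource already released', 'ignore'),
]

def diagnose(error_text):
    """Return (cause, action) for a known error message, or None if unrecognized."""
    for marker, cause, action in KNOWN_ERRORS:
        if marker in error_text:
            return cause, action
    return None

print(diagnose('[TencentCloudSDKException] code:InvalidParameterValue '
               'message:cannot find metricName=xxx configure'))
```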

X. Appendix

Tencent Cloud Cloud Monitor

Refer to the official Tencent Cloud documentation: