Deployment and Maintenance / Upload Self-Monitor Data
This article explains how to configure the upload of DataFlux Func Self-Monitor Data.
1. Preface
By default, DataFlux Func's Self-Monitor Data is stored in the local Redis and MySQL.
With heavy use, DataFlux Func can generate a large volume of Metrics and Logs. To limit local memory and disk usage, the system Metrics and Task Record Logs saved locally are stored in reduced form, and the total amount retained is also capped.
You can refer to Deployment and Maintenance / System Metrics and Task Records / Disable Local Func Task Record to disable "Local Func Task Record" and reduce MySQL storage pressure.
If you need a complete record of the system Metrics and Task Record Logs generated by DataFlux Func, you can upload the data to the data platform through System Settings.
2. Enable "Self-Monitor Data Upload"
Uploaded Task Logs are complete and are not reduced.
In Management / System Settings / Self-Monitor Data Upload, users can enable "Self-Monitor Data Upload".
The URL can be either a DataWay or a DataKit upload address.
In addition, it is generally recommended to fill in a site name that identifies the purpose of this DataFlux Func instance, such as "Test Func".
After enabling "Self-Monitor Data Upload", you can also disable "Local Func Task Record" to reduce local DataFlux Func storage pressure.
3. View System Metrics and Task Record Logs in the Data Platform
After correctly configuring "Self-Monitor Data Upload", you can view the uploaded system Metrics and Task Logs in the data platform.
4. Uploaded Data Description
DataFlux Func will upload various types of data for troubleshooting.
Self-Monitor Data may be adjusted at any time
Self-Monitor Data may change with version updates according to actual needs. This document describes only the latest Self-Monitor Data.
Add service filter to improve query efficiency
In the latest version, all Logging Self-Monitor Data will additionally have a service Tag, whose value is the same as the source in Logging, to adapt to the service partition storage of ScopeDB.
Therefore, when querying Func's Logging Self-Monitor Data, adding service:xxxxx alongside the filter condition source:xxxxx can significantly improve query efficiency.
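As a small illustration of the advice above, the following Python sketch builds a filter string that pairs a source condition with the matching service condition. The helper name `build_log_filter` and the space-separated way of combining the two conditions are assumptions made for illustration; adapt them to the query syntax your data platform actually accepts.

```python
def build_log_filter(source: str) -> str:
    """Return a filter that constrains both source and service.

    Assumption: conditions are written as key:value and separated by a
    space. The service value equals the source value, as described above.
    """
    if not source:
        raise ValueError("source must be a non-empty string")
    return f"source:{source} service:{source}"


if __name__ == "__main__":
    # Build a filter for Func Task Record logs.
    print(build_log_filter("DFF_task_record_func"))
```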
Delay Queue (Metric)
Task queue that has not yet reached the trigger time
Measurement: DFF_delay_queue
| Field | Type | Description | Example Value |
|---|---|---|---|
| queue | Tag | Queue | "8" |
| length | Field | Queue Length | 100 |
Worker Queue (Metric)
Task queue that has reached the trigger time
Measurement: DFF_worker_queue
| Field | Type | Description | Example Value |
|---|---|---|---|
| queue | Tag | Queue | "8" |
| length | Field | Queue Length | 100 |
| worker_count | Field | Number of Worker Units listening to the queue | 5 |
| process_count | Field | Number of Processes listening to the queue | 25 |
Func Trigger Count (Metric)
Measurement: DFF_func_trigger
| Field | Type | Description | Example Value |
|---|---|---|---|
| func_id | Tag | Func ID | "demo__test.run" |
| trigger_count_per_minute | Field | Triggers per Minute | 100 |
Func Execution Count (Metric)
Measurement: DFF_func_run
| Field | Type | Description | Example Value |
|---|---|---|---|
| func_id | Tag | Func ID | "demo__test.run" |
| run_count_per_minute | Field | Executions per Minute | 100 |
Func Execution Status Count (Metric)
Measurement: DFF_func_status
| Field | Type | Description | Example Value |
|---|---|---|---|
| status | Tag | Func Execution Status | "success" |
| status_count_per_minute | Field | Status Count per Minute | 100 |
Func Execution Cost (Metric)
Measurement: DFF_func_cost
| Field | Type | Description | Example Value |
|---|---|---|---|
| wait_cost_sum_per_minute | Field | Total Wait Cost per Minute (ms) | 10000 |
| queue_cost_sum_per_minute | Field | Total Queue Cost per Minute (ms) | 10000 |
| run_cost_sum_per_minute | Field | Total Execution Cost per Minute (ms) | 10000 |
| total_cost_sum_per_minute | Field | Total Cost per Minute (ms) | 10000 |
| cpu_cost_sum_per_minute | Field | Total CPU Cost per Minute (ms) | 10000 |
| non_cpu_cost_sum_per_minute | Field | Total Non-CPU Cost per Minute (ms) | 10000 |
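Because these fields are per-minute sums, ratios between them are often easier to read than the raw values. The sketch below derives the share of time spent on the CPU from cpu_cost_sum_per_minute and non_cpu_cost_sum_per_minute. It assumes the two values together account for the measured execution time, which this document does not state explicitly, and the function name `cpu_share` is hypothetical.

```python
from typing import Optional


def cpu_share(cpu_cost_sum_ms: float, non_cpu_cost_sum_ms: float) -> Optional[float]:
    """Return the fraction of measured time spent on the CPU.

    Assumption: CPU cost plus Non-CPU cost covers the whole measured time.
    Returns None when both sums are zero, since the share is undefined.
    """
    total = cpu_cost_sum_ms + non_cpu_cost_sum_ms
    if total <= 0:
        return None
    return cpu_cost_sum_ms / total


if __name__ == "__main__":
    # Example values from the table: both sums are 10000 ms.
    print(cpu_share(10000, 10000))
```

A low share suggests that the Funcs spend most of their time waiting (for example on network or database calls) rather than computing.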
Cache Database (Metric)
Measurement: DFF_cache_db
| Field | Type | Description | Example Value |
|---|---|---|---|
| server | Tag | Target Database (HOST:PORT/DB) | "127.0.0.1:6379/5" |
| keys | Field | Number of Keys | 100 |
| used_memory | Field | Memory Usage (bytes) | 10000 |
| connected_clients | Field | Number of Connected Clients | 100 |
| uptime | Field | Service Uptime (seconds) | 60 |
Database Table (Metric)
Measurement: DFF_db_table
| Field | Type | Description | Example Value |
|---|---|---|---|
| server | Tag | Target Database (HOST:PORT/DB) | "127.0.0.1:3306/dataflux_func" |
| name | Tag | Table Name | "biz_main_func_api" |
| rows | Field | Number of Rows | 10 |
| data_size | Field | Data Size (bytes) | 100 |
| index_size | Field | Index Size (bytes) | 100 |
| total_size | Field | Total Size (bytes) | 200 |
| avg_row_size | Field | Average Row Size (bytes) | 100 |
Cron Job Scheduled (Metric)
Measurement: DFF_cron_job_scheduled
| Field | Type | Description | Example Value |
|---|---|---|---|
| scheduled_count_per_week | Field | Scheduled Count per Week | 604800 |
| scheduled_count_per_day | Field | Scheduled Count per Day | 86400 |
| scheduled_count_per_hour | Field | Scheduled Count per Hour | 3600 |
| scheduled_count_per_minute | Field | Scheduled Count per Minute | 60 |
| scheduled_count_per_second | Field | Scheduled Count per Second | 1 |
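The example values in this table are consistent with a workload that schedules one run per second: 1 × 60 = 60 per minute, 3600 per hour, 86400 per day, and 604800 per week. The sketch below reproduces that arithmetic. Reading the fields as the number of scheduled runs over each time window is an inference from the example values, and the function name `scheduled_counts` is hypothetical.

```python
# Length of each time window in seconds.
SECONDS_PER_WINDOW = {
    "second": 1,
    "minute": 60,
    "hour": 60 * 60,
    "day": 24 * 60 * 60,
    "week": 7 * 24 * 60 * 60,
}


def scheduled_counts(runs_per_second: int) -> dict:
    """Scale a steady per-second schedule rate to every time window."""
    return {
        f"scheduled_count_per_{window}": runs_per_second * seconds
        for window, seconds in SECONDS_PER_WINDOW.items()
    }


if __name__ == "__main__":
    for name, value in scheduled_counts(1).items():
        print(name, value)
```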
Entity (Metric)
Measurement: DFF_entity
| Field | Type | Description | Example Value |
|---|---|---|---|
| entity | Tag | Entity | "funcAPI" |
| count | Field | Count | 100 |
| enabled_count | Field | Enabled Count | 99 |
Self-Monitor Data Upload Panic Level (Metric)
Measurement: DFF_self_monitor_upload_panic_level
| Field | Type | Description | Example Value |
|---|---|---|---|
| panic_level | Field | Panic Level | 10 |
Func Service Info (Logging)
Measurement: DFF_service_info
| Field | Type | Description | Example Value |
|---|---|---|---|
| name | Tag | Service Name | "server", "worker", "beat" |
| version | Tag | Version Number | "7.1.11" |
| edition | Tag | Edition | "GSE" |
| hostname | Tag | Hostname | "web001" |
| pid | Tag | Process PID | 1234 |
| uptime | Field | Service Uptime (seconds) | 60 |
System Task Record / Func Task Record (Logging)
After executing any internal system task or Func task, DataFlux Func will upload the corresponding Task Log, which can be viewed through the Log Explorer.
| Measurement | Description |
|---|---|
| DFF_task_record | System Task Record |
| DFF_task_record_func | Func Task Record |
| Field | Description | System Task Record | Func Task Record |
|---|---|---|---|
| source | Data Source | DFF_task_record | DFF_task_record_func |
| site_name | Site Name | | |
| id | Task ID | | |
| name | Task Name | | |
| queue | Queue | | |
| task_status | Task Status, possible values see below | | |
| root_task_id | Root Task ID | | |
| script_set_id | Script Set ID | | |
| script_id | Script ID | | |
| func_id | Func ID | | |
| func_name | Func Name | | |
| origin | Origin, possible values see below | | |
| origin_id | Origin ID (e.g. Cron Job ID) | | |
| script_set_title | Script Set Title | | |
| script_title | Script Title | | |
| func_title | Func Title | | |
| cron_job_exec_mode | Cron Job Execution Mode | | |
| func_call_kwargs | Func Call Parameters | | |
| cron_expr | Cron Job Crontab Expression | | |
| call_chain | Call Chain | | |
| return_value | Return Value | | |
| message | Task Log | | |
| kwargs | Task Parameters | | |
| eta | Estimated Time of Arrival | | |
| delay | Delay Duration (seconds) | | |
| timeout | Timeout Duration (seconds) | | |
| expires | Expiration Duration (seconds) | | |
| ignore_result | Whether to Ignore Result | | |
| result | Result Content | | |
| exception_from | Exception Source | | |
| exception_type | Exception Type | | |
| exception | Exception Content | | |
| origin_exception_type | Original Exception Type | | |
| origin_exception | Original Exception Content | | |
| traceback | Error Traceback | | |
| trigger_time_iso | Trigger Time (ISO Date Format) | | |
| start_time_iso | Start Time (ISO Date Format) | | |
| end_time_iso | End Time (ISO Date Format) | | |
| wait_cost | Wait Cost (ms) | | |
| run_cost | Execution Cost (ms) | | |
| total_cost | Total Cost (ms) | | |
| cpu_cost | CPU Cost (ms) | | |
| cpu_cost_percent | CPU Cost Percentage | | |
| non_cpu_cost | Non-CPU Cost (ms) | | |
| non_cpu_cost_percent | Non-CPU Cost Percentage | | |
| sys_db_query_count | System DB Query Count | | |
| sys_db_query_details | System DB Query Details | | |
| sys_cache_db_query_count | System Cache Query Count | | |
| sys_cache_db_query_details | System Cache Query Details | | |
| status | Log Status, possible values see below | | |
| workspace_uuid | Workspace ID | | |
| df_monitor_checker_id | Monitor ID | | |
| df_monitor_id | Alert Strategy ID | | |
5. Related Field Details
Details of some fields are as follows.
Fields task_status and status
In the logs uploaded by DataFlux Func, task_status and status have a one-to-one relationship. task_status is the task status description, and status is the status value that complies with the data platform specifications.
The specific correspondence is as follows:
| task_status Value | status Value | Description |
|---|---|---|
| success | ok | Success |
| failure | critical | Failure |
| skip | warning | Task Skipped |
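The correspondence above can be expressed as a simple lookup, which may be handy when post-processing exported logs. This is a minimal sketch: the names `TASK_STATUS_TO_STATUS` and `to_log_status` are hypothetical, and it assumes the three rows listed above are the only values in use.

```python
# Mapping from task_status to the platform-compliant status value,
# copied from the correspondence table.
TASK_STATUS_TO_STATUS = {
    "success": "ok",
    "failure": "critical",
    "skip": "warning",
}


def to_log_status(task_status: str) -> str:
    """Translate a task_status value into the corresponding status value."""
    try:
        return TASK_STATUS_TO_STATUS[task_status]
    except KeyError:
        # Fail loudly so that new, undocumented values are noticed.
        raise ValueError(f"unknown task_status: {task_status!r}") from None


if __name__ == "__main__":
    print(to_log_status("failure"))
```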
Fields origin and origin_id
The origin and origin_id fields are used to mark the execution origin of the Func task. The specific values are as follows:
| origin Value | Description | origin_id Value Meaning | Remarks |
|---|---|---|---|
| funcAPI | Func API | Func API ID | |
| cronJob | Cron Job | Cron Job ID | |
| direct | Direct Func Call (e.g. Data Platform Studio call within cluster) | Fixed as direct | |
| integration | Triggered by Script Integration | {Integration Type}.{Start Mode}-{Func ID} | |
| syncAPI | Sync API | Sync API ID | Replaced by funcAPI in the latest version |
| asyncAPI | Async API | Async API ID | Replaced by funcAPI in the latest version |
| authLink | Auth Link | Auth Link ID | Replaced by syncAPI in the latest version |
| crontab | Auto Trigger Configuration | Auto Trigger Configuration ID | Replaced by cronJob in the latest version |
| batch | Batch | Batch ID | Replaced by asyncAPI in the latest version |
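When logs from several DataFlux Func versions are analyzed together, it can help to fold the replaced origin values into their current equivalents. The sketch below follows the "Replaced by" remarks in the table until it reaches a value that has no replacement. The names `REPLACED_BY` and `normalize_origin` are hypothetical, and nothing here implies that DataFlux Func itself rewrites older records.

```python
# "Replaced by" relationships copied from the Remarks column.
REPLACED_BY = {
    "syncAPI": "funcAPI",
    "asyncAPI": "funcAPI",
    "authLink": "syncAPI",
    "crontab": "cronJob",
    "batch": "asyncAPI",
}


def normalize_origin(origin: str) -> str:
    """Follow the replacement chain to the current origin value."""
    seen = set()
    while origin in REPLACED_BY:
        if origin in seen:  # guard against an accidental cycle in the map
            raise ValueError(f"replacement cycle at {origin!r}")
        seen.add(origin)
        origin = REPLACED_BY[origin]
    return origin


if __name__ == "__main__":
    for legacy in ("authLink", "batch", "crontab", "direct"):
        print(legacy, "->", normalize_origin(legacy))
```

Values without a replacement, such as direct and integration, are returned unchanged.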