Interpretation of "Monitor" Logs
2024-03-04
In Guance and TrueWatch, a "Monitor" is actually a scheduled task in DataFlux Func (Automata). Viewing "Monitor" logs therefore means viewing the logs of these scheduled tasks (called "automatic trigger configurations" in older versions).
For details on how to view scheduled task logs, refer to Deployment and Maintenance / System Metrics and Task Records / Task Records.
1. Basic Format
Each line of the "Monitor" log follows this format:
Time | Difference from previous log entry (milliseconds) | Total time since task start to current log entry (milliseconds) | Module | Content |
---|---|---|---|---|
[03-25 11:04:05] | [+1ms] | [64ms] | [Function] | Function call: guance__api_impl.custom_check |
An alternative layout omits the total-time column and prints a full timestamp with milliseconds:
Time | Difference from previous log entry (milliseconds) | Module | Content |
---|---|---|---|
[2024-03-06 20:58:05.088] | [+1ms] | [Function] | Function call: guance__api_impl.custom_check |
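Both layouts can be split into fields mechanically. The following is a minimal sketch, not platform code: the regular expressions and the `parse_line` helper are illustrative assumptions. It assumes the raw fields are separated by single spaces, as in the complete example in section 4, and treats the module as optional because the final summary line there has none.

```python
import re
from typing import Optional

# Layout with a total-time column: [time] [+Nms] [Nms] [Module] Content
WITH_TOTAL = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[\+(?P<diff_ms>\d+)ms\] \[(?P<total_ms>\d+)ms\]"
    r"(?: \[(?P<module>[^\]]+)\])? (?P<content>.*)$"
)
# Layout without it: [time] [+Nms] [Module] Content
WITHOUT_TOTAL = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[\+(?P<diff_ms>\d+)ms\]"
    r"(?: \[(?P<module>[^\]]+)\])? (?P<content>.*)$"
)


def parse_line(line: str) -> Optional[dict]:
    """Split one log line into its fields, or return None if it does not match."""
    # Try the stricter (five-field) layout first so its total-time column
    # is not mistaken for a module name.
    for pattern in (WITH_TOTAL, WITHOUT_TOTAL):
        match = pattern.match(line)
        if match is None:
            continue
        fields = match.groupdict()
        fields["diff_ms"] = int(fields["diff_ms"])
        total = fields.get("total_ms")
        fields["total_ms"] = int(total) if total is not None else None
        return fields
    return None


if __name__ == "__main__":
    print(parse_line(
        "[03-25 11:04:05] [+1ms] [64ms] [Function] "
        "Function call: guance__api_impl.custom_check"
    ))
```

A caveat of this sketch: a line without a module whose content itself begins with a bracketed token would have that token read as the module.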
2. Log Reduction Options
As monitor logic has grown more complex, the logs have become longer and harder to read. Since the 2024-03-27 iteration, Guance and TrueWatch therefore output concise logs by default, which still cover basic troubleshooting needs.
To output full logs instead, create the following environment variable to enable detailed logging for Guance and TrueWatch:
Environment Variable | Data Type | Value |
---|---|---|
ENABLE_DETAILED_GUANCE_LOG | Boolean | Enabled: true; Disabled: false |
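How the platform reads this variable internally is not described here. Purely to illustrate the semantics in the table (unset or false means concise logs, true means detailed logs), the sketch below assumes the value arrives as a boolean or string in a plain mapping. The `detailed_log_enabled` helper is hypothetical.

```python
import os
from typing import Mapping, Optional


def detailed_log_enabled(env: Optional[Mapping[str, object]] = None) -> bool:
    """Return True only when ENABLE_DETAILED_GUANCE_LOG is set to true.

    Assumption: the flag is looked up in a mapping (os.environ by default);
    the real lookup mechanism inside the platform may differ.
    """
    source = os.environ if env is None else env
    raw = source.get("ENABLE_DETAILED_GUANCE_LOG", "false")
    # Accept both a real boolean and the strings "true"/"false".
    return str(raw).strip().lower() == "true"


if __name__ == "__main__":
    print(detailed_log_enabled({"ENABLE_DETAILED_GUANCE_LOG": "true"}))
```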
3. Fixed Format Log Blocks
Certain processes use a fixed format to output log blocks.
3.1 [KODO] DQL Query Logs
When executing DQL queries within monitors, the Kodo component's API must be called. Each DQL query records a log; the [KODO] lines in the complete example in section 4 show what these entries look like.
The shorter form of the log records:
- DQL query time range
- Specific API method and path used
The longer form of the log records:
- DQL query time range
- Specific API method, path, and request body
- Original response from the Kodo API
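Because the time range is written in a fixed textual form, it can be extracted when reviewing many queries. The sketch below assumes the line wording used in the complete example in section 4 ("Time range: start ~ end"), which may change between iterations. The `dql_time_range` helper is illustrative only.

```python
import re
from datetime import datetime
from typing import Optional, Tuple

_STAMP = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"
TIME_RANGE = re.compile(rf"Time range: ({_STAMP}) ~ ({_STAMP})")


def dql_time_range(line: str) -> Optional[Tuple[datetime, datetime, int]]:
    """Return (start, end, length in seconds) for a [KODO] query line."""
    match = TIME_RANGE.search(line)
    if match is None:
        return None
    start, end = (
        datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S") for stamp in match.groups()
    )
    return start, end, int((end - start).total_seconds())


if __name__ == "__main__":
    line = (
        "[04-01 03:43:00] [+0ms] [146ms] [KODO] Executing DQL query -> "
        "Time range: 2024-04-01 03:27:00 ~ 2024-04-01 03:42:00, up to 20 pages"
    )
    print(dql_time_range(line))
```

For the detection query in the section 4 example (03:27:00 ~ 03:42:00) the computed length is 900 seconds, which matches the range value of 900 in the targets parameter of the same example.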
3.2 [Studio] Studio Inner API Call Logs for Guance, TrueWatch
When a monitor needs business data from Guance or TrueWatch, it must call the Studio Inner API. Each Inner API call records a log; the "Calling Studio Inner API" line in the complete example in section 4 shows what such an entry looks like.
The shorter form of the log records:
- Specific API method, path, and request body
The longer form of the log records:
- Specific API method, path, and request body
- Original response from the Guance, TrueWatch Studio Inner API
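When a task issues many calls, tallying them by method and path gives a quick overview. This is a minimal sketch under the assumption that call lines read "Calling KODO API -> METHOD PATH" or "Calling Studio Inner API -> METHOD PATH", as in the section 4 example. The `count_api_calls` helper is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

API_CALL = re.compile(
    r"\[(?P<module>KODO|Studio)\] Calling (?:KODO|Studio Inner) API -> "
    r"(?P<method>[A-Z]+) (?P<path>\S+)"
)


def count_api_calls(lines: Iterable[str]) -> Counter:
    """Count API calls per (module, method, path) across the given log lines."""
    calls: Counter = Counter()
    for line in lines:
        match = API_CALL.search(line)
        if match:
            calls[(match["module"], match["method"], match["path"])] += 1
    return calls


if __name__ == "__main__":
    sample = [
        "[04-01 03:43:00] [+0ms] [106ms] [KODO] Calling KODO API -> POST /v1/query",
        "[04-01 03:43:00] [+0ms] [175ms] [Studio] Calling Studio Inner API -> "
        "GET /api/v1/inner/alert_opt/get",
    ]
    for key, count in count_api_calls(sample).items():
        print(key, count)
```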
4. Complete Example Analysis
Log content may change after each iteration
Because new features are added and issues found in earlier logs are fixed, the details of the logs may vary slightly after each iteration.
Below is an annotated complete example
In the following logs:
- Lines starting with # are explanatory notes, not part of the original logs.
- Additional blank lines have been inserted for readability; there are no blank lines in the original logs.
- Other content is directly from the logs.
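These conventions make it straightforward to recover the raw log from an annotated listing. A minimal sketch follows, assuming notes start with # at the beginning of the line and that no genuine log line does. The `strip_annotations` helper is hypothetical, not a platform function.

```python
from typing import List


def strip_annotations(annotated: str) -> List[str]:
    """Drop explanatory '#' lines and inserted blank lines, keep log lines."""
    return [
        line
        for line in annotated.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]


if __name__ == "__main__":
    listing = (
        "# Number of events generated during this detection\n"
        "[04-01 03:43:00] [+1ms] [253ms] This detection generates 1 monitor event\n"
        "\n"
    )
    print(strip_annotations(listing))
```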
```text title="Log Interpretation" hl_lines="1 5 8 23 27 43 56 65 69 77 85 99 103 112 117-119 136-137 149 154 163-164 168"
# Counting 'task scheduling' in Guance and TrueWatch
[04-01 03:43:00] [+100ms] [100ms] [Usage Quota] Data query range is 15 minutes, not exceeding 15 minutes, no additional measurement needed
[04-01 03:43:00] [+0ms] [100ms] [Usage Quota] workspace_uuid parameter exists, value is wksp_xxxxx, need to measure once

# Current workspace information
[04-01 03:43:00] [+1ms] [101ms] [Studio] Workspace information (from cache): {"declaration":{"b":["asfawfgajfasfafgafwba","asfgahjfaf"],"business":"aaa","organization":"64fe7b4062f74d0007b46676"},"isJobDisabled":false,"isSMSDisabled":false,"language":"en","name":"[Doris] Development Testing Together_","token":"tkn_xxxxx"}

# ID and parameter list of the function executed in this task
[04-01 03:43:00] [+1ms] [103ms] [Function] Function call: guance__api_impl.custom_check
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: checker="custom_metric"
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: kwargs={"version":"v2"}
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: targets=[{"alias":"Result","dql":"M::fake_data_for_test:(avg(field_int)) { tag= 'fake-data-1' } BY tag","queryType":"dql","range":900}]
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: channels=["chan_xxxxx"]
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: extra_data={"type":"simpleCheck"}
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: checker_opt={"id":"rul_xxxxx","infoEvent":false,"label":["xxxxx_test"],"message":"Content: xxxxx-Monitor (Single) {df_dimension_tags}\n1: {{ (Result * 100) | to_int }}\n2: {{ Result | to_int * 100 }}","name":"Title: xxxxx-Monitor (M, Single) {df_dimension_tags}","noDataAction":"noData","noDataInterval":120,"noDataMessage":"","noDataTitle":"","recoverInterval":120,"rules":[{"conditionLogic":"and","conditions":[{"alias":"Result","operands":["0"],"operator":">="}],"status":"critical"}],"title":"Title: xxxxx-Monitor (M, Single) {df_dimension_tags}"}
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: monitor_opt={"id":"monitor_xxxxx","name":"default"}
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: workspace_uuid="wksp_xxxxx"
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: workspace_token="tkn_xxxxx"
[04-01 03:43:00] [+0ms] [103ms] [Function] --> Parameter: disable_check_end_time=false
[04-01 03:43:00] [+0ms] [104ms] [Function] --> Parameter: at_accounts=null
[04-01 03:43:00] [+0ms] [104ms] [Function] --> Parameter: at_accounts_nodata=null

# Monitor frequency configuration
[04-01 03:43:00] [+2ms] [106ms] [Monitor] Calculating detection interval based on actual Crontab (*/1 * * * *)
[04-01 03:43:00] [+0ms] [106ms] [Monitor] --> Detection interval: 60 seconds

# Query recent data within the user-configured no-data range and two previous time ranges (2 DQL queries)
[04-01 03:43:00] [+0ms] [106ms] [Monitor] ----------------- Loading Gap / New Object Information ------------------
[04-01 03:43:00] [+0ms] [106ms] [Monitor] No-data range configured: 120 seconds
[04-01 03:43:00] [+0ms] [106ms] [Monitor] Querying last round data: T - (Detection Frequency 60 seconds) - (No-data Range 120 seconds) - (3x Redundant No-data Range 360 seconds) ~ T - (No-data Range 120 seconds)
[04-01 03:43:00] [+0ms] [106ms] [KODO] Executing DQL query -> Time range: 2024-04-01 03:33:00 ~ 2024-04-01 03:40:00, up to 20 pages
[04-01 03:43:00] [+0ms] [106ms] [KODO] --> Page 1 (soffset = 0 ~ 500)
[04-01 03:43:00] [+0ms] [106ms] [KODO] Calling KODO API -> POST /v1/query
[04-01 03:43:00] [+14ms] [121ms] [Studio] Metric unit (from cache): wksp_xxxxx/fake_data_for_test => {"_DFF_CACHE_EXPIRE_TIME":1711914240}
[04-01 03:43:00] [+0ms] [121ms] [KODO] --> Unpacking DQL result data: {"metric_units":{"field_int":null},"query_time_range":[1711913580000,1711914000000],"series":[{"columns":["time","avg(field_int)"],"name":"fake_data_for_test","tags":{"tag":"fake-data-1"},"values":[["2024-03-31T19:39:50Z",56.08730158730159]]}]}
[04-01 03:43:00] [+0ms] [121ms] [Monitor] Querying this round data: T - (No-data Range 120 seconds) ~ T
[04-01 03:43:00] [+0ms] [122ms] [KODO] Executing DQL query -> Time range: 2024-04-01 03:40:00 ~ 2024-04-01 03:42:00, up to 20 pages
[04-01 03:43:00] [+0ms] [122ms] [KODO] --> Page 1 (soffset = 0 ~ 500)
[04-01 03:43:00] [+0ms] [122ms] [KODO] Calling KODO API -> POST /v1/query
[04-01 03:43:00] [+18ms] [140ms] [Studio] Metric unit (from cache): wksp_xxxxx/fake_data_for_test => {"_DFF_CACHE_EXPIRE_TIME":1711914240}
[04-01 03:43:00] [+0ms] [140ms] [KODO] --> Unpacking DQL result data: {"metric_units":{"field_int":null},"query_time_range":[1711914000000,1711914120000],"series":[{"columns":["time","avg(field_int)"],"name":"fake_data_for_test","tags":{"tag":"fake-data-1"},"values":[["2024-03-31T19:41:50Z",49.69444444444444]]}]}

# Based on the queried data from the two time ranges, determine if there is a data gap or re-reporting, and generate corresponding [No-data Event] or [No-data Recovery Event]
[04-01 03:43:00] [+0ms] [140ms] [Monitor] ----------------- Gap / New Object Load Results ------------------
[04-01 03:43:00] [+0ms] [140ms] [Monitor] --> Last round existing objects: {"tag":"fake-data-1"}
[04-01 03:43:00] [+0ms] [141ms] [Monitor] --> This round existing objects: {"tag":"fake-data-1"}
[04-01 03:43:00] [+0ms] [141ms] [Monitor] ----> Data gap objects (Last round exists -> This round does not exist): None
[04-01 03:43:00] [+0ms] [141ms] [Monitor] --------------------- Determine Data Gap ---------------------
[04-01 03:43:00] [+0ms] [141ms] [Monitor] --> No data gap objects
[04-01 03:43:00] [+0ms] [141ms] [Monitor] ------------------- Determine Data Recovery from Gap --------------------
[04-01 03:43:00] [+0ms] [141ms] [Monitor] --> Object: {"tag":"fake-data-1"}
[04-01 03:43:00] [+4ms] [146ms] [Monitor] Fault cycle information (fault_info) for object {"tag":"fake-data-1"}: null
[04-01 03:43:00] [+0ms] [146ms] [Monitor] ----> No last no-data event
[04-01 03:43:00] [+0ms] [146ms] [Monitor] ----> No active no-data event, no need to generate a no-data recovery event

# Determine whether to generate [Alert Event] based on user-configured detection rules
[04-01 03:43:00] [+0ms] [146ms] [Monitor] -------------------- Execute Data Value Detection --------------------
[04-01 03:43:00] [+0ms] [146ms] [Monitor] Query pending detection data
[04-01 03:43:00] [+0ms] [146ms] [KODO] Executing DQL query -> Time range: 2024-04-01 03:27:00 ~ 2024-04-01 03:42:00, up to 20 pages
[04-01 03:43:00] [+0ms] [146ms] [KODO] --> Page 1 (soffset = 0 ~ 500)
[04-01 03:43:00] [+0ms] [146ms] [KODO] Calling KODO API -> POST /v1/query
[04-01 03:43:00] [+21ms] [168ms] [Studio] Metric unit (from cache): wksp_xxxxx/fake_data_for_test => {"_DFF_CACHE_EXPIRE_TIME":1711914240}
[04-01 03:43:00] [+0ms] [169ms] [KODO] --> Unpacking DQL result data: {"metric_units":{"field_int":null},"query_time_range":[1711913220000,1711914120000],"series":[{"columns":["time","avg(field_int)"],"name":"fake_data_for_test","tags":{"tag":"fake-data-1"},"values":[["2024-03-31T19:41:50Z",54.370370370370374]]}]}

# Iterate through all detection objects sequentially and execute detections
[04-01 03:43:00] [+0ms] [169ms] [Monitor] [Detection Object 1/1] {"tag":"fake-data-1"}
[04-01 03:43:00] [+2ms] [171ms] [General Threshold Detection] Pending detection data: {'Result': [54.370370370370374]}

# Iterate through all configured rules to determine which detection rule matches
[04-01 03:43:00] [+0ms] [171ms] [General Threshold Detection] [Threshold Rule 1/1] critical: Result >= ['0']
[04-01 03:43:00] [+0ms] [171ms] [Condition Check] [Condition 1/1] IF Result (ANY[54.370370370370374]) >= ["0"]
[04-01 03:43:00] [+0ms] [171ms] [Condition Check] --> Intermediate result is True, condition relation is AND, continue
[04-01 03:43:00] [+0ms] [171ms] [General Threshold Detection] --> Matched successfully, end check
[04-01 03:43:00] [+0ms] [172ms] [General Threshold Detection] Threshold rule match result: {"check_data":{"Result":54.370370370370374},"conditions":[{"alias":"Result","operands":["0"],"operator":">="}],"status":"critical"}
[04-01 03:43:00] [+0ms] [172ms] [Monitor] --> Detection object: {"tag":"fake-data-1"}: has reached fault conditions

# Call Guance, TrueWatch Studio to get alert strategies configured for this monitor
[04-01 03:43:00] [+2ms] [175ms] [Studio] Alert information cache disabled
[04-01 03:43:00] [+0ms] [175ms] [Studio] Calling Studio Inner API -> GET /api/v1/inner/alert_opt/get
[04-01 03:43:00] [+20ms] [195ms] [Studio] Alert configuration (from API): rul_xxxxx => {"_DFF_CACHE_EXPIRE_TIME":1711914360,"alertPolicies":[{"aggClusterFields":[],"aggFields":[],"aggInterval":0,"aggLabels":[],"id":"altpl_xxxxx","minInterval":900,"name":"xxxxx-Alert Policy1","ruleTimezone":"Asia/Shanghai","rules":[{"crontab":"00 09 * * *","crontabDuration":39600,"name":"Custom Notification Config1","targets":[{"name":"xxxxx-WeChat","status":"critical","type":"wechatRobot","webhook":"https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxx"},{"name":"xxxxx-Alert Policy1-Rule1","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"},{"name":"xxxxx-WeChat","status":"warning","type":"wechatRobot","webhook":"https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxx"},{"name":"xxxxx-Alert Policy1-Rule1","status":"warning","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"}],"upgradeTargets":[{"duration":180,"name":"xxxxx-Alert Policy1-critical-3 Minutes Upgrade","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"},{"duration":600,"name":"xxxxx-Alert Policy1-critical-10 Minutes Upgrade","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"},{"duration":600,"name":"xxxxx-Alert Policy1-critical-10 Minutes Upgrade-2","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"}]},{"targets":[{"name":"xxxxx-WeChat","status":"critical","type":"wechatRobot","webhook":"https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxx"},{"name":"xxxxx-Alert Policy1-Rule2","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"},{"name":"xxxxx-WeChat","status":"warning","type":"wechatRobot","webhook":"https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=xxxxx"},{"name":"xxxxx-Alert Policy1-Rule2","status":"warning","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"}],"upgradeTargets":[{"duration":180,"name":"xxxxx-Alert Policy1-critical-3 Minutes Upgrade","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"},{"duration":600,"name":"xxxxx-Alert Policy1-critical-10 Minutes Upgrade","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"},{"duration":600,"name":"xxxxx-Alert Policy1-critical-10 Minutes Upgrade-2","status":"critical","type":"dingTalkRobot","webhook":"https://oapi.dingtalk.com/robot/send?access_token=xxxxx"}]}],"workspaceUUID":"wksp_xxxxx"}],"silent":[]}
[04-01 03:43:00] [+1ms] [196ms] [Studio] Constant configuration (from cache): envName => {"_DFF_CACHE_EXPIRE_TIME":1711914334,"value":"Test Environment"}
[04-01 03:43:00] [+2ms] [199ms] [Studio] Constant configuration (from cache): UsePublicAlertLink => {"_DFF_CACHE_EXPIRE_TIME":1711914215,"value":false}
[04-01 03:43:00] [+1ms] [200ms] [Studio] Constant configuration (from cache): consoleBaseURL => {"_DFF_CACHE_EXPIRE_TIME":1711914215,"value":"http://testing-ft2x.dataflux.cn"}

# Render event titles/content based on user-configured alert templates and event data
[04-01 03:43:00] [+1ms] [202ms] [Text Renderer] Rendering template: Content: xxxxx-Monitor (Single) {df_dimension_tags} 1: {{ (Result * 100) | to_int }} 2: {{ Result | to_int * 100 }}
[04-01 03:43:00] [+1ms] [204ms] [Text Renderer] --> Rendering successful. Output: Content: xxxxx-Monitor (Single) {"tag":"fake-data-1"} 1: 5437 2: 5400
[04-01 03:43:00] [+3ms] [207ms] [Text Renderer] Rendering template: Title: xxxxx-Monitor (M, Single) {df_dimension_tags}
[04-01 03:43:00] [+1ms] [208ms] [Text Renderer] --> Rendering successful. Output: Title: xxxxx-Monitor (M, Single) {"tag":"fake-data-1"}

# Iterate through events/mute rules to determine if each event should be muted
[04-01 03:43:00] [+0ms] [209ms] [Event Alarm] [Event 1/1]

# Iterate through alert policies to determine which alert policy/rule the event matches
[04-01 03:43:00] [+0ms] [209ms] [Event Alarm] [Alert Policy 1/1] xxxxx-Alert Policy1 (altpl_xxxxx)
[04-01 03:43:00] [+0ms] [209ms] [Event Alarm] --------------------- Send Event Alert ---------------------
[04-01 03:43:00] [+0ms] [209ms] [Event Alarm] [Alert Rule 1/2] Looping by Crontab 00 09 * * *, each loop lasts 39600 seconds
[04-01 03:43:00] [+0ms] [209ms] [Event Alarm] --> Repeat intervals configured but not within repeat interval range
[04-01 03:43:00] [+0ms] [209ms] [Event Alarm] --> Does not meet repeat interval alert, skip
[04-01 03:43:00] [+0ms] [210ms] [Event Alarm] [Alert Rule 2/2] Remaining other intervals
[04-01 03:43:00] [+0ms] [210ms] [Event Alarm] Successfully matched alert rule, need to alert

# Read event status duration
[04-01 03:43:00] [+0ms] [210ms] [Event Alarm] -------------------- Event Status Duration --------------------
[04-01 03:43:00] [+1ms] [211ms] [Event Alarm] --> Current event status is critical, clear non-critical status durations
[04-01 03:43:00] [+1ms] [212ms] [Event Alarm] --> Start time of current event status critical has been recorded, start time is 2024-03-26 20:01:00

# Generate general alert notifications
# Iterate through notification targets to check if they are in a mute period
# (All alert notification targets under the same alert policy/rule will align their mute periods)
[04-01 03:43:00] [+1ms] [213ms] [Event Alarm] --------------------- General Alert Notifications ---------------------
[04-01 03:43:00] [+0ms] [213ms] [Event Alarm] [Alert Notification Target 1/4] dingTalkRobot/xxxxx-Alert Policy1-Rule2 (critical)
[04-01 03:43:00] [+0ms] [214ms] [Event Alarm] Matching event status: critical <=> critical
[04-01 03:43:00] [+1ms] [215ms] [Event Alarm] --> Last alert at 2024-04-01 03:36:00, mute for 900 seconds. Mute period ends at 2024-04-01 03:51:00 (540 seconds later)
[04-01 03:43:00] [+0ms] [215ms] [Event Alarm] ----> Currently in mute period, skip
[04-01 03:43:00] [+0ms] [215ms] [Event Alarm] [Alert Notification Target 2/4] wechatRobot/xxxxx-WeChat (critical)
[04-01 03:43:00] [+0ms] [215ms] [Event Alarm] Matching event status: critical <=> critical
[04-01 03:43:00] [+1ms] [216ms] [Event Alarm] --> Last alert at 2024-04-01 03:36:00, mute for 900 seconds. Mute period ends at 2024-04-01 03:51:00 (540 seconds later)
[04-01 03:43:00] [+0ms] [216ms] [Event Alarm] ----> Currently in mute period, skip
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] [Alert Notification Target 3/4] dingTalkRobot/xxxxx-Alert Policy1-Rule2 (warning)
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] Matching event status: critical <=> warning
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] --> Not met, skip
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] [Alert Notification Target 4/4] wechatRobot/xxxxx-WeChat (warning)
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] Matching event status: critical <=> warning
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] --> Not met, skip

# Generate escalation alert notifications
# Iterate through escalation notification targets to check if the escalation time limit has been reached
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] --------------------- Escalation Alert Notifications ---------------------
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] [Escalation Alert Notification Target 1/3] dingTalkRobot/xxxxx-Alert Policy1-critical-3 Minutes Upgrade (180/critical)
[04-01 03:43:00] [+0ms] [217ms] [Event Alarm] Matching event status: critical <=> critical
[04-01 03:43:00] [+1ms] [218ms] [Event Alarm] --> Escalation alert already sent at 2024-03-26 20:04:00, no need for alert escalation
[04-01 03:43:00] [+0ms] [218ms] [Event Alarm] [Escalation Alert Notification Target 2/3] dingTalkRobot/xxxxx-Alert Policy1-critical-10 Minutes Upgrade (600/critical)
[04-01 03:43:00] [+0ms] [218ms] [Event Alarm] Matching event status: critical <=> critical
[04-01 03:43:00] [+1ms] [220ms] [Event Alarm] --> Escalation alert already sent at 2024-03-26 20:11:00, no need for alert escalation
[04-01 03:43:00] [+0ms] [220ms] [Event Alarm] [Escalation Alert Notification Target 3/3] dingTalkRobot/xxxxx-Alert Policy1-critical-10 Minutes Upgrade-2 (600/critical)
[04-01 03:43:00] [+0ms] [220ms] [Event Alarm] Matching event status: critical <=> critical
[04-01 03:43:00] [+1ms] [221ms] [Event Alarm] --> Escalation alert already sent at 2024-03-26 20:11:00, no need for alert escalation

# Cache events generated for use by the next monitor task
[04-01 03:43:00] [+2ms] [223ms] [Internal DataWay] Cache events
[04-01 03:43:00] [+0ms] [223ms] [Internal DataWay] Cache fault information
[04-01 03:43:00] [+0ms] [224ms] [Internal DataWay] --> Create cache: key=rul_xxxxx-check, field={"tag":"fake-data-1"}

# Write events into Guance, TrueWatch
[04-01 03:43:00] [+2ms] [226ms] [Internal DataWay] Write events
[04-01 03:43:00] [+0ms] [226ms] [Internal DataWay] --> [Event 1/1] Title: xxxxx-Monitor (M, Single) {"tag":"fake-data-1"} (event-xxxxx)
[04-01 03:43:00] [+1ms] [228ms] [Internal DataWay] Line protocol write data -> POST /v1/write/keyevent, workspace Token: tkn_xxxxx
[04-01 03:43:00] [+0ms] [228ms] [Internal DataWay] --> First/1 data example: {"fields":{"df_alert_policy_ids":["altpl_xxxxx"],"df_alert_policy_names":["xxxxx-Alert Policy1"],"df_at_accounts":"[]","df_at_accounts_nodata":"[]","df_channels":"[\"chan_xxxxx\"]","df_check_range_end":1711914120,"df_check_range_start":1711913220,"df_date_range":900,"df_dimension_tags":"{\"tag\":\"fake-data-1\"}","df_event_reason":"Meets the conditions for recognizing faults in the monitor, generating a fault event","df_fault_duration":459840,"df_fault_start_time":1711454280,"df_issue_duration":459840,"df_issue_start_time":1711454280,"df_matched_alert_policy_rules":["xxxxx-Alert Policy1 / -"],"df_message":"Content: xxxxx-Monitor (Single) {\"tag\":\"fake-data-1\"} \n1: 5437 \n2: 5400","df_meta":"Omitted, see corresponding event data for detailed content","df_monitor_checker_name":"Title: xxxxx-Monitor (M, Single) {df_dimension_tags}","df_monitor_checker_value":"54.370370370370374","df_monitor_name":"xxxxx-Alert Policy1","df_title":"Title: xxxxx-Monitor (M, Single) {\"tag\":\"fake-data-1\"}","df_workspace_declaration":"{\"b\":[\"asfawfgajfasfafgafwba\",\"asfgahjfaf\"],\"business\":\"aaa\",\"organization\":\"64fe7b4062f74d0007b46676\"}"},"measurement":"keyevent","tags":{"df_crontab_exec_mode":"crontab","df_event_id":"event-xxxxx","df_fault_id":"event-xxxxx","df_fault_status":"fault","df_label":"[\"xxxxx_test\"]","df_language":"en","df_monitor_checker":"custom_metric","df_monitor_checker_event_ref":"xxxxx","df_monitor_checker_id":"rul_xxxxx","df_monitor_checker_ref":"xxxxx","df_monitor_checker_sub":"check","df_monitor_checker_type":"monitor","df_monitor_id":"altpl_xxxxx","df_monitor_type":"custom","df_site_name":"Test Environment","df_source":"monitor","df_status":"critical","df_sub_status":"critical","df_workspace_name":"[Doris] Development Testing Together_","df_workspace_uuid":"wksp_xxxxx","tag":"fake-data-1"},"timestamp":1711914120}
[04-01 03:43:00] [+14ms] [243ms] [Internal DataWay] --> Response: [200 OK] ""
[04-01 03:43:00] [+0ms] [243ms] [Studio] Buffer events that need to notify Studio
[04-01 03:43:00] [+6ms] [249ms] [Studio] --> Event: Title: xxxxx-Monitor (M, Single) {"tag":"fake-data-1"} (wksp_xxxxx/event-xxxxx)

# Notify Guance, TrueWatch Studio with generated events as per user configuration for tracking
# (The monitor only notifies here; specific incident handling is implemented by Guance, TrueWatch Studio)
[04-01 03:43:00] [+0ms] [249ms] [Studio] Buffer events that need to notify Studio
[04-01 03:43:00] [+3ms] [252ms] [Studio] --> Event: Title: Yiling-Zhou-Monitor (M, Single) {"tag":"fake-data-1"} (wksp_xxxxx/event-xxxxx)

# Number of events generated during this detection
[04-01 03:43:00] [+1ms] [253ms] This detection generates 1 monitor event
```