
Disk Usage Inspection

Background

"Disk Usage Inspection" is based on the disk anomaly analyzer, which regularly performs intelligent inspections of host disks. It conducts root cause analysis by identifying hosts with disk anomalies to determine the disk mount points and disk information corresponding to the anomaly time points, analyzing whether there are any disk usage issues with the current workspace hosts.

Prerequisites

  1. Offline deployment of self-built DataFlux Func
  2. Enable the Script Market in your self-built DataFlux Func (see ../script-market-basic-usage/)
  3. Create an API Key in TrueWatch under "Manage / API Key Management"
  4. Install the "Self-built Inspection Core Package", "Algorithm Library", and "Self-built Inspection (Disk Usage)" through the "Script Market" in your self-built DataFlux Func
  5. Write a self-built inspection processing function in your self-built DataFlux Func
  6. Create a scheduled task for the written function through "Manage / Scheduled Tasks" (old version: "Automatic Trigger Configuration") in your self-built DataFlux Func

If you plan to deploy DataFlux Func offline on a cloud server, make sure it is deployed with the same provider and in the same region as the TrueWatch SaaS you are currently using.

Configure Inspection

Create a new script set in your self-built DataFlux Func and configure the disk usage inspection as follows:

Python
from guance_monitor__register import self_hosted_monitor
from guance_monitor__runner import Runner
import guance_monitor_disk_usage__main as disk_usage_check

# Account Configuration
API_KEY_ID  = 'wsak_xxxxx'
API_KEY     = 'wsak_xxxxx'

# The function's filters parameter acts as a filter and takes precedence over the
# detection settings in Studio "Monitoring / Intelligent Inspection". If filters is
# configured here, there is no need to change those settings; if both are configured,
# the script's filters parameter wins.

def filter_host(host):
    '''
    Filter hosts: define the condition a host must meet to be inspected.
    Return True for hosts that match, False for hosts that do not.
    '''
    if host == "iZuf609uyxtf9dvivdpmi6Z":
        return True
    return False

'''
Task scheduling is configured through the decorator:
@DFF.API('Disk Usage Inspection', fixed_crontab='0 */6 * * *', timeout=900)

fixed_crontab: fixed execution schedule (here: every 6 hours)
timeout: task execution timeout, kept within 15 minutes (900 seconds)
'''

@self_hosted_monitor(API_KEY_ID, API_KEY)
@DFF.API('Disk Usage Inspection', fixed_crontab='0 */6 * * *', timeout=900)
def run(configs={}):
    '''
    Parameters:
    configs : hosts to be inspected (optional; if not configured, all hosts in the current workspace are inspected)

    Example:
        configs = {
            "hosts": ["localhost"]
        }
    '''
    checkers = [
        disk_usage_check.DiskUsageCheck(configs=configs, filters=[filter_host]), # This is just an example
    ]

    Runner(checkers, debug=False).run()
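
The filters list accepts ordinary Python functions, so a filter can also match several hosts at once. A minimal sketch, assuming a hypothetical allow list of host names:

Python
# Hypothetical allow list; replace with the host names you want to inspect
ALLOWED_HOSTS = {"web-01", "web-02"}

def filter_hosts_by_allow_list(host):
    # Return True only for hosts on the allow list; all others are skipped
    return host in ALLOWED_HOSTS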

Start Inspection

After configuring the inspection in DataFlux Func, you can test it by selecting the run() method directly on the page and executing it. After publishing, you can view and manage the task in DataFlux Func under "Manage / Scheduled Tasks".
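
When testing run() from the page, you can pass a configs argument to restrict the test to specific hosts; for example (the host name is a placeholder):

Python
# Example argument for a test run: inspect only the listed hosts
configs = {
    "hosts": ["localhost"]
}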

View Events

This inspection scans the disk usage data of the last 14 days. If it predicts that usage will exceed the warning threshold within the next 48 hours, the intelligent inspection generates a corresponding event. These events can be viewed in the "Event Center".

Event Details

  • Event Overview: Describes the object and content of the abnormal inspection event.
  • Anomaly Details: You can view the past 14 days' usage rate of the current abnormal disk.
  • Anomaly Analysis: Displays information about the abnormal host, disk, and mount point to help analyze specific problems.

Common Issues

1. How to configure the detection frequency of the disk usage inspection

  • In your self-built DataFlux Func, when writing the self-built inspection processing function, add fixed_crontab='0 */6 * * *', timeout=900 to the decorator (adjust the cron expression as needed; see the sketch below), then configure it in "Manage / Scheduled Tasks" (old version: "Automatic Trigger Configuration").
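
A minimal sketch of changing the frequency, here to every 12 hours (the decorator arguments otherwise follow the example above):

Python
@self_hosted_monitor(API_KEY_ID, API_KEY)
@DFF.API('Disk Usage Inspection', fixed_crontab='0 */12 * * *', timeout=900)
def run(configs={}):
    ...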

2. No anomaly analysis appears when the disk usage inspection is triggered

If the inspection report contains no anomaly analysis, check the data collection status of the current DataKit.

3. Scripts that were previously running normally show errors during the inspection process

Update the referenced script sets in the DataFlux Func Script Market. You can check the Script Market's Change Log for update records so that your scripts stay up to date.