Server Application Error Inspection
Background
When a server application hits runtime errors, they need to be detected early and surfaced as timely warnings so that developers and operations staff can troubleshoot and quickly confirm whether the errors affect the application. The server application error inspection event reports new errors that occurred in the last hour, pinpoints where each error occurred, and provides associated diagnostic clues.
Prerequisites
- Applications already integrated with TrueWatch "Application Performance Monitoring"
- Offline deployment of self-hosted DataFlux Func
- Enable the self-hosted DataFlux Func Script Market
- In TrueWatch "Management / API Key Management", create an API Key for performing operations
- In the self-hosted DataFlux Func, install the "Self-built Inspection Core Package", "Algorithm Library", and "Self-built Inspection (APM Errors)" via the "Script Market"
- Write a custom inspection processing function in the self-hosted DataFlux Func
- In the self-hosted DataFlux Func, create scheduled tasks (Old version: Automatic Trigger Configuration) for the written functions through "Management / Scheduled Tasks (Old version: Automatic Trigger Configuration)"
If you plan to deploy DataFlux Func offline on a cloud server, make sure it uses the same cloud provider and region as your current TrueWatch SaaS deployment.
Configuring Inspection
Create a new script set in the self-hosted DataFlux Func to enable the server application error inspection configuration.
Enabling Inspection
After configuring the inspection in DataFlux Func, you can test it by selecting the run() function on the page and running it directly. After publishing, you can view and configure the task under "Management / Scheduled Tasks" in DataFlux Func.
Viewing Events
This inspection scans for new application errors in the past hour. Once a new type of error occurs, smart inspection generates corresponding events, which can then be viewed in the "Event Center".
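The notion of a "new type of error in the past hour" can be pictured as a set difference between the error types seen inside the window and those seen before it. The sketch below is only an illustration of that idea, not the inspection package's implementation; the ErrorRecord shape, its field names, and the find_new_error_types helper are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ErrorRecord:
    # Hypothetical record shape; real APM error data carries more fields.
    service: str
    error_type: str
    timestamp: datetime


def find_new_error_types(records, now, window=timedelta(hours=1)):
    """Return {(service, error_type): count} for types seen only inside the window."""
    window_start = now - window
    known = set()       # error types already seen before the window
    recent = Counter()  # occurrences inside the window
    for record in records:
        key = (record.service, record.error_type)
        if record.timestamp < window_start:
            known.add(key)
        elif record.timestamp <= now:
            recent[key] += 1
    return {key: count for key, count in recent.items() if key not in known}


# Sample data (made up for illustration).
now = datetime(2024, 1, 1, 12, 0)
records = [
    ErrorRecord("checkout", "TimeoutError", now - timedelta(hours=5)),
    ErrorRecord("checkout", "TimeoutError", now - timedelta(minutes=10)),
    ErrorRecord("checkout", "KeyError", now - timedelta(minutes=30)),
    ErrorRecord("checkout", "KeyError", now - timedelta(minutes=5)),
]
new_errors = find_new_error_types(records, now)
print(new_errors)
```

In this sample, TimeoutError was already seen five hours earlier, so only KeyError counts as new and would be the kind of error that leads to an event.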
Event Details
- Event Overview: Describes the object and content of the anomaly inspection event.
- Error Distribution: View changes in the number of errors that occurred in the last hour for the current anomalous application.
- Error Details: Displays detailed information about new errors and specific error counts for the anomalous application. You can click on specific error messages, error types, and error stacks to navigate to the error detail page.
Common Issues
1. How to configure the detection frequency for server application error inspections
- In the self-hosted DataFlux Func, when writing the custom inspection processing function, add fixed_crontab='0 * * * *', timeout=1800 to the decorator, then configure it in "Management / Scheduled Tasks (Old version: Automatic Trigger Configuration)".
2. Why might there be no anomaly analysis when the server application error inspection triggers
If there is no anomaly analysis in the inspection report, check the data collection status of the current DataKit.
3. Under what circumstances will server application error inspection events be generated
The server application error inspection scans for new application errors in the last hour. Once a new type of error occurs, the smart inspection generates a corresponding event.
4. What to do if a previously working script starts producing errors during inspection
Update the referenced script set in the DataFlux Func Script Market. The Change Log lists the Script Market's update records, so you can update scripts promptly.
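For the schedule in question 1, the cron expression 0 * * * * fires at minute 0 of every hour, and timeout=1800 gives each run 1800 seconds, which is half of that interval. The following is a minimal sketch of this arithmetic in plain Python, independent of DataFlux Func; the next_hourly_runs helper and the sample start time are illustrative assumptions.

```python
from datetime import datetime, timedelta


def next_hourly_runs(after, count=3):
    """Next `count` trigger times of '0 * * * *' strictly after `after`."""
    # Round down to the top of the hour, then step to the following hour.
    first = after.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return [first + timedelta(hours=i) for i in range(count)]


TIMEOUT_SECONDS = 1800  # value taken from timeout=1800

runs = next_hourly_runs(datetime(2024, 1, 1, 9, 17, 42))
interval_seconds = (runs[1] - runs[0]).total_seconds()
fits_in_interval = TIMEOUT_SECONDS < interval_seconds

for run_time in runs:
    print(run_time.isoformat())
print("timeout fits within one interval:", fits_in_interval)
```

Keeping the timeout shorter than the schedule interval means one run is stopped before the next is triggered, so consecutive inspections do not overlap.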