Logging levels
I'm using logback in my current project.
It offers six levels of logging: TRACE DEBUG INFO WARN ERROR OFF
I'm looking for a rule of thumb to determine the log level for common activities. For instance, if a thread is locked, should the log message be set to the debug level or the info level. Or if a socket is being used, should its specific id be logged at the debug level or the trace level.
I will appreciate answers with more examples for each logging level.
I mostly build large scale, high availability type systems, so my answer is biased towards looking at it from a production support standpoint; that said, we assign roughly as follows:
error : the system is in distress, customers are probably being affected (or will soon be) and the fix probably requires human intervention. The "2AM rule" applies here- if you're on call, do you want to be woken up at 2AM if this condition happens? If yes, then log it as "error".
warn : an unexpected technical or business event happened, customers may be affected, but probably no immediate human intervention is required. On call people won't be called immediately, but support personnel will want to review these issues asap to understand what the impact is. Basically any issue that needs to be tracked but may not require immediate intervention.
info : things we want to see at high volume in case we need to forensically analyze an issue. System lifecycle events (system start, stop) go here. "Session" lifecycle events (login, logout, etc.) go here. Significant boundary events should be considered as well (eg database calls, remote API calls). Typical business exceptions can go here (eg login failed due to bad credentials). Any other event you think you'll need to see in production at high volume goes here.
debug : just about everything that doesn't make the "info" cut... any message that is helpful in tracking the flow through the system and isolating issues, especially during the development and QA phases. We use "debug" level logs for entry/exit of most non-trivial methods and marking interesting events and decision points inside methods.
trace : we don't use this often, but this would be for extremely detailed and potentially high volume logs that you don't typically want enabled even during normal development. Examples include dumping a full object hierarchy, logging some state during every iteration of a large loop, etc.
As or more important than choosing the right log levels is ensuring that the logs are meaningful and have the needed context. For example, you'll almost always want to include the thread ID in the logs so you can follow a single thread if needed. You may also want to employ a mechanism to associate business info (eg user ID) to the thread so it gets logged as well. In your log message, you'll want to include enough info to ensure the message can be actionable. A log like " FileNotFound exception caught" is not very helpful. A better message is "FileNotFound exception caught while attempting to open config file: /usr/local/app/somefile.txt. userId=12344."
There are also a number of good logging guides out there... for example, here's an edited snippet from JCL (Jakarta Commons Logging):
My approach, i think coming more from an development than an operations point of view, is:
I answer this coming from a component-based architecture, where an organisation may be running many components that may rely on each other. During a propagating failure, logging levels should help to identify both which components are affected and which are a root cause.
ERROR - This component has had a failure and the cause is believed to be internal (any internal, unhandled exception, failure of encapsulated dependency... eg database, REST example would be it has received a 4xx error from a dependency). Get me (maintainer of this component) out of bed.
WARN - This component has had a failure believed to be caused by a dependent component (REST example would be a 5xx status from a dependency). Get the maintainers of THAT component out of bed.
INFO - Anything else that we want to get to an operator. If you decide to log happy paths then I recommend limiting to 1 log message per significant operation (eg per incoming http request).
For all log messages be sure to log useful context (and prioritise on making messages human readable/useful rather than having reams of "error codes")
A nice way to visualise the above logging levels is to imagine a set of monitoring screens for each component. When all running well they are green, if a component logs a WARNING then it will go orange (amber) if anything logs an ERROR then it will go red.
In the event of an incident you should have one (root cause) component go red and all the affected components should go orange/amber.
链接地址: http://www.djcxy.com/p/36826.html上一篇: 覆盖logback配置
下一篇: 记录级别