Logging guidelines

These are guidelines related to the development of Confluence. The guidelines mainly apply to Atlassian employees, but reading them should provide insight for third-party plugin developers as well, so we decided to make them public.

The Purpose of Logging

Logging is an important function of any software and for Confluence, benefits the following groups of people in the manners described:

The Confluence server administrator -- provides detailed information that lets them know if their Confluence server is running correctly
Atlassian support -- provides details and evidence of possible problems to help resolve customer issues
Developers -- allows them to trace code execution without attaching a debugger

When you write any code in Confluence, you should ask yourself what you would want to see, from the point of view of the above three people. If you were running a server, what would you want to be notified of? If you were handling a support case regarding some problem with the system, what information would you need? If you were tracing a bug in the code, what would you want to see logged?

Loggers

Confluence uses SLF4J as a logging client.

1
2
private static final Logger log = LoggerFactory.getLogger(TheCurrentClass.class);

The current practice in Confluence is to create a static final logger called log (lower-case) in each class that needs to perform logging. This may need to be addressed in the future. You will find some old code still uses Category.getInstance() or Logger.getLogger() instead of LoggerFactory.getLogger(). Log4j is being phased out, and should be phased out as you find it.

There is a special log called startupLog in ConfluenceConfigurationListener, which defaults to logging at INFO level, and is used to print out informational messages on system startup. Use this definition for startup messages:

1
2
private final static Logger startupLog = LoggerFactory.getLogger("com.atlassian.confluence.lifecycle");

Log Levels

DEBUG Fine-grained information about what is going on within the system.
INFO Announcements about the normal operation of the system - scheduled jobs running, services starting and stopping, significant user-triggered processes
WARN Any condition that, while not an error in itself, may indicate that the system is running sub-optimally
ERROR A condition that indicates something has gone wrong with the system

Default Log Level

The standard Confluence log (level WARN) is a way for Confluence to communicate with the server administrator. Anything we log to this file will worry the administrator in some way. Logging at WARN level and higher should be reserved for situations that require some kind of attention from the server administrator, and for which corrective action is possible.

We should assume that any time Confluence logs a WARN level message or higher, the customer will create a support case, and will expect us to provide a solution that prevents that message from happening again.

Parameterised Logging and Performance

SLF4J provides parameterised logging capabilities. Parameterised logging allows you to specify parameters in your log statement which will be evaluated only if the log is actually processed. Parameters are indicated in the logging statement with the string {}:

1
2
log.debug("Processed Page {} in Space {}", page, space);

This allows a more readable syntax than concatenating strings. It also yields better performance as the string will only be concatenated if required (which in the example above, will occur only if DEBUG logging is enabled). This means that you do not need to wrap logging statements in an ifDebugEnabled() statement, unless of course you need to do some work to create the parameters you are logging. You should still avoid logging inside a tight loop and if you have to make sure it is at the debug level.

Context

You can rarely put too much context in a log message - avoid only just indicating that an operation failed. Indicate what operation you were attempting when it failed, what object you were trying to act on, and why. Remember to log the exception, so a stack-trace can be logged.

We are not using parameterized logging here, as this is not possible when explicitly logging an exception

Bad:

1
2
log.error("Unable to save page: " + exception.getMessage());

Better:

1
2
log.error("Unable to save " + page + ": " + exception.getMessage(), exception);

Even better (assuming this information is available and relevant in the context. I am making this one up):

1
2
log.error("Unable to save " + page + " in updateMode=" + 
          isAutomaticSaveBoolean + " upon moving it from space " + 
          oldSpace + " to space "+newSpace+": " + exception.getMessage(), exception);

The Mapped Diagnostic Context

The slf4j MDC allows you to add contextual information to a ThreadLocal, which is then included with subsequent log messages. This is a useful tool when you need to add the same contextual information to a lot of log messages.

For example, we currently use the MDC in the LoggingContextFilter to add the request URL and current user to the logging context for every web request.

Example usage:

1
2
while (objectsToIndex.hasNext())
{
    Searchable obj = (Searchable) objectsToIndex.next();

    try
    {
        MDC.put("Indexing", obj.toString());
        index(obj);
    }
    finally
    {
        MDC.remove("Indexing");
    }

Logging Risks

Logging has the potential to introduce Heisenbugs into your code, because the act of turning logging on to look for an error changes the code-paths around where the error is occurring. For example, logging often calls toString() on the objects it wants to log information about. In turn, the toString() method of an object may have unexpected side-effects, for example causing some lazy-loaded Hibernate property to be loaded in from the database.

Also, make sure to check that an object can't be null when calling a method on it. Especially when reporting errors, do go the extra mile to query for null, e.g. like this:

1
2
String previousPageOutdatedId="[previous page variable is null]";
if (previousPage!=null) {
    previousPageOutdatedId=previousPage.getOutdatedId();
}
log.error("Unable to revert to previous page: {}", previousPageOutdatedId);

instead of

1
2
log.error("Unable to revert to previous page {}", previousPage.getOutdatedId());

Logging progress

It is absolutely vital that some kind of logging is done during a long running task. At least at the start and at the end of it. Usually, a loop will call a single operation very often. Make sure that - depending on how long a single call takes - you log each 10th, 100th, 1000th operation. If possible add the complete number of operations that will have to be performed, e.g. in the form " executing delete user (2000/5301)"