Expand logging filter API to allow returning a LogRecord

adriangb · May 8, 2022, 8:35pm

Currently, logging filters are the only way to hook into the logging system to enrich or otherwise modify log records, but they are limited: you have to modify a log record in place, which propagates the change everywhere, even if you only want to apply it to a specific handler or logger.

The current API for filters is (record: LogRecord) -> bool (well, really any truthy/falsy return value).
I would like to propose that we change the API to be (record: LogRecord) -> bool | LogRecord.

If a filter returns a log record, it is indicating that the log record passed in as an argument should be replaced with the returned one (which may be the same instance or a completely new instance) and that logging should continue.
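
As a rough sketch of these semantics (not the actual patch; `apply_filters` is a hypothetical standalone helper, not part of the logging module), a filter chain could treat the three kinds of return value like this:

```python
import logging


def apply_filters(filters, record):
    """Hypothetical helper mimicking the proposed filter-chain semantics.

    Returns False if any filter rejects the record; otherwise returns the
    record that should continue through logging (possibly a replacement).
    """
    for f in filters:
        # A filter may be an object with a .filter() method or a bare callable.
        result = f.filter(record) if hasattr(f, "filter") else f(record)
        if not result:
            return False  # falsy result: drop the record
        if isinstance(result, logging.LogRecord):
            record = result  # later filters see the replacement
        # Any other truthy value keeps the current record unchanged.
    return record
```

Existing filters that return True or False behave exactly as before under this scheme; only a returned LogRecord changes what flows onward.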

An example filter might look like:

import logging

def replace_message(record: logging.LogRecord) -> logging.LogRecord:
    # Build a brand-new record rather than mutating the one passed in.
    return logging.LogRecord(
        name=record.name,
        level=record.levelno,
        pathname=record.pathname,
        lineno=record.lineno,
        msg="new message!",
        exc_info=record.exc_info,
        args=(),
    )

You could then apply this filter to only one of two handlers, which would result in that handler always logging "new message!" while the other is unaffected by the change.

I’ve already tried implementing this, and it takes only about 8 changed lines of code in logging.py.
The following is my test for the feature (which is currently impossible to express as far as I can tell):

import logging
import io

parent = logging.getLogger("parent")
parent.setLevel(logging.INFO)
child = logging.getLogger("parent.child")
stream_1 = io.StringIO()
stream_2 = io.StringIO()
handler_1 = logging.StreamHandler(stream_1)
handler_1.setLevel(logging.INFO)
handler_2 = logging.StreamHandler(stream_2)
handler_2.setLevel(logging.INFO)
handler_2.addFilter(replace_message)
parent.addHandler(handler_1)
child.addHandler(handler_2)

child.info("original message")
handler_1.flush()
handler_2.flush()

assert stream_1.getvalue() == "original message\n"
assert stream_2.getvalue() == "new message!\n"

This has the most utility in the context of structured logs; replacing the message is just easier to demonstrate.
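
For the structured-logging case, a filter along these lines might return an enriched copy instead of a rewritten message. This is only a sketch: the function name, the `request_id` field, and its placeholder value are assumptions made for illustration.

```python
import copy
import logging


def add_request_context(record: logging.LogRecord) -> logging.LogRecord:
    """Hypothetical enrichment filter: returns an enriched copy of the record."""
    enriched = copy.copy(record)  # shallow copy; the original record is not modified
    enriched.request_id = "req-123"  # assumed field name and placeholder value
    return enriched
```

Attached with handler.addFilter(add_request_context), only that handler would see request_id once the proposed behaviour is available. Without it, the returned record is treated as a plain truthy value and the enrichment is lost.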

No existing tests broke with my implementation, so I think this should be backwards compatible.

The main alternatives I can think of:

Put this sort of modification in the Handler or Logger itself. This would require subclassing, which is a lot less elegant and composable than filters, which are already arranged in a “pipeline” of sorts and can be a bare function.
Create a new thing (not a filter) which has only this API. The main issue with this is that it would expand the number of concepts and methods in the already complex logging module (I think modifying the filter API is less cognitive overhead and complexity, but that may just be my opinion).
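
To make the first alternative concrete, here is a minimal sketch of the subclassing approach. `ReplacingStreamHandler` and the logger name are hypothetical, chosen only for this example.

```python
import io
import logging


class ReplacingStreamHandler(logging.StreamHandler):
    """Hypothetical handler that rewrites the message for itself only."""

    def emit(self, record):
        # Build a new record from the original's attributes instead of mutating it.
        replaced = logging.makeLogRecord(
            {**record.__dict__, "msg": "new message!", "args": ()}
        )
        super().emit(replaced)


logger = logging.getLogger("alternatives.demo")  # arbitrary example name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

plain_stream, replaced_stream = io.StringIO(), io.StringIO()
logger.addHandler(logging.StreamHandler(plain_stream))
logger.addHandler(ReplacingStreamHandler(replaced_stream))

logger.info("original message")
```

This works, but the rewriting logic is tied to one handler class and cannot be reused as a bare function or chained with other filters, which is the composability cost described above.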

gpshead · May 9, 2022, 10:18pm

This looks like a pretty elegant evolution of the API. Even if existing code happened to already return the record as its truthy value, that code would continue to work. So I don’t think there is even a potential compatibility issue.

I suggest filing a CPython issue and proposing a PR including unittest and documentation updates.

adriangb · May 10, 2022, 3:38am

Thank you for the feedback! I created an issue and PR: "Expand logging filter API to allow returning a LogRecord" (python/cpython issue #92592).

vsajip · June 19, 2022, 1:59pm

This PR has now been merged, and the issue closed. Thank you, Adrian, for your contribution.

pf_moore · June 19, 2022, 2:22pm

Awesome! It’s little quality-of-life improvements like this that aren’t particularly publicised but are always nice to find in a new release.