Logging attributes standardization

Saphyel · July 8, 2023, 6:24pm

I use the logging module a lot, and it’s really annoying to have to remember all the attributes and the right casing for each one.

Some of them are camelCased like funcName, others are a single lowercase word like levelname, and others are snake_cased like stack_info.

I don’t really mind which case wins, but I’d prefer it to be consistent.

barry-scott · July 8, 2023, 9:01pm

Can you provide specific examples please?
Are the names that mix naming styles all on the same type of object?

jamestwebber · July 8, 2023, 10:27pm

funcName, levelname and stack_info are all real attributes of the LogRecord class. The module uses camelCase for a lot of stuff that Python programmers usually write in snake case (i.e. lots of methods and functions).
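For anyone who wants to check this, here is a small sketch that builds a LogRecord by hand and confirms the three naming styles sit side by side on one object. The grouping into styles and the sample values ("demo", "my_function") are just for illustration.

```python
import logging

# Build a LogRecord directly; normally the logging machinery does this for you.
record = logging.LogRecord(
    name="demo",
    level=logging.INFO,
    pathname="demo.py",
    lineno=1,
    msg="hello",
    args=(),
    exc_info=None,
    func="my_function",
)

# Attribute names grouped by the casing style they happen to use.
styles = {
    "camelCase": ["funcName", "threadName", "processName"],
    "oneword": ["levelname", "pathname", "lineno"],
    "snake_case": ["stack_info", "exc_info", "exc_text"],
}

# Collect any name from the table that is NOT on the record (expected: none).
missing = [
    attr
    for names in styles.values()
    for attr in names
    if not hasattr(record, attr)
]
```

The same names are what you have to spell correctly in a format string such as "%(levelname)s %(funcName)s".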

So I have to agree with the OP: it’s a bit surprising/annoying that this module doesn’t follow the style guide that I expect for a module in the stdlib. But it’s about as old as PEP 8 itself, so it isn’t too surprising.

It seems very hard to fix, though. This module is 20 years old and used by an enormous amount of code. Just adding aliases in the correct style would probably be confusing.

Saphyel · July 8, 2023, 11:24pm

So you agree, but you also think it’s best to keep the status quo (so you don’t agree with changing it).

I think adding aliases and keeping the legacy ones with a deprecation warning might be a good first step. They don’t have to stick around forever or be dropped immediately.
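A rough sketch of what that first step could look like for a single function. The snake_case name get_level_name is hypothetical (it does not exist in logging), and both functions here are local stand-ins that delegate to the real logging.getLevelName.

```python
import logging
import warnings


def get_level_name(level):
    """Hypothetical snake_case spelling; delegates to the real function."""
    return logging.getLevelName(level)


def getLevelName(level):
    """Local stand-in for the legacy spelling: warn, then delegate."""
    warnings.warn(
        "getLevelName() is deprecated; use get_level_name()",
        DeprecationWarning,
        stacklevel=2,
    )
    return get_level_name(level)


# Calling the legacy name still works, but a DeprecationWarning is recorded.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_result = getLevelName(logging.WARNING)

new_result = get_level_name(logging.WARNING)
```

Both calls return the same value; only the legacy spelling produces a warning.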

jamestwebber · July 9, 2023, 12:03am

I agree with you that it’d be nice if the names had been standardized. I’m just not sure the change is worth it. Not that I have any authority on such matters!

It just seems like changing this interface has a huge surface area (20 years of legacy code, and the code most likely to use logging is probably more complex than average), with a pretty small upside: aesthetically more satisfying, and slightly easier for new users to get into because the naming fits their expectations.

I doubt that it would ever be worth dropping the existing names. It’s a ton of churn for code that is working just fine at the moment. Raising DeprecationWarning would be a huge pain for that code too, so it’d probably be a “soft” deprecation [1]. So the options are “introduce new, duplicate names” or the status quo. Duplicate names seem confusing to me.

The most realistic way I could imagine this happening is if a whole new logging module came along, the way that argparse replaced optparse (which took many years). But logging would probably stick around even then due to legacy code. And obviously such a change would need more substantial improvement than just conforming to PEP 8.

[1] See this thread for a general discussion of how deprecation currently works in Python.

srittau · July 9, 2023, 12:17pm

I would also like to see alternatives for old camel case function and attribute names in the stdlib, especially the logging and unittest modules. While the old names could probably never be removed, I don’t see any issue with introducing snake case aliases (or delegating @properties in the case of attributes). This would make user code more consistent and pleasant to read.
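To make the “delegating @properties” idea concrete, here is a minimal sketch using a subclass. SnakeCaseLogRecord, func_name and level_name are made-up names for illustration; nothing like this exists in the stdlib today.

```python
import logging


class SnakeCaseLogRecord(logging.LogRecord):
    """Hypothetical subclass exposing snake_case aliases as properties."""

    @property
    def func_name(self):
        # Read-through alias for the existing camelCase attribute.
        return self.funcName

    @func_name.setter
    def func_name(self, value):
        # Write-through, so both spellings always agree.
        self.funcName = value

    @property
    def level_name(self):
        # Read-only alias for the existing one-word attribute.
        return self.levelname


record = SnakeCaseLogRecord(
    "demo", logging.ERROR, "demo.py", 10, "boom", (), None, func="handler"
)
before = (record.func_name, record.level_name)

# Assigning through the alias updates the original attribute as well.
record.func_name = "renamed"
after = record.funcName
```

One caveat, as far as I can tell: the formatter fills in format strings from the record’s instance dictionary, so property aliases like these would not by themselves make "%(func_name)s" work. That part would need separate handling.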

effigies · July 9, 2023, 12:26pm

This question came up with unittest recently, and there I think the status quo argument was very strong. unittest is mostly internal and 3rd party tools are not encouraged to use it, so there is no benefit to adding duplicates.

On the other hand, logging is ubiquitous in user code, and it is good for new users to adopt it. Easing that process seems like it should carry some weight. I would personally support duplicate names, along with documentation to make it clear that they are aliases and not different functionality.

IMO the biggest downside would be the drive-by PRs against every repository using the new names, and the severity of that downside is going to vary by maintainer.

jamestwebber · July 9, 2023, 3:34pm

I guess duplicate names aren’t so much confusing as they are bad UX for development. If I’m in an IDE and I type “logging.”, I’m going to get twice as many suggestions as I want, half of them redundant.

Perhaps this change should wait for the establishment of “soft deprecation” / “obsolete” / whatever in linting tools, with the idea that a smart IDE knows to hide the deprecated names (but still allow their use without a warning, to preserve legacy code).

The logging docs could be re-written to present the new names as the standard with a section on the legacy names explaining that they are “defunct” but will never be removed.

Saphyel · July 10, 2023, 6:27pm

If we want to push this forward, what would the next steps be?

Rosuav · July 10, 2023, 6:45pm

Convince a core dev that this is worth doing.

ntessore · July 10, 2023, 9:38pm

Is it really a drive-by PR to replace an outdated alias if the function behind the name hasn’t changed at all? Unlike PRs trying to “fix” deprecated-but-stable API with substantial changes, I would consider that helpful.

Rosuav · July 10, 2023, 10:19pm

Only once you’ve decided you’re okay with your code breaking on older Pythons.

ntessore · July 10, 2023, 10:38pm

I understand the problem being hinted at to be PRs which repeatedly try to fix a “wontfix” issue, e.g. replacing optparse with argparse when there’s a good reason for optparse to stay. For a trivial name change, why not ask for the change to be done in a backward-compatible manner the first time an unsolicited “drive-by PR” pops up?

Rosuav · July 10, 2023, 11:02pm

The backward compatible manner is “don’t rename it”. Which is the problem with these trivial name changes - at the start of the overlap period there is absolutely no reason to change any code, meaning that the end of the overlap period is just as much of a problem no matter when it happens.

There are three sane options:

1. Add aliases in version X, remove them in version Y. No matter how many years away version Y is, it’s going to cause unnecessary breakage by removing names that are entirely valid.
2. Add aliases but never remove them. This is what the threading module did - you can still use names like threading.currentThread() in modern Pythons, even though aliases threading.current_thread() etc were added in 2008, Python 2.6. The cost of having two names for these functions (with the corresponding confusion in documentation or other lookups) will effectively remain forever.
3. Don’t add aliases. PEP 8 reminds us that this is a perfectly reasonable approach: backward compatibility is way more important than matching style.

With the first option, those drive-by PRs are going to shift from “annoying guff that has to be rejected” to “essential fixes for an upcoming problem”, with each project seeing the shift at a different point (based on when they stop supporting Pythons prior to the introduction of aliases, and when they start supporting Pythons after the removal of the old names). With the second, those PRs never need to be accepted. And with the third, obviously there are no PRs and no problem.

There is no entirely good solution. Everything has its costs. I’m in favour of option 3, with option 2 being also viable.

ntessore · July 10, 2023, 11:12pm

I’m firmly in favour of option 2, and it is in this context that I think unsolicited PRs are not a real issue that should be counted against change. Sure, your extensions need to fall back to the old names for as long as you support these versions of Python, but over a short decade or two the ecosystem can and will transition to the new names.
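The fallback I have in mind is a one-liner. In this sketch, logging.get_logger is a hypothetical new spelling that does not exist at the time of writing, so the lookup falls through to the real logging.getLogger.

```python
import logging

# Prefer the (hypothetical) snake_case name if this Python provides it;
# otherwise fall back to the legacy camelCase name, which exists today.
get_logger = getattr(logging, "get_logger", logging.getLogger)

log = get_logger("demo.fallback")
```

Code written this way runs unchanged on versions with and without the alias, and the fallback can be dropped once the older versions are no longer supported.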

Rosuav · July 10, 2023, 11:27pm

That’s about the timescale for the threading module. And I did a quick search for threading.currentThread and came across this:

The text itself is a bit messed up but you can clearly see that this is a student’s code, and the repository includes some text copied and pasted from the instructions. And the instructions use the old name in the hint. That means, fourteen years after Python 2.6 came out (that repository is a year old), students are being taught using the old names.

When I say “remain forever”, I really truly mean that. Otherwise it’s option 1 with a long gap. The costs of maintaining aliases aren’t huge, but they aren’t zero either:

- More code in the stdlib
- More tests, making sure to test both aliases properly (you could assert that threading.currentThread is threading.current_thread but that fails once you add a deprecation warning to the old name)
- Blog posts, Stack Overflow answers, and forum recommendations, all divided among the different names
- Confusion when people correlate different sources and find different names
- Confusion when people explore the object and ask “what’s the difference between these?”. This is very real; I had to spend some time looking up the event.srcElement attribute on a JavaScript event object, only to find out that it’s a deprecated alias for event.target.
- More difficult code searches - “how extensively is this attribute used?” now requires that you search for both names
- Etcetera.
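The point about testing aliases can be shown with a toy example. The functions here (current_thread_name, currentThreadName, deprecated_alias) are invented for the demonstration and are not the real threading API.

```python
import functools
import warnings


def current_thread_name():
    """Stand-in for the 'new' snake_case function."""
    return "MainThread"


# A plain alias: both names refer to the very same function object.
currentThreadName = current_thread_name
plain_alias_is_identical = currentThreadName is current_thread_name


def deprecated_alias(new_func, old_name):
    """Wrap new_func so that calling it via old_name emits a warning."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name}() is deprecated; use {new_func.__name__}()",
            DeprecationWarning,
            stacklevel=2,
        )
        return new_func(*args, **kwargs)

    return wrapper


# Once the old name warns, it is a different object, so an identity
# check no longer passes and the test must compare behaviour instead.
currentThreadName = deprecated_alias(current_thread_name, "currentThreadName")
wrapped_alias_is_identical = currentThreadName is current_thread_name

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = currentThreadName()
```

So the cheap one-line identity test only works for as long as the old name is a bare alias.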

Each one might not seem like a huge cost, but for comparison, neither are the costs of the current situation:

- It’s harder to handwrite code without tab completion, as you have to remember (or look up) which name to use
- It’s harder to spot errors. If you write dict.fromKeys(...) then you should know for sure that it’s wrong (though - without looking it up - do you know whether it’s from_keys or fromkeys?). Having inconsistent names means that more naming conventions “look right” to your eye.

But at least there’s less confusion.
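For the dict example, a quick check settles which spelling is real:

```python
# Three plausible spellings; only one is an actual dict method.
candidates = ["fromkeys", "fromKeys", "from_keys"]
existing = [name for name in candidates if hasattr(dict, name)]

# The real one builds a dict from an iterable of keys and a shared value.
counts = dict.fromkeys(["a", "b"], 0)
```

Only the all-lowercase fromkeys survives the filter, which is exactly the kind of thing you can’t tell by eye when the stdlib mixes conventions.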

Saphyel · July 18, 2023, 1:14pm

But I don’t think it should stay forever. The new names could be added in, let’s say, 3.13 and the old ones removed in 3.15.

Anyone who doesn’t want to change their legacy code can stay on versions up to 3.14 without issues; moving to 3.15 would require them to update their codebase.

I think improving things where we can is better than leaving them as they are because “it just works”. Otherwise we could keep using Python 1; I mean, “it works fine”.

Rosuav · July 18, 2023, 1:25pm

Python is not React.js.

srittau · July 19, 2023, 11:49am

Off topic, but React’s last release that broke user-facing API (16.0) was in 2017. In fact, React’s last release (18.2.0) was in June 2022. I’d say at this point React is more stable than Python.

Rosuav · July 19, 2023, 12:01pm

Depends on your definition of “stable”. How many times in the past decade (i.e. since version 3.0) has Python required that you change your code to keep it running? I’m not counting situations where there’s a new way of doing things, but only when the old way is no longer valid. React has been around for less time than that, and has had quite a number of breaking changes. I know, because I taught React for a few years, and everything changed multiple times, and it’s all different again now.

Python has introduced new features, but not forced people to update their codebases twice a year.