Docs: Standard Library Function Conventions

aeros · July 7, 2019, 5:39am

A recent commit to cpython which added a brief section mentioning the possible exceptions for os.chdir() brought about a few questions with regards to the documentation conventions, specifically on the sections for individual functions within the standard library.

Should existing sections which describe functions as “methods” be adjusted to “functions”?

Within the programming community across a number of different languages, the terms “method” and “function” have come to refer to roughly the same thing, usually with each language preferring a specific term. Across the docs and in the Python community in general, “function” seems to be the more commonly used term, but periodically in the documentation for functions, there are sentences phrased as “This method…”. Ultimately this is just a matter of semantics, but for the purposes of documentation, consistency with terminology is important.

What are the conventions for describing exceptions for a function?

In general, it seems to be useful for users to be able to quickly refer to the docs to get a basic idea of the exceptions which can be raised by a particular function. However, a number of functions in the standard library (particularly in os, where the commit was made) have no description of exception behavior. This is not to say that every possible exception should be verbosely listed though.

Instead, the docs should probably focus on the most common ones which occur from “normal” usage of the function. For example, this was the section added to os.chdir() in the final commit:

This function can raise :exc:`OSError` and subclasses such as
:exc:`FileNotFoundError`, :exc:`PermissionError`, and :exc:`NotADirectoryError`.

Note how it does not explicitly state every possible OSError which can occur from the usage of os.chdir().
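To make the documented behavior concrete, here is a minimal sketch of which OSError subclasses show up in two common failure cases. The helper name chdir_error_name and the temporary paths are made up for illustration, and the results noted are what I would expect on a typical POSIX system.

```python
import os
import tempfile


def chdir_error_name(path):
    """Try os.chdir(path); return the raised exception's class name, or None."""
    original = os.getcwd()
    try:
        os.chdir(path)
    except OSError as exc:
        return type(exc).__name__
    else:
        return None
    finally:
        os.chdir(original)  # always restore the starting directory


with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "does-not-exist")
    regular_file = os.path.join(tmp, "file.txt")
    with open(regular_file, "w") as f:
        f.write("not a directory")

    results = {
        "existing directory": chdir_error_name(tmp),
        "missing path": chdir_error_name(missing),
        "regular file": chdir_error_name(regular_file),
    }

for case, name in results.items():
    print(f"{case}: {name}")
```

Catching OSError covers all three documented subclasses, which is why the section only needs to name the common ones rather than every possible errno.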

Where should the exceptions be mentioned?

Another inconsistency which I noticed was the positional location of the exceptions. For the convenience of the users and improved readability, it seems to make the most sense to have them consistently in the same position, on their own line, and separate from the rest of the function summary (instead of buried within another section). It seems a bit unreasonable to require users to go on a “scavenger hunt” to find the common exceptions for a given function.

When adding the exceptions to os.chdir(), I used a format which seemed to be used for a few other functions within os. As an example, here is os.chdir():

Change the current working directory to *path*.

This function can support :ref:`specifying a file descriptor <path_fd>`. The
descriptor must refer to an opened directory, not an open file.

This function can raise :exc:`OSError` and subclasses such as
:exc:`FileNotFoundError`, :exc:`PermissionError`, and :exc:`NotADirectoryError`.

.. versionadded:: 3.3
   Added support for specifying *path* as a file descriptor
   on some platforms.

.. versionchanged:: 3.6
   Accepts a :term:`path-like object`.

In this format, the common exceptions which can be raised are mentioned in a separate section just before the version info.

As an unrelated note to the discussion itself, I would be more than happy to begin working on applying this formatting across the standard library docs if it is approved. Rather than immediately opening individual issues and sending a large volume of PRs, I figured it would be worthwhile to begin by opening a discussion on the topic and receiving feedback from the Python community.

Note: This post has a large number of edits because I accidentally submitted the post early by pressing tab + enter. I meant to do some tab indenting, but instead had selected the submit button.

brettcannon · July 8, 2019, 9:00pm

I would say at least new docs should make the distinction appropriately, but I don’t know if going through the entire stdlib and updating it in one fell swoop makes sense. Might make sense to try it on a file or two and see how people respond.

The code that explicitly raises the exception should document it. Incidental exceptions are typically left out unless it makes practical sense to call something out. Sorry, but there’s no rule here for “practical”.

Wherever it’s appropriate. I expect people to read the whole documentation (sans examples) to know how to use something, so I don’t expect benefit from sticking exception details in a specific location in all places. Plus, at this point it would be a losing battle against the amount of documentation already written, just for formatting consistency which you won’t keep long-term.

aeros · July 8, 2019, 9:27pm

This was my plan actually, in OS in particular to replace the times the term “method” is used with “function”. I completely agree that doing it all at once within the same issue across the whole of stdlib would be a bit overkill.

So in terms of whether or not it should be mentioned in the docs for a given function, it’s situational based on the behavior of the function itself? If that’s what you mean by practical, that makes sense.

From my understanding, many of the functions in the stdlib have fairly intuitive behavior, but there definitely are many which handle specific exceptions in a way that is unique or different from similar functions in the module. With the example of os.chdir(), one might intuitively expect it to use os.path.isdir() (to validate the path), retain the current working directory, and return None if an invalid one is used. However, this is not the case: it raises an exception instead. Specifying the behavior in the docs seems practical in a case like this.
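As a rough illustration of that gap, the sketch below contrasts the behavior one might intuitively expect with what os.chdir() actually does. The wrapper chdir_if_directory is hypothetical and exists only for this comparison; it is not part of the os module.

```python
import os
import tempfile


def chdir_if_directory(path):
    """Hypothetical wrapper: change directory only if path is a directory.

    Returns True on success and False otherwise, instead of raising.
    """
    if not os.path.isdir(path):
        return False
    os.chdir(path)
    return True


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "missing")

    # The hypothetical wrapper reports failure through its return value.
    wrapper_result = chdir_if_directory(missing)

    # os.chdir() itself raises; the working directory stays where it was.
    try:
        os.chdir(missing)
        raised = None
    except FileNotFoundError as exc:
        raised = exc

    cwd_unchanged = os.getcwd() == start

print(wrapper_result, type(raised).__name__, cwd_unchanged)
```

Note that even the wrapper is not exception-free: the directory could lack search permission or disappear between the check and the call, so os.chdir() may still raise inside it.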

It’s a different debate altogether whether or not this behavior should occur. In my opinion, the docs should strive to accurately describe how a function currently behaves to provide as much clarity for the users as possible.

Good point, I had not initially considered that the formatting consistency may not be practical to maintain. However, going forward for future changes, is the positioning that I used for the os.chdir() exceptions acceptable?

aeros · July 9, 2019, 11:41am

Upon further consideration and research into the distinction between method and function, the term method is more commonly used when describing a “function” that is specifically a member of a class and operates upon an instance of the class. This is usually denoted in Python with the self argument. As an example:

def function():
    ...

class Example:

    def __init__(self):
        ...

    def method(self):
        ...

Using the above definitions, __init__() and method() would be considered methods (although __init__() might be better described as a constructor or constructor method) and function() would be considered a function.
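For what it is worth, the interpreter draws a similar line at runtime. The following is a small sketch using the inspect module on the same definitions; the variable names are only there for illustration.

```python
import inspect


def function():
    ...


class Example:
    def __init__(self):
        ...

    def method(self):
        ...


instance = Example()

# A module-level def is a plain function object.
module_level_is_function = inspect.isfunction(function)

# Looked up on the class itself, method is still a plain function in Python 3.
class_attr_is_function = inspect.isfunction(Example.method)

# Looked up on an instance, it becomes a bound method that carries self.
instance_attr_is_method = inspect.ismethod(instance.method)
bound_to_instance = instance.method.__self__ is instance

print(module_level_is_function, class_attr_is_function,
      instance_attr_is_method, bound_to_instance)
```

All four checks come out True, which matches the idea that "method" describes a function accessed through an instance rather than a different kind of definition.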

This definition however is by no means universally accepted. As a result, I would not recommend universally applying it across the Python docs without major consensus from the community.

However, I would propose that anything which fits the above definition for a “function” be renamed from “method” to “function” if it is not already being called a function. I think this would be far less drastic and controversial than blindly changing every mention of method to function. Also, I have yet to see a solid argument anywhere for naming a function that is outside of the scope of a class a “method”.

In the majority of instances, this is already the case, but I’ve definitely seen some occurrences across the docs where this could be corrected. I’ll start the process by doing this on a page-by-page basis (beginning with os.rst, as a trial), and link to this discussion page on the issue tracker for feedback.

As a more long term process, I think there should eventually be some degree of consistency within the docs that clearly defines the difference between the two, but this should be a decent start. Perhaps an official PEP would be appropriate, but I’m not overly familiar with the creation process and I don’t think I’m an experienced enough member of the Python dev community to create one on my own.

brettcannon · July 9, 2019, 9:52pm

Correct.

Seems fine, but that’s based on what you pasted here and only with a cursory glance.

Correct, that’s basically what a method is defined as. I would also argue that if you access a function through an object then it’s also a method, e.g. spam.answer = staticmethod(lambda: 42); spam.answer().

Correct (although don’t get too hung up on “constructor”, etc.; people know what __init__ is for).

No, we aren’t that prescriptive in PEPs. This sort of thing would belong in the devguide.

aeros · July 10, 2019, 12:04am

That is pretty much the definition that I subscribe to, probably since Python was my first actual programming language ~5 years ago. So a lot of my basic understanding of programming concepts revolves around the Python docs (which is partly why I’ve been quite motivated to contribute). I’ve just seen some people disagree with that particular definition in the programming community. This Stack Overflow question, for example, has many of the different definitions: oop - What's the difference between a method and a function? - Stack Overflow

Anyways, I greatly appreciate the in-depth responses, that pretty much provides clarity on all of my initial questions. Thanks for taking the time to respond.

Now for a bit of an unrelated esoteric question if you have the time (feel free to skip this if you don’t), is there any functional difference between spam.answer and spam.universe in the code below?:

class Spam:
    universe = staticmethod(lambda: 42)

spam = Spam
spam.answer = staticmethod(lambda: 42) # your example

I haven’t messed with static methods too much in Python, particularly using the builtin function instead of the @staticmethod decorator, so your example of using it to demonstrate the creation of a function through an object got me a bit curious. Apologies if this is going off topic a bit.

brettcannon · July 10, 2019, 9:16pm

The difference is that one gets set on the class while the other gets set on an instance. I.e. create spam2 = Spam and the first instance will still have universe accessible while the second instance won’t make a difference. There are also other subtle differences in regards to descriptors and how Python looks things up between an instance and a class, but that’s a whole thing that is better read up on than explained by me here (and don’t take my example too seriously as I didn’t even test it before I wrote it).
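A minimal sketch of the class-versus-instance difference, under one loud assumption: spam is created with Spam() so that it is an actual instance, whereas the snippet in the question binds spam to the class itself. The name spam2 and the boolean variables are only for illustration.

```python
class Spam:
    universe = staticmethod(lambda: 42)  # stored in the class namespace


spam = Spam()  # assumption: an instance, not the bare class
spam.answer = staticmethod(lambda: 42)  # stored in this instance's __dict__
spam2 = Spam()

# Where each attribute lives.
universe_on_class = "universe" in vars(Spam)
answer_on_instance = "answer" in vars(spam)

# Every instance sees the class attribute; only spam has answer.
spam2_has_universe = hasattr(spam2, "universe")
spam2_has_answer = hasattr(spam2, "answer")

# Class attributes go through the descriptor protocol, so the staticmethod
# wrapper is unwrapped into the plain function on access.
universe_is_wrapper = isinstance(spam.universe, staticmethod)

# Instance attributes are returned as stored, so the wrapper itself comes back.
answer_is_wrapper = isinstance(spam.answer, staticmethod)

print(spam.universe())
```

As far as I know, calling spam.answer() only works on Python 3.10 and later, where staticmethod objects became directly callable; on earlier versions it should raise TypeError because the wrapper is never unwrapped.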

aeros · July 10, 2019, 10:54pm

Yeah this is more along the lines of the answer I was looking for, so I’ll definitely be sure to do some more reading up on descriptors. Thanks for pointing me in the right direction! I’m not at all opposed to doing some reading. I just don’t always know what subjects to look for, especially when it comes to the C API. I have only recently started to fully appreciate the value of learning the lower level languages.

I’m sure that a lot of the Python core devs that work on the C API were using C well before they learned Python. As someone who started programming with Python, it can be easy to become somewhat lost in many layers of abstractions. I’m very grateful for how Python simplifies a lot of the underlying complexities and I love the language (probably wouldn’t be here otherwise). But, those abstractions make it possible to be competent with Python while having no real understanding of the C API, or C in general for that matter. I have a feeling the same situation applies to a lot of my generation of developers (20s to early 30s).

Now I have a good excuse to finally start reading this: Introduction — Python 3.12.4 documentation

I really appreciate how all of the core devs have been incredibly helpful and friendly when it comes to interacting with those who are newer to contributing. So far, I’ve had great interactions in particular with vstinner, taleinat, terryjreedy, mariatta, willingc (first commit), and you (first issue). It’s made me motivated to contribute more, and provided me a long term goal of eventually joining the ranks (that’s why I’ve been particularly interested in that “Bug Triager” role proposal). I don’t know how I could thank all of you guys at once, but thanks for being awesome!