Enum for open() modes

As title. I just created it:

https://pastebin.com/pNYezw2V

I think it could be useful to have a more descriptive way to declare a file open mode. Many languages, like Java or C#, have an enum for this.

open(), os.fdopen(), os.popen() and pathlib.Path.open() can use it, so I don’t know which module it should go in: builtins, os or pathlib?
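For readers without access to the pastebin, here is a minimal sketch of how such an enum could be passed to the existing functions unchanged. It assumes the enum mixes in str, which may or may not match the linked code; the three member names are taken from later posts in this thread and the file name is made up.

```python
import tempfile
from enum import Enum
from pathlib import Path


class OpenMode(str, Enum):
    # Hypothetical members; each one is also a real str instance.
    read = "r"
    truncate_and_write = "w"
    exclusive_create_and_update_binary = "x+b"


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.bin"

    # The built-in open() receives a str subclass, so it needs no changes.
    with open(target, OpenMode.exclusive_create_and_update_binary) as f:
        f.write(b"data")
        f.seek(0)
        first_read = f.read()

    # pathlib.Path.open() takes the same kind of mode argument.
    with target.open(OpenMode.read) as f:
        second_read = f.read()
```

Because the members are strings, the question of where the enum lives is independent of the functions that consume it.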

Do you expect many people to prefer

with open('pathname', exclusive_create_and_update_binary) as f

over this?

with open('pathname', 'x+b') as f

I don’t.

Marco Sulla:

[quote]
I think it could be useful to have a more descriptive way to declare a
file open mode. Many languages, like Java or C#, have an enum for
this.
[end quote]

But do they have sixteen enums for this?

I don’t think of file modes as sixteen or more independent
modes that have to be memorised separately, but as a smaller number that
can be combined like Lego blocks to make new modes.

I think it would be more acceptable and easier to remember (at least for
me) to have just six named modes that you can combine with the &
operator.

# Named in uppercase for CONSTANTS.
APPEND     # Equivalent to 'a'.
BINARY     # Equivalent to 'b'.
EXCLUSIVE  # Equivalent to 'x', exclusive create.
READ       # Equivalent to 'r'.
UPDATE     # Equivalent to '+'.
WRITE      # Equivalent to 'w', creates or truncates.

rather than sixteen named modes.

I say they are equivalent to the string file modes, not necessarily
strings. I don’t require READ == 'r' to return true, but str(READ)
should return 'r'.

Then if you want to exclusively create and update a binary file, you
could say BINARY & EXCLUSIVE & UPDATE, without caring about the order.
I prefer the & operator rather than + because we’re not concatenating
the modes in order, but taking the union in arbitrary order.
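A minimal sketch of this six-constant idea, under stated assumptions: the Mode class and its _ORDER attribute are hypothetical names, & is defined as a union as proposed above, and the result is converted with str() before it reaches open().

```python
class Mode:
    """One or more open() mode characters (hypothetical helper class)."""

    _ORDER = "rwxa+b"  # fixed order used when rendering the mode string

    def __init__(self, chars):
        self._chars = frozenset(chars)

    def __and__(self, other):
        # Following the proposal, & means "union of both sets of modes".
        if not isinstance(other, Mode):
            return NotImplemented
        return Mode(self._chars | other._chars)

    def __eq__(self, other):
        # Only compares equal to other Mode objects, never to plain strings.
        if not isinstance(other, Mode):
            return NotImplemented
        return self._chars == other._chars

    def __hash__(self):
        return hash(self._chars)

    def __str__(self):
        return "".join(c for c in self._ORDER if c in self._chars)


APPEND = Mode("a")
BINARY = Mode("b")
EXCLUSIVE = Mode("x")
READ = Mode("r")
UPDATE = Mode("+")
WRITE = Mode("w")

# The operand order does not affect the resulting mode string.
mode = BINARY & EXCLUSIVE & UPDATE
same_mode = UPDATE & BINARY & EXCLUSIVE
```

With this sketch a call would look like open('pathname', str(mode)). It does no validation of its own: a combination such as READ & WRITE renders as 'rw', which open() rejects with a ValueError.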

For context, there was an issue for this: https://bugs.python.org/issue37918.

I do not like it at all. It is much harder to remember 16 names than 6 characters that can be combined with simple rules. It would also be harder to read, and it would increase the length of the line. The open mode is usually not used in isolation, but as part of a complex statement or expression. Compare:

        with open(os.path.join(self._base_dir, logfilename), 'w', encoding='utf-8') as logfile:

and

        with open(os.path.join(self._base_dir, logfilename), OpenMode.truncate_and_write, encoding='utf-8') as logfile:

The current combinations are more or less cross-language: they are known to programmers of many other languages. The OpenMode enum would be unique.

[quote]
Do you expect many people to prefer

with open('pathname', exclusive_create_and_update_binary) as f

over this?

with open('pathname', 'x+b') as f
[end quote]

The first one is much more explicit and readable. Explicit is better than implicit, and readability matters…

[quote]
But do they have sixteen enums for this?
[end quote]

Java’s StandardOpenOption is an enum with 10 elements. And you have to write StandardOpenOption for every flag you want to add, instead of simply OpenMode.

[quote]
I think it would be more acceptable and easier to remember (at least for
me) to have just six named modes that you can combine with the &
operator.

# Named in uppercase for CONSTANTS.
APPEND     # Equivalent to 'a'.
BINARY     # Equivalent to 'b'.
EXCLUSIVE  # Equivalent to 'x', exclusive create.
READ       # Equivalent to 'r'.
UPDATE     # Equivalent to '+'.
WRITE      # Equivalent to 'w', creates or truncates.

rather than sixteen named modes.

I say they are equivalent to the string file modes, not necessarily
strings. I don’t require READ == 'r' to return true, but str(READ)
should return 'r'.

Then if you want to exclusively create and update a binary file, you
could say BINARY & EXCLUSIVE & UPDATE, without caring about the order.
I prefer the & operator rather than + because we’re not concatenating
the modes in order, but taking the union in arbitrary order.
[end quote]

I agree this is the optimal solution, but unfortunately I don’t see a way to implement it without changing the existing functions so that they accept a flag instead of a string as the mode parameter, because you’re describing a numeric flag.

The best you can do currently (after changing update, _update and _binary) is:

OpenMode.exclusive_create + OpenMode.update + OpenMode.binary

which could be simpler to remember than

OpenMode.exclusive_create_and_update_binary
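As a rough illustration of why + already works here, assume (this is not confirmed by the thread) that the enum mixes in str. Concatenating members then uses ordinary str addition and yields a plain string:

```python
from enum import Enum


class OpenMode(str, Enum):
    # Hypothetical single-character members, named as in this post.
    exclusive_create = "x"
    update = "+"
    binary = "b"


# str.__add__ does the work; the result is a plain str, not an OpenMode.
mode = OpenMode.exclusive_create + OpenMode.update + OpenMode.binary
```

Unlike a true flag, the characters end up in the order in which the members were added.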

Another (ugly) option is to abandon the enum and make OpenMode a package, with the six “constants” defined as variables in its __init__.py, so you can do

from OpenMode import *

with open(filepath, exclusive_create + update + binary):
    [do stuff]

Of course I do not like this solution at all: even if it’s more compact, the variables will not be part of a standard enum, and they are not constant.

[quote]
Current combinations are more or less cross-language, they are known for programmers of many other languages. The OpenMode enum would be unique.
[end quote]

Not at all. I already mentioned Java, and that would suffice, since it has more than 50% of the usage share in the world… anyway, there’s also C#, Qt and C++.

(Marco is quoting Steven D’Aprano)

Actually in context I find the second one explicit and the first one somewhat uncertain. I would have to look up your enums every time.

(Steven said:)

I’d rather combine them with | myself: & does not mean “union” to me at all!

I’m fairly sure you can do better than that just defining the appropriate dunders. However, all that said, I’m not the least bit convinced that this is worth putting in the standard library. It’s not terribly pretty, it’s not all that useful and it replaces something well known. I just don’t see the point.

[quote]
I’m fairly sure you can do better than that just defining the appropriate dunders.
[end quote]

…OK, it seems like overkill to me, but…

I can create a class OpenFlag(str) with three attributes, type, data and update, that implements __or__ so that, for example, a | b returns a new OpenFlag with the attributes of a, plus any of the three attributes of b that is not null where the corresponding attribute of a is null. If both are not null, an exception is raised. The __repr__ of OpenFlag returns a string that is the concatenation of the three attributes that are not null. So the enum will become:

class OpenMode(OpenFlag, Enum):
    read = OpenFlag(type="r")
    append = OpenFlag(type="a")
    truncate_and_write = OpenFlag(type="w")
    exclusive_create = OpenFlag(type="x")
    update = OpenFlag(update="+")
    binary = OpenFlag(data="b")

and, for example, "x+b" can be written as

OpenMode.update | OpenMode.exclusive_create | OpenMode.binary

Anyway, nothing stops me from also adding this one to the enum and naming it exclusive_create_and_update_binary. You can use it, or you can use the combination above, or you can use "x+b"; it’s up to you. IMHO

OpenMode.exclusive_create | OpenMode.binary | OpenMode.update
or
OpenMode.exclusive_create_and_update_binary

are much more readable, even to someone who doesn’t know Python.
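A runnable sketch of the OpenFlag idea described above, with some simplifications that are assumptions rather than part of the proposal: OpenMode is a plain class used as a namespace instead of an Enum, the empty string stands for "null", and the conflict error is a ValueError. The parameter name type follows the post, although it shadows the built-in inside __new__.

```python
class OpenFlag(str):
    """A str that remembers which parts of an open() mode it carries."""

    def __new__(cls, type="", update="", data=""):
        # The str value is the concatenation of the non-empty parts.
        self = super().__new__(cls, type + update + data)
        self.type, self.update, self.data = type, update, data
        return self

    def __or__(self, other):
        if not isinstance(other, OpenFlag):
            return NotImplemented
        merged = {}
        for name in ("type", "update", "data"):
            mine, theirs = getattr(self, name), getattr(other, name)
            if mine and theirs:
                # Both operands set the same part, e.g. read | append.
                raise ValueError(f"both operands set {name!r}")
            merged[name] = mine or theirs
        return OpenFlag(**merged)


class OpenMode:
    # Plain class attributes, not Enum members, to keep the sketch short.
    read = OpenFlag(type="r")
    append = OpenFlag(type="a")
    truncate_and_write = OpenFlag(type="w")
    exclusive_create = OpenFlag(type="x")
    update = OpenFlag(update="+")
    binary = OpenFlag(data="b")


# The operand order does not matter; the parts are always joined as
# type, then update, then data.
mode = OpenMode.update | OpenMode.exclusive_create | OpenMode.binary
```

Since the result is still a str, it can be handed to open() directly. Making these objects genuine Enum members would need extra care with how Enum constructs its values, which this sketch leaves out.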

It’s pretty because it’s more readable. It’s useful because it’s more descriptive (I always forget that w is truncate and write…). And it replaces nothing; it only adds a more readable way to state explicitly how you’re opening the file, which you can use or not.

And the point is: readability counts.

Furthermore, this way you can also do something like this:

OM = OpenMode
mode = OM.append | OM.binary

if Role.read in user.roles:
    mode |= OM.update

with open(filepath, mode) as f:
    [...]

Not everyone reads the same way you do. I find the LongName.really_long_and_description_text version to be much more difficult to read.

As you say, you can create your own class for your own use.

OpenMode is a long name? And what about PendingDeprecationWarning? :smiley:
Honestly I can’t find a shorter name for the enum.

For the element names, I can shorten truncate_and_write and exclusive_create to write and exclusive. But the first is ambiguous: it suggests that you open the file and can write to it, while in fact the file is truncated first. I could shorten it to truncwrite instead. And update and binary can become upd and bin.

As for the composite elements, I think I can shorten them to something like exclusive_upd_bin. Anyway, I think 80% of the Python code in the world would use only append, truncwrite, upd, truncupd (“w+”) and append_upd (read is the default).