Adding atomicwrite in stdlib

Sorry, I didn’t mean to sound entitled. If that’s how I came across, I do sincerely apologise. I merely wanted to show support for this proposal.


Summarising why this idea lapsed: it isn’t that there’s disagreement about the utility of the feature. Rather, the exact nature of the feature means that any particular implementation needs to make trade-offs when it comes to platform assumptions. Since the desirable trade-offs can vary based on the exact use case, a standard library solution needs to choose between adding API complexity to support behaviour tweaking, or leaving some users having to roll their own independent solution anyway.

It’s genuinely unclear how a stdlib API for staged writes should handle that question (or which design trade-offs it should make in the first place).

If this is added, the stdlib documentation also ends up having to take on the burden of explaining the entire concept behind the feature to new users, rather than assuming that users already understand the problem the feature solves. (This concern is what underlies the naming discussion in this thread, since perpetuating dubious terminology is problematic, even when that terminology is commonly used.)

PyPI libraries don’t have the latter problem (anyone looking for one presumably knows the problem they’re trying to solve), but the first problem (API design) still applies. Since you can roll your own decent solution from existing stdlib components in fewer than a couple of dozen lines, a lot of folks will also be reluctant to take on an external dependency just for that.

So, yeah, the feature ends up stuck in an indefinite limbo, where it’s more a design pattern with recipes of variable quality available from different sources rather than being a readily importable general purpose API.


I am sorry as well. I was not offended, just curious what you actually meant, since you obviously couldn’t have meant what I read.


Would it be an option to offer a high-quality recipe in the Python docs, similar to the itertools recipes?


Yes, a recipe in the shutil docs could be a less controversial starting point, since it can make some simplifying assumptions that would be far more debatable in an actual API implementation:

  • don’t worry about the non-context manager case, as the point of the recipe would just be to illustrate the 3 different pieces involved in implementing a staged write (1. creating the temporary staging file; 2. destroying that file if there is a problem; 3. replacing the target file if everything goes according to plan), and mention how they correspond to database transaction steps (start; rollback; and commit, respectively)
  • use the simple approach of a visible file adjacent to the actual destination with a custom extension created by a call to secrets.token_hex() (no need to get tempfile involved, or mess about with hidden files), but mention in a comment that different use cases may call for different naming schemes for the staging file
  • note in a comment that os.replace() is almost always atomic within a directory, and creating the staging file in the target directory means that inherited file permissions and other properties are more likely to be correct
  • just mention os.fsync in a comment rather than actually calling it

It would still need to address the terminology problems with calling these atomic writes, though.
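As a rough illustration of the points above, here is a minimal sketch of what such a recipe might look like. The name `staged_write` and the `.staging` suffix are hypothetical choices made for this example, not an existing or proposed stdlib API, and the sketch deliberately makes the simplifying assumptions listed in the bullets.

```python
import contextlib
import os
import secrets


@contextlib.contextmanager
def staged_write(target, mode="w", **open_kwargs):
    """Write to a staging file, then replace `target` only on success.

    Hypothetical helper for illustration; not an existing stdlib API.
    """
    target = os.fspath(target)
    # Visible staging file adjacent to the destination, so that inherited
    # permissions are more likely to be right and os.replace() stays within
    # one directory. Other use cases may call for other naming schemes.
    staging = f"{target}.{secrets.token_hex(8)}.staging"
    # 1. Start: create the staging file ("x" fails if the name exists).
    f = open(staging, mode.replace("w", "x"), **open_kwargs)
    try:
        with f:
            yield f
            # If durability across power loss matters, f.flush() followed
            # by os.fsync(f.fileno()) would go here; omitted in this sketch.
        # 3. Commit: swap the staging file into place.
        os.replace(staging, target)
    except BaseException:
        # 2. Rollback: discard the staging file, leave the target untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(staging)
        raise
```

Used as `with staged_write("settings.json") as f: f.write(...)`, the target is only replaced if the `with` body finishes without raising; otherwise the previous contents are left in place.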

A recipe discussing staged writes should also mention that they only ensure that readers will never see a partially written file. They don’t guard against lost updates if multiple writers are reading the previous file contents, making changes, and then writing back the result. For use cases where write contention genuinely needs to be handled, a file-backed database (such as SQLite) is likely to be a better data storage solution than a flat file.
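To make the lost update problem concrete, here is a small sketch that simulates two writers in a single process. The helper names (`replace_contents`, `read_contents`, `demo_lost_update`) and the JSON counter file are invented for this illustration; the interleaving is written out sequentially rather than using real concurrent processes.

```python
import json
import os
import secrets
import tempfile


def replace_contents(path, data):
    # Minimal staged write: write a sibling file, then os.replace() it in.
    staging = f"{path}.{secrets.token_hex(8)}.staging"
    with open(staging, "x") as f:
        json.dump(data, f)
    os.replace(staging, path)


def read_contents(path):
    with open(path) as f:
        return json.load(f)


def demo_lost_update():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "counter.json")
        replace_contents(path, {"count": 0})
        # Two writers each read the same starting state...
        seen_by_a = read_contents(path)
        seen_by_b = read_contents(path)
        # ...and each writes back its own increment.
        replace_contents(path, {"count": seen_by_a["count"] + 1})
        replace_contents(path, {"count": seen_by_b["count"] + 1})
        # Every read saw a complete file, yet one increment is gone.
        return read_contents(path)["count"]
```

Both writers incremented the counter, but `demo_lost_update()` returns 1 rather than 2: the second replace silently overwrites the first. Avoiding that needs locking or a transactional store, which is outside what a staged write provides.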


And if anyone wants one of these in the short term, my own version’s on
PyPI in the cs.fileutils package as atomic_filename.

It has several knobs, and I can readily see much bikeshedding for anyone
trying to produce something for the stdlib.
