Extending os.rename() to support file swapping and whiteout

On many Linux filesystems and some macOS file systems (and probably some other platforms), the extended versions of the rename system call support an option to swap the source and destination as an atomic operation, as opposed to just replacing the destination with the source. This is a very useful capability, since it avoids the race conditions that can occur when a swap is done as several separate renames. It would be helpful to expose this capability in the os.rename() function.

Additionally, on platforms that support union mount file systems, there is the concept of a whiteout object that can be placed in a file system mounted higher in the union stack to mask an object in a lower mount. At least on Linux, the extended versions of the rename system call support moving the source to the destination and also placing a whiteout object at the source location (so that a source object in a lower layer is no longer visible). Exposing this functionality in os.rename() would also be helpful.

It is possible to access all of this functionality by loading the standard C library using ctypes and calling the relevant functions (renameat2() on Linux or renamex_np() on macOS), but the resulting code is somewhat opaque and platform-specific. It seems to me that adding keyword flags to os.rename(), so that a user can just type something along the lines of os.rename(src, dst, swap=True), would be much more readable, and would be portable at least across platforms that support this functionality.
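To give an idea of what the ctypes route looks like on Linux, here is a minimal sketch. It assumes a C library that exports renameat2() (glibc 2.28 and later do); the helper name and its error handling are illustrative only, not a proposed API. The flag values are the ones defined in the Linux headers.

```python
import ctypes
import errno
import os

# Values from the Linux headers (<fcntl.h> and <linux/fs.h>).
AT_FDCWD = -100            # resolve relative paths against the current directory
RENAME_NOREPLACE = 1 << 0  # fail with EEXIST if dst already exists
RENAME_EXCHANGE = 1 << 1   # atomically swap src and dst
RENAME_WHITEOUT = 1 << 2   # leave a whiteout object at src


def renameat2(src, dst, flags=0):
    """Illustrative wrapper around the C library's renameat2() (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    func = getattr(libc, "renameat2", None)
    if func is None:
        # Older C libraries do not export the wrapper at all.
        raise OSError(errno.ENOSYS, "renameat2() is not exported by this C library")
    func.argtypes = [ctypes.c_int, ctypes.c_char_p,
                     ctypes.c_int, ctypes.c_char_p, ctypes.c_uint]
    func.restype = ctypes.c_int
    if func(AT_FDCWD, os.fsencode(src), AT_FDCWD, os.fsencode(dst), flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), src, None, dst)
```

With this in place, renameat2("a", "b", RENAME_EXCHANGE) would swap the two names, provided the kernel and the filesystem support the flag; otherwise it raises OSError.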

It should be noted that there is a package on PyPI called renameat2 that has a Linux-specific implementation of this, but the name is obscure unless you know the underlying system call, and it’s not cross-platform.

So, my question for this list is: should this functionality be added to the os module? If the consensus is yes then I’m willing to have a stab at implementing it for macOS and Linux. (If there is similar functionality on Windows then I’m afraid that someone else will need to add that, since I don’t have any Windows machines.)


renameat2 calls it exchange, not swap.

If adding this then why not also add RENAME_NOREPLACE as well as RENAME_WHITEOUT?

I cannot find docs for macos renameat_np to see its features.

Hi @barry-scott . Thanks for the input.

renameat2 calls it exchange, not swap.

On Linux that’s the case. On macOS’s renamex_np() the flag is called RENAME_SWAP. Personally I think that the term “exchange” gets overly used in the Windows world to mean something completely different, the meaning of “swap” is more immediately apparent than “exchange” in most contexts, and swap is shorter to type, but I don’t really mind much either way.

If adding this then why not also add RENAME_NOREPLACE as well as RENAME_WHITEOUT?

We certainly could include that for completeness. That said, RENAME_NOREPLACE is already the default behaviour for os.rename() and there is a separate os.replace() function. I seem to recall reading somewhere that “There should be one-- and preferably only one --obvious way to do it.” :slight_smile:

I cannot find docs for macos renameat_np to see its features.

There is a recent version of the relevant macOS man page here.

On Windows, os.replace() supports replacing an existing destination file, and os.rename() does not. They both call MoveFileExW(), with or without the flag MOVEFILE_REPLACE_EXISTING.

On POSIX, os.rename() and os.replace() behave the same. They both call either rename() or renameat(). The behavior of renameat2() with the flag RENAME_NOREPLACE isn’t supported by os.rename() on POSIX.
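The difference is easy to check empirically. The following is a small sketch (the helper name is made up for illustration) that reports whether os.rename() silently replaces an existing destination file on the platform it runs on:

```python
import os
import tempfile


def rename_replaces_existing():
    """Return True if os.rename() overwrites an existing destination file."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src")
        dst = os.path.join(d, "dst")
        with open(src, "w") as f:
            f.write("new")
        with open(dst, "w") as f:
            f.write("old")
        try:
            os.rename(src, dst)
        except FileExistsError:
            # Windows: os.rename() refuses to replace an existing file.
            return False
        with open(dst) as f:
            # POSIX: dst now holds what used to be in src.
            return f.read() == "new"


print(rename_replaces_existing())
```

Going by the behaviour described above, this should print True on POSIX systems and False on Windows.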


@eryksun Thanks for pointing that out. I didn’t realise that on POSIX systems os.rename() does the same as os.replace().

In that case, yes, it would probably be a good idea to also support some sort of replace=False flag on POSIX platforms that support it. This then raises the question of whether os.rename() should also allow the user to set replace=True on Windows for consistency.

Setting aside the whole can of worms that is the inconsistency of the replacement behaviour between POSIX and Windows, I’d still be interested in feedback on the idea of supporting the file swap/exchange behaviour.

Should is not must. This is really asking for designing with care.

As you can already see there is os.rename and os.replace.

In programming exchange is the keyword that I see used for atomic operations, not swap. For example in CPU instructions used to build mutex etc.

The APIs that give you exchange etc. are all non-POSIX.
You need to talk in terms of OS type: Windows, macOS, Linux, NetBSD etc.

Clearly the default behaviour of os.rename must not change as so much code depends on its exact implementation details. I have run strace on python to check how it did rename because it is so important.

I do agree that having a rename that does not replace on Linux would be a nice thing to fix.

Given the pattern of os.rename and os.replace, should there be an os.rename_exchange? And only implement this if the OS has support for the feature.


I agree that if we do this it should be a new function, not an option on rename() – swapping files doesn’t feel like a renaming operation any more. If we do this I’d suggest just naming it os.exchange() – os.swap() could work too, but as a name for a pretty obscure function I think it’s too short (I have a general intuitive rule of thumb that more important/common things should have shorter names).

But do we really need this? Surely not every syscall in existence (on Linux or elsewhere) needs to be represented in the os module? Since there’s already a 3rd party package that does this, maybe that’s enough?


It is an old pattern: a separate function is preferable to a single boolean parameter. But in 3.3 a lot of boolean parameters were added to a number of existing os functions. Adding a separate new function for every combination of boolean parameters would make the number of functions grow exponentially. So now we have os.stat(follow_symlinks=False), which is equivalent to the old os.lstat(), and the latter was kept for compatibility. Since there was no os.rename_exchange() before, there is no need to add it in addition to os.rename(exchange=True).

I like the idea of adding the replace flag to os.rename(). It would unify behavior on Windows and Linux. And if we are already using the renameat2() or renameatx_np() system call, why not also add the exchange flag? Or maybe add a three-state parameter which takes the values 'rename'/'replace'/'exchange'?
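As a rough sketch of how such a three-state parameter could map onto the Linux renameat2() flags: the enum and function names below are hypothetical and only illustrate the idea; the flag values are the ones from the Linux headers.

```python
import enum

# Flag values from <linux/fs.h>.
RENAME_NOREPLACE = 1 << 0
RENAME_EXCHANGE = 1 << 1


class RenameMode(enum.Enum):
    """Hypothetical three-state parameter for a rename call."""
    RENAME = "rename"      # move, failing if the destination exists
    REPLACE = "replace"    # move, replacing any existing destination
    EXCHANGE = "exchange"  # atomically swap source and destination


_FLAGS = {
    RenameMode.RENAME: RENAME_NOREPLACE,
    RenameMode.REPLACE: 0,
    RenameMode.EXCHANGE: RENAME_EXCHANGE,
}


def renameat2_flags(mode):
    """Translate a mode (enum member or its string value) into renameat2() flags."""
    return _FLAGS[RenameMode(mode)]
```

An unknown mode string raises ValueError from the enum lookup, so typos are caught before any system call is made.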

But do we really need this? Surely not every syscall in existence (on Linux or elsewhere) needs to be represented in the os module?

Clearly not every syscall needs to be represented in the os module. That said, being able to do an atomic file exchange is a generally useful operation when trying to write robust code that avoids filesystem race conditions, but it’s also something that is not possible to do in pure Python without resorting to platform-specific syscalls.
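For comparison, this is roughly what a swap looks like using only the portable os.rename() call. It is a sketch with a made-up helper name, and it is deliberately not atomic: between the individual renames, one of the two names is briefly missing, which is exactly the race that the extended system calls avoid.

```python
import os
import uuid


def swap_non_atomic(a, b):
    """Swap two paths using three renames. NOT atomic, shown for contrast."""
    directory = os.path.dirname(os.path.abspath(a))
    tmp = os.path.join(directory, ".swap-" + uuid.uuid4().hex)
    os.rename(a, tmp)          # from here on, 'a' does not exist
    try:
        os.rename(b, a)        # 'a' is back, but now 'b' does not exist
    except BaseException:
        os.rename(tmp, a)      # best-effort rollback of the first step
        raise
    os.rename(tmp, b)          # both names exist again, contents swapped
```

Any other process that opens 'a' or 'b' in the middle of this sequence can get FileNotFoundError, and a crash part-way through leaves the temporary name behind.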

As for it being a new function versus a flag on os.rename(), I understand that the original basis for the system call being done as a flag on an extended version of rename was that the operation represents renaming the two files simultaneously. Personally I feel that that logic is only intuitive after someone has told you, and a new function would be a good idea. I didn’t propose that initially just because there was already a function in the os module that uses a variant of the relevant syscall and I thought adding a flag would be less contentious than adding a whole new library function.

Since there’s already a 3rd party package that does this, maybe that’s enough?

Well, the existing 3rd party package is not cross-platform and is hard to find unless you know the name of the underlying system call. If there isn’t support for putting this in the os module then I might write a multi-platform module with a more intuitive name and put it on PyPI (name suggestions welcome!), but I still think that this is broadly useful enough to warrant going in the standard library.

If it can be made cross-platform then I’m +1. But I got the impression that this was a Linux-only thing, and in that case I don’t see the benefit (you’re already platform-specific, so what’s the harm in using platform-specific syscalls).

Apologies if I’m missing something here - I’d mostly tuned out on the discussion as it seemed to be basically “here’s another Linux feature it might be nice to add” (which I generally try to avoid getting involved in, as I’ll only get grumpy about the various Windows features that could be added, but don’t get the same level of interest :person_shrugging:)

An exchange/swap rename is available on Linux via renameat2() and macOS via renamex_np(). It’s thus cross-platform for the two popular POSIX platforms.
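On the macOS side, a ctypes sketch could look like the following. The wrapper function itself is only illustrative; the flag values are the ones from the macOS headers, and the call is expected to fail with OSError on filesystems that do not support swapping.

```python
import ctypes
import os
import sys

# Flag values from the macOS <sys/stdio.h> header.
RENAME_SWAP = 0x00000002   # atomically swap the two paths
RENAME_EXCL = 0x00000004   # fail if the destination already exists


def renamex_np(src, dst, flags=0):
    """Illustrative wrapper around the macOS renamex_np() call."""
    if sys.platform != "darwin":
        raise NotImplementedError("renamex_np() is only available on macOS")
    libc = ctypes.CDLL(None, use_errno=True)
    func = libc.renamex_np
    func.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_uint]
    func.restype = ctypes.c_int
    if func(os.fsencode(src), os.fsencode(dst), flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), src, None, dst)
```

Usage would be renamex_np("A", "B", RENAME_SWAP), mirroring renameat2() with RENAME_EXCHANGE on Linux.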

There’s no direct equivalent on Windows. I suppose filenames “A” and “B” can be swapped atomically using a kernel transaction handle from CreateTransaction(), three MoveFileTransactedW() calls, CommitTransaction(), and CloseHandle(). Note that Microsoft has warned for a few years now that it might deprecate and remove the kernel transaction manager because it’s a complex system component that’s hardly ever used.

For example:

>>> import os
>>> from win32transaction import CreateTransaction, CommitTransaction
>>> from win32file import MoveFileWithProgress # MoveFileTransacted
>>> open('A').read()
'A\n'
>>> open('B').read()
'B\n'
>>> os.path.exists('T')
False
>>> h = CreateTransaction()
>>> MoveFileWithProgress('A', 'T', Transaction=h)
>>> MoveFileWithProgress('B', 'A', Transaction=h)
>>> MoveFileWithProgress('T', 'B', Transaction=h)
>>> # Before the commit, nothing has changed outside the transaction.
>>> open('A').read()
'A\n'
>>> open('B').read()
'B\n'
>>> os.path.exists('T')
False
>>> CommitTransaction(h)
>>> h.close()
>>> # After the commit, the two files have been swapped.
>>> open('A').read()
'B\n'
>>> open('B').read()
'A\n'
>>> os.path.exists('T')
False

For the record, I have created a package called atomicswap that serves this need and made it available on PyPI, with the source on GitHub. It currently works on Linux, macOS and Windows.

Some comments on the Linux code.

Why use syscall where you could call renameat2 instead and avoid needing to know the syscall numbers for each arch? How sure are you that the syscall numbers are stable?

Why not name the flag as it is named in the man page for renameat2, RENAME_EXCHANGE?

@barry-scott The reason that I’m using syscall is that the renameat2 function is not exposed in all versions of the C library, so calling it directly doesn’t work across different Linux systems, or indeed on older macOS versions. The syscall numbers are explicitly stable between versions of the Linux kernel, but vary between architectures; their values are well documented.
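To illustrate the general idea (this is a simplified sketch, not the actual code from the package), calling the system call by number through the C library's syscall() function can look like this. Only two 64-bit architectures are listed here, and the function name is made up for the example.

```python
import ctypes
import os
import platform
import sys

AT_FDCWD = -100        # resolve relative paths against the current directory
RENAME_EXCHANGE = 2    # flag value from <linux/fs.h>

# renameat2 syscall numbers; they differ per architecture.
# Only two 64-bit architectures are covered in this sketch.
SYS_RENAMEAT2 = {"x86_64": 316, "aarch64": 276}


def exchange_via_syscall(a, b):
    """Swap two paths by invoking renameat2 through syscall() (Linux only)."""
    nr = SYS_RENAMEAT2.get(platform.machine())
    is_64bit = ctypes.sizeof(ctypes.c_void_p) == 8
    if not sys.platform.startswith("linux") or nr is None or not is_64bit:
        raise NotImplementedError("no renameat2 syscall number known for this platform")
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    # syscall() is variadic, so every argument is wrapped in an explicit ctypes type.
    rc = libc.syscall(ctypes.c_long(nr),
                      ctypes.c_int(AT_FDCWD), ctypes.c_char_p(os.fsencode(a)),
                      ctypes.c_int(AT_FDCWD), ctypes.c_char_p(os.fsencode(b)),
                      ctypes.c_uint(RENAME_EXCHANGE))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), a, None, b)
```

The trade-off is the one discussed above: this works even when the C library has no renameat2() wrapper, at the cost of maintaining a table of per-architecture syscall numbers.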

The reason for the flag naming is that I wrote this first on macOS, where they use SWAP instead of EXCHANGE. I guess it could use the different name in the Linux code. Feel free to submit a PR if you like!