Deprecation of pathlib.PurePath.is_reserved()

For reference these paths are invalid on macOS.

isreserved isn’t testing for “invalid” - invalid paths will raise an error every time they’re used.

The reserved names are specifically valid paths that mean something other than a regular user-created file, e.g. things like /proc.

isreserved() checks for invalid characters:


Sure, like nul on Windows.


By the way, did you check out @eryksun’s benchmark on my PR yet?

You should probably also stay out of /dev/**.

I remember some Linux systems that mounted a tmpfs at /dev/shm, so any program could reasonably use /dev/shm/whatever as a regular file in exactly the way one would use /tmp/whatever.

I’m not sure if any of those are still in use, but it’s probably enough reason not to make a hypothetical Linux isreserved function reject anything starting with /dev/.

Personally, I don’t think anything should be treated as reserved on Linux. There are many legitimate cases for interacting directly with files in /dev, /proc, /sys, etc. so I don’t think it would be very helpful to try and “block” them.

Equally, there are legitimate reasons to use the reserved names and path formats on Windows, which is why I argue nobody should be using this function to block anything in the first place.

In the context of an app that creates a zip file, not naming things nul is just as important as not naming them /proc/..., but ultimately the only tool that can protect the user is the one doing the extraction.[1] Thinking you can protect against issues ahead of time is never going to work.

But equally, if the intent is to warn, then you’d want to warn just as much about nul as /proc/.... So why not let ntpath and posixpath check for it, especially if there’s a clear emphasis on using the specific platform’s function?


  1. By doing a stat on the destination file, before and after creation, to ensure it is (or is going to be) a regular file. ↩︎
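
For illustration, the stat-before-and-after idea in the footnote could look roughly like the sketch below. The helper name open_as_regular_file is hypothetical, not an existing API, and the sketch does not try to close every race between the two checks.

```python
import os
import stat


def open_as_regular_file(path):
    """Open `path` for writing, refusing anything that is not a regular file.

    Hypothetical helper for illustration only.
    """
    # Before creation: if something already exists at the destination,
    # it must be a regular file (not a device, directory, pipe, ...).
    try:
        before = os.stat(path)
    except FileNotFoundError:
        before = None
    if before is not None and not stat.S_ISREG(before.st_mode):
        raise OSError(f"{path!r} exists and is not a regular file")

    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)

    # After creation: inspect what was actually opened, via the descriptor.
    if not stat.S_ISREG(os.fstat(fd).st_mode):
        os.close(fd)
        raise OSError(f"{path!r} did not open as a regular file")
    return fd
```

The caller owns the returned descriptor and is responsible for closing it. The check only means something on the machine doing the extraction, which is the point being made above.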

Is this everything now?

  • Embedded null
  • Too long file name: "a" * 256
  • Too long path: "a/" * 512
  • /proc/**
  • /dev/**
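
As a rough sketch, the checks in this list could be written as pure string tests. Everything here is an assumption for illustration: the function name path_problems is hypothetical, the limits of 255 bytes per name and 1023 bytes per path are defaults that vary by OS and filesystem, and whether /proc and /dev belong in such a check at all is exactly what this thread is debating.

```python
import os
from pathlib import PurePosixPath


def path_problems(path, name_max=255, path_max=1023,
                  special_roots=("/proc", "/dev")):
    """Return a list of reasons `path` might be unusable (hypothetical check).

    The limits are assumed defaults, not values queried from the system.
    """
    problems = []
    if "\0" in path:
        problems.append("embedded null")
    # Limits are counted in encoded bytes, not characters.
    if len(os.fsencode(path)) > path_max:
        problems.append("path too long")
    parts = PurePosixPath(path).parts
    if any(len(os.fsencode(part)) > name_max for part in parts):
        problems.append("file name too long")
    for root in special_roots:
        root_parts = PurePosixPath(root).parts
        # Compare whole components so "/devices" does not match "/dev".
        if parts[:len(root_parts)] == root_parts:
            problems.append(f"under {root}")
    return problems
```

Because it never touches the filesystem, a check like this can run anywhere, but it also cannot know what the target filesystem really allows.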

Start a new issue for Linux support and post a link here so anyone interested can join in.

@eryksun, are there also adjustments that need to be made to ntpath.isreserved()?

  • //./PIPE/**
  • //./MAILSLOT/**

Edit: no, these aren’t reserved.

I think there are several important differences between the Windows situation and the Linux one.

  1. I don’t believe there have been legitimate reasons to use nul or con or lpt* on Windows for many years now; they provide legacy functionality left over from DOS, whereas Linux /dev, /proc, /sys are used for current, very much non-legacy functionality.

  2. The Windows reserved names are magically applicable in any directory anywhere, whereas in Linux, /dev, /proc, /sys are just particular directories in /. While it is somewhat unlikely that a user on Windows will want to save a file as con in whichever folder they keep their documents in, it is much less likely that a user on Linux will accidentally try to save a file as /dev/stdout, including the absolute path.

  3. /dev, /proc, /sys in Linux appear in a directory listing of /. It is clear to the user that they exist, even if the user does not know what they are for, and most users, even the less technically inclined, do not go saving their files in random system folders that they don’t understand. Meanwhile, on Windows, the reserved names such as con don’t exist at all: there is no indication whatsoever that, inside an empty folder, it’s fine to create a file called foo but it’s a really bad idea to create a file called con.
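
To make the contrast in points 2 and 3 concrete, here is a small sketch using only pure-path string handling. Both function names are hypothetical, the device-name set is an illustrative subset rather than an authoritative list, and the check only matches a bare final name (it ignores extensions and other special cases).

```python
from pathlib import PurePosixPath, PureWindowsPath

# Illustrative subset of legacy DOS device names; not an authoritative list.
LEGACY_DEVICE_NAMES = (
    {"con", "nul", "prn", "aux"}
    | {f"com{i}" for i in range(1, 10)}
    | {f"lpt{i}" for i in range(1, 10)}
)


def has_legacy_device_name(path):
    """Windows: the final component matters, whatever directory it is in."""
    return PureWindowsPath(path).name.lower() in LEGACY_DEVICE_NAMES


def is_under_special_dir(path, roots=("/dev", "/proc", "/sys")):
    """Linux: only an absolute path starting at one of these roots matters."""
    parts = PurePosixPath(path).parts
    for root in roots:
        root_parts = PurePosixPath(root).parts
        if parts[:len(root_parts)] == root_parts:
            return True
    return False
```

With these definitions, "con" and r"C:\Users\me\Documents\CON" are both flagged on the Windows side, while on the Linux side "/dev/stdout" is flagged but "dev/stdout" and "/home/me/dev/stdout" are not.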

There’s also one more thing I think is important here: status quo. Perhaps there is a small argument that is_reserved should never have been added for Windows reserved names at all; perhaps there is a small argument that is_reserved should return True for /dev, /proc, /sys on Linux. But I think there are enough counterarguments that it is better to leave things as they are.

The “nul”, “con”, “conin$”, and “conout$” devices have legitimate uses, while “aux” and “com[1-9]” serial ports and “prn” and “lpt[1-9]” parallel ports aren’t used much anymore. Wherever possible, applications should use these devices explicitly in the “\\.\” or “\\?\” device namespaces, such as “\\.\nul”, not as if they exist in the current directory.

Unfortunately, the legacy CMD shell and other legacy applications may not support device paths in general, so Microsoft has to continue to make the legacy DOS device names available in the current directory. There are many batch scripts that assume this, such as by redirecting output to unqualified nul (e.g. spam.exe > nul), or reading from unqualified con (e.g. copy con spam). I just wish there was a way to opt out of this legacy behavior.

I never realized that there was a qualified path version of nul on Windows. TIL.

Honestly is_reserved() sounds a bit too opinionated without a spec or something to specifically fall back to.

Should it mean: ‘Can a file of some sort be made at this path?’ Or: ‘Should I use this path?’

Neither is fully convincing to me, though the first one is better in my eyes if we’re going this route.

On Linux, why would /dev be reserved? I don’t think we’re in the business of deciding if the path is ok or not, at least this way. Similarly to nul in Windows… only the OS/filesystem really knows what’s ok.

If we must have something, maybe it shouldn’t be is_reserved(); maybe it should be is_touchable(), which would touch and then delete the path if nothing exists there, and return true if something already does.
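
A minimal sketch of what such an is_touchable() might look like, assuming it is a free function taking a path string. This is not an existing or proposed API, just an illustration of the touch-then-delete idea.

```python
import os


def is_touchable(path):
    """Hypothetical check: can something exist at `path` on this filesystem?"""
    # Anything already there (file, directory, device, dangling symlink)
    # counts as "touchable" and is left alone.
    if os.path.lexists(path):
        return True
    try:
        # "x" mode creates the file and fails instead of truncating
        # if something appears at the path in the meantime.
        with open(path, "x"):
            pass
    except FileExistsError:
        return True
    except (OSError, ValueError):
        # ValueError covers paths with embedded null characters.
        return False
    os.remove(path)
    return True
```

Note that the answer depends on the current machine, filesystem, and permissions, and that a missing parent directory makes the check return False.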

All that said: my argument is against is_reserved() as a whole. But if it exists on os.path, I think it should exist on Path for completeness.

Charles, “\\.\nul” is basically what the API opens for “nul”. For example:

>>> nt._getfullpathname('nul')
'\\\\.\\nul'
>>> open('nul')
Breakpoint 0 hit
ntdll!NtCreateFile:
00007ffc`344503e0 4c8bd1          mov     r10,rcx
0:000> !obja @r8
Obja +0000000000000000 at 0000005338beee30:
        Name is \??\nul
        OBJ_CASE_INSENSITIVE

“\??\” is the NTAPI equivalent of WinAPI “\\.\” and “\\?\”. The Object Manager implements “\??” as a union of a per-logon local object directory “\Sessions\0\DosDevices\<logon ID>” and a global object directory “\GLOBAL??”. The local object directory contains object symlinks that get created for ‘devices’ in a user’s logon session (e.g. a user’s mapped drives and substitute drives, such as “W:” → “\??\C:\Windows\System32”). The global object directory is for system devices (e.g. the system volume “\GLOBAL??\C:”). The name “\??\nul” is the object symlink “\GLOBAL??\NUL”, which targets the NT device object “\Device\Null”.

Again, that only works if the code is running where that filesystem exists, with the additional restriction that all the parent paths have to exist too.

If the discussion is to continue it should really focus back on its core utility - checking nonexistent paths before anyone (including completely different programs on different machines) tries to use them on a real filesystem.

e.g. if you were putting them in a zip file.

What “again”? I suggested it in a different context (at the point of extraction).

The old function never had a defined purpose, and neither does the current one. So figuring out what this is needed for is a very good place to start, and then we can figure out if such a need belongs in the standard library or somewhere else (which is how we are meant to approach all proposals, though sometimes it gets missed).

(Note that a purpose is not “what it does” but rather “what it’s for”. We knew what the old one did, but nobody had said why it needed to do that.)

I think a function like is_portable() belongs on PyPI (at least for now). People will rely on it rather than adding their own logic, and getting it right will take a really long time.
