UNC path lost with Path constructor

On a POSIX system, a leading doubled path separator is supposed to be significant and is kept by the Path() constructor (while doubled separators elsewhere in the path are consolidated).

>>> Path("//foo//bar")
PosixPath('//foo/bar') #double slash retained
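
For anyone who wants to reproduce the POSIX behavior without a POSIX machine, here is a minimal sketch using PurePosixPath, which parses the same way on any platform. The variable names are just for illustration.

```python
from pathlib import PurePosixPath

# Exactly two leading slashes are preserved; the interior "//" is collapsed.
two = PurePosixPath("//foo//bar")

# Three or more leading slashes collapse to a single slash.
three = PurePosixPath("///foo//bar")

print(two)    # //foo/bar
print(three)  # /foo/bar
```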

But on Windows (where the double slash indicates a UNC path), this is lost.

>>> Path("//foo//bar")
WindowsPath('/foo/bar') #double slash lost
>>> Path(r"\\foo\\bar")
WindowsPath('/foo/bar')

This has the effect of destroying the UNC path. Is this a known bug/limitation, or something else? I can manually construct either the PureWindowsPath or WindowsPath with the double slash and it works properly. But I would prefer to use the Path() constructor.

What WindowsPath does is wrong, but for a different reason than you might think. The path "//foo//bar" isn’t a valid UNC path in Windows (spec), since there’s more than one slash between “foo” and “bar”. The path has a server (host) component named “foo”, but no share component. (A UNC provider could mount a filesystem directly on the server component, such as "//foo/", but this would violate the UNC spec.) The "//bar" remainder is a file path, which normalizes as "\\bar" (only one backslash). Indeed, WinAPI GetFullPathNameW() normalizes the path as "\\\\foo\\bar", which happens to be a valid UNC path for a share named “bar”.

Add a trailing ".." component, however, to see the difference. The invalid UNC path "//foo//bar/.." normalizes as "\\\\foo\\", while the valid UNC path "//foo/bar/.." normalizes as "\\\\foo\\bar" because the “bar” share is the root of the path.

WindowsPath turns the invalid UNC path "//foo//bar", which is an absolute path, into an unrelated rooted path "\\foo\\bar", which is a relative path in Windows. A rooted path is resolved against the drive or UNC share of the current working directory. WindowsPath should retain the UNC absolute nature of the path, whether the path is valid or not.

Okay, that’s what I get for trying to be cute. I thought my example was just showing how the separators are normally consolidated, and that the extra slash in the middle didn’t matter.

My actual issue was different, but I think your explanation helps that as well. I was letting Path combine the server and share components (which I assumed would be with a single separator). So what I was really doing in the code was:

>>> Path("//foo", "bar")
WindowsPath('/foo/bar')

Now I think I get that the lack of a share is causing it to normalize as a local path first, then it appends the later component. I need to just pass in the UNC path as a single element.

>>> Path("//foo/bar")
WindowsPath('//foo/bar/')

For some reason I never tried that, thinking that the Path concatenation would have been identical.
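
In case it helps someone else, this is roughly the workaround I ended up with, as a sketch rather than a recommendation. The helper name unc_path is my own invention, not part of pathlib. It uses PureWindowsPath so it can be tried on any platform; on Windows, Path should parse the same way.

```python
from pathlib import PureWindowsPath

def unc_path(server, share, *parts):
    """Hypothetical helper: build a UNC path from a server, a share, and more parts."""
    # Join server and share into one string before handing them to pathlib,
    # so the "//server/share" prefix is parsed as a single drive component.
    return PureWindowsPath(f"//{server}/{share}", *parts)

p = unc_path("foo", "bar", "baz.txt")
print(p.drive)          # \\foo\bar
print(p.is_absolute())  # True
print(p)                # \\foo\bar\baz.txt
```

Passing "//foo" and "bar" as separate arguments is the case that went wrong for me, so the helper never lets pathlib see the server without its share.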

It’s a bug in WindowsPath. For example:

>>> pathlib.Path('//foo')
WindowsPath('/foo')

The correct result retains the UNC absolute path status, even though it’s not a valid UNC path when there’s no share component. For example:

>>> nt._getfullpathname('//foo')
'\\\\foo'
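
Since nt._getfullpathname is only available on Windows, here is a rough, platform-independent sketch for checking whether pathlib kept the UNC prefix of a given string. The function lost_unc_prefix is a hypothetical helper, and the "looks like UNC" test is deliberately crude: it only checks for two leading separators.

```python
from pathlib import PureWindowsPath

def lost_unc_prefix(raw):
    """Hypothetical check: True if raw starts like a UNC path but pathlib dropped the prefix."""
    # Crude test: the string begins with two separators (forward or back slashes).
    looks_unc = len(raw) >= 2 and raw[0] in "\\/" and raw[1] in "\\/"
    # A retained UNC prefix shows up as a drive that starts with two backslashes.
    kept_unc = PureWindowsPath(raw).drive.startswith("\\\\")
    return looks_unc and not kept_unc

print(lost_unc_prefix("//foo/bar"))  # False: a server plus a share is parsed as UNC
print(lost_unc_prefix("//foo"))      # result depends on the Python version in use
```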

Is there an open issue on it? I was trying to look through the bug tracker, but didn’t know how to keep “UNC” from matching all the instances of “function” and gave up.

I searched for a related issue, but I don’t think there’s an open issue for the improper handling of invalid UNC paths in pathlib. There’s a tangentially related open issue (bpo-33898) about device paths of the form "\\\\?\\" and "\\\\.\\", which the UNC spec refers to as the “Win32API selector” and “device selector”. I opened the issue three and a half years ago. It has an unmerged PR that warrants attention. IIRC, the PR significantly updates parsing of UNC paths, which may indirectly resolve the problem of invalid UNC paths that lack a share or server component.