Symlink pointing outside of the source directory in shutil.copytree
The issue
Recently, I have been poking around with file systems and operations with directories more. In a discussion with my friend, we identified a potential issue: symlinks with relative paths that point outside the source directory.
Imagine a source folder, something like /folder1/folder2/source which contains a symlink /folder1/folder2/source/link pointing to a relative path ../../folder3/file.py, the expansion of which would result in /folder1/folder3/file.py.
That’s okay and a valid symlink, as long as we don’t try copying it over to another folder as-is. Let’s say we want to copy the contents of our folder /folder1/folder2/source into /folder1/dest. This would result in a symlink /folder1/dest/link with the same relative path ../../folder3/file.py, which would, however, try to expand into a path of /file.py, which does not have to be present.
The root of the FS is an extreme example; the same could happen in any other part of the tree, e.g., you want to copy into /folder4/folder5/folder6 which would result in a symlink /folder4/folder5/folder6/link expanding into /folder4/file.py which doesn’t have to be present, or can be a completely different file.
Proposed change
I propose checking the paths of these symlinks and replacing them with the absolute path if they point outside the source folder with a relative path.
In the above examples, this would result in symlinks pointing to /folder1/folder3/file.py as an absolute path, even in the destination directories.
A function I came up with (pasted from my project, edited to be usable outside of the class):
def symlink_in_source(source_path_abs, symlink_path) -> str:
"""Check if a symlink is pointing inside the source folder.
Returns absolute path in case symlink points outside of source folder.
Args:
source_path: path to source folder
symlink_path: path to symlink
Returns:
str: path to which the symlink should point
"""
symlink_path_abs = os.path.abspath(os.path.join(source_path, symlink_path))
logging.debug(f"Symlink absolute path: {symlink_path_abs}")
if os.path.abspath(source_path) == os.path.commonpath(
[os.path.abspath(source_path), symlink_path_abs]
): # alternatively `os.path.abspath(source_path) in symlink_path_abs[:len(os.path.abspath(source_path))]`
return symlink_path
else:
logging.debug(
f"Symlink path leads outside of source folder, using absolute path: {symlink_path_abs}"
)
return symlink_path_abs
Backwards compatibility
I believe this change is backward compatible.