Policies for tarfile.extractall, a.k.a. fixing CVE-2007-4559

Post-acceptance changes!

Eagle-eyed @hroncok reviewed the PEP and implementation, and found ambiguity in error handling. I intend to add the following summary & resolution to the PEP (with SC approval):

Errorlevel, and fatal/non-fatal errors

Currently, TarFile has an errorlevel argument/attribute, which specifies how errors are handled:

  • With errorlevel=0, documentation says that “all errors are ignored when using extract and extractall”. The code only ignores non-fatal and fatal errors (see below), so, for example, you still get TypeError if you pass None as the destination path.

  • With errorlevel=1 (the default), all non-fatal errors are ignored. (They may be logged to sys.stderr by setting the degug argument/attribute.) Which errors are non-fatal is not defined in documentation, but code treats ExtractionError as such. Specifically, it’s these issues:

    • “unable to resolve link inside archive” (raised on systems that do not support symlinks)
    • “fifo/special devices not supported by system” (not used for failures if the system supports these, e.g. for a PermissionError)
    • “could not change owner/mode/modification time”

    Note that, for example, file name too long or out of disk space don’t qualify. The non-fatal errors are not very likely to appear on a Unix-like system.

  • With errorlevel=2, all errors are raised, including fatal ones. Which errors are fatal is, again, not defined; in practice it’s OSError.

A filter refusing to extract a member does not fit neatly into the fatal/non-fatal categories.

  • This PEP does not change existing behavior. (Ideas for improvements are welcome in Discourse topic 25970.)
  • When a filter refuses to extract a member, the error should not pass silently by default.

To satisfy this, FilterError will be considered a fatal error, that is, it’ll be ignored only with errorlevel=0.

Users that want to ignore FilterError but not other fatal errors should create a custom filter function, and call another filter in a try block.

FWIW, I think errorlevel didn’t age well. If I designed it today, it would look like this: ExceptionGroup for TarFile.extractall


Additionally, the PEP currently says the data filter will:

  • Remove the group & other read permission (S_IRGRP|S_IROTH) if the owner doesn’t have it (stat.S_IRUSR).

This never happens, since the data filter sets read permissions for the owner. I found this in the implementation but forgot to update the draft PEP, and then copied the PEP to the docs.
I’ll remove the point. Thanks @hroncok again for finding it.

2 Likes