Policies for tarfile.extractall, a.k.a. fixing CVE-2007-4559

I think we should either make an exception to the general theme, or not backport at all.
Assuming the attribute is extraction_filter (it would work the same if it were hidden under _policy, as you suggest), and that the function implementing a “safe” policy is called tarfile.data_filter():

  • my_tarfile.extraction_filter = getattr(tarfile, 'data_filter', lambda x: x) says “be secure if possible, silently allow everything if not”, and it’s verbose enough to imply the author really means that.
  • my_tarfile.extraction_filter = (lambda x: x) says “trust this archive” and works everywhere.
  • There’s no good way to “always be secure”, short of using a third-party backport from PyPI or writing one yourself.
  • my_tarfile.extraction_filter = 'data' would make older/unpatched versions of Python silently ignore an explicit security-related request, which is terrible. So whatever else I do, the attribute will not accept the string shortcuts.
  • my_tarfile.extraction_filter = tarfile.data_filter would only work on patched versions of Python. That just shows that exposing the filter functions is a change in API surface.
    • If we allow that, why not allow extractall(filter=tarfile.data_filter) too?
    • If we don’t expose the function, there’s not much left from the feature to backport. (Maybe the warning by default, but we don’t add new warnings in patch releases.)
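To make the first and fourth bullets concrete, here is a minimal sketch using hypothetical stand-ins rather than the real tarfile module: choose_filter, fake_data_filter and UnpatchedTarFile are names invented for illustration, and the one-argument filter signature just mirrors the `lambda x: x` shorthand above, not a settled API.

```python
import types


def choose_filter(tarfile_module):
    """Return the module's data_filter if present, else a pass-through."""
    # "Be secure if possible, silently allow everything if not."
    return getattr(tarfile_module, "data_filter", lambda member: member)


def fake_data_filter(member):
    # Hypothetical placeholder for a real "safe" filter.
    return member


# Stand-ins for a patched and an unpatched tarfile module.
patched = types.SimpleNamespace(data_filter=fake_data_filter)
unpatched = types.SimpleNamespace()

chosen_patched = choose_filter(patched)
chosen_unpatched = choose_filter(unpatched)


class UnpatchedTarFile:
    """Stand-in for a TarFile class that predates the feature."""

    def extractall(self):
        # Nothing here ever looks at an extraction_filter attribute.
        return "extracted without any filter"


legacy = UnpatchedTarFile()
legacy.extraction_filter = "data"  # accepted silently, then ignored
legacy_result = legacy.extractall()
```

The assignment on `legacy` raises no error, which is exactly why a string shortcut on the attribute is dangerous: the caller asked for safety and has no way to notice that nothing happened.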

It would be nice to allow third-party code to write:

if hasattr(tarfile, 'data_filter'):  # (or some other feature check)
    # new way of doing things
    my_tarfile.extractall(filter='data')
else:
    # remove this after a deprecation period
    warn('Extracting may be unsafe, consider updating Python')
    my_tarfile.extractall()

rather than

if hasattr(tarfile, 'data_filter'):
    my_tarfile.extraction_filter = 'data'
    my_tarfile.extractall()
    # XXX: reset the filter? Do we need to be thread-safe here?
else:
    # remove this after a deprecation period
    warn('Extracting may be unsafe, consider updating Python')
    my_tarfile.extractall()
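For comparison, the first pattern wraps up into a small helper with no shared state to reset. This is only a sketch, assuming that the filter argument to extractall() and tarfile.data_filter are exposed as proposed here; extract_with_best_filter is a hypothetical name, and the demo archive exists only to exercise it.

```python
import os
import tarfile
import tempfile
import warnings


def extract_with_best_filter(archive_path, dest):
    """Extract into dest; return True if the 'data' filter was used."""
    with tarfile.open(archive_path) as tf:
        if hasattr(tarfile, "data_filter"):  # feature check
            tf.extractall(dest, filter="data")
            return True
        # Fallback for Pythons without the feature.
        warnings.warn("Extracting may be unsafe, consider updating Python")
        tf.extractall(dest)
        return False


# Build a tiny archive so the helper has something to extract.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "hello.txt")
with open(src, "w") as f:
    f.write("hi\n")

archive = os.path.join(workdir, "demo.tar")
with tarfile.open(archive, "w") as tf:
    tf.add(src, arcname="hello.txt")

out = os.path.join(workdir, "out")
os.makedirs(out)
used_filter = extract_with_best_filter(archive, out)
extracted = os.path.join(out, "hello.txt")
```

Because the filter is passed per call, the helper never mutates the TarFile object, so the "reset the filter?" and thread-safety questions from the second variant do not come up.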