-X importtrace to supplement -X importtime for loaded modules

Hi! I sent this proposal to the Python Ideas mailing list and was redirected here, so apologies (and thanks) to anyone who’s already read this :slight_smile:

I’d like to propose an adjacent interpreter flag to -X importtime: -X importtrace (open to alternative naming).

While -X importtime is incredibly useful for analyzing module import times, by design, it doesn’t log anything if an imported module has already been loaded. -X importtrace would provide additional output for every module that’s already been loaded:

>>> import uuid
import time: cached    | cached     |   _io
import time: cached    | cached     |   _io
import time: cached    | cached     |   os
import time: cached    | cached     |   sys
import time: cached    | cached     |   enum
import time: cached    | cached     |     _io
import time: cached    | cached     |     _io
import time: cached    | cached     |     collections
import time: cached    | cached     |     os
import time: cached    | cached     |     re
import time: cached    | cached     |     sys
import time: cached    | cached     |     functools
import time: cached    | cached     |     itertools
import time:       151 |        151 |     _wmi
import time:     18290 |      18440 |   platform
import time:       372 |        372 |   _uuid
import time:     10955 |      29766 | uuid

In codebases with convoluted or poorly managed import graphs (and, consequently, workloads that suffer from long import times), the ability to record every path to an expensive dependency, not just the first one imported, can speed up refactoring and make this type of issue easier to identify at scale. More generally, this flag would provide a more efficient way to track runtime dependencies.

As a proof of concept, I was able to hack this functionality into -X importtime by adding a couple of lines to import_ensure_initialized in Python/import.c (hence the output above). A separate flag is probably desirable to preserve backwards compatibility. Maybe -X importtrace would show only cached imports, and you’d supply both flags to get the full output, for maximum flexibility?
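For anyone who wants to experiment without patching the interpreter, here is a rough pure-Python approximation of the idea. It is not the C change described above: it just wraps builtins.__import__ and records absolute imports that are already present in sys.modules. The name trace_cached_imports is made up for this sketch, and it produces no timing or nesting information.

```python
import builtins
import contextlib
import sys


@contextlib.contextmanager
def trace_cached_imports():
    """Record absolute imports that are already in sys.modules.

    Hypothetical helper for illustration only; it does not reproduce
    the -X importtime output format.
    """
    hits = []
    original_import = builtins.__import__

    def traced_import(name, globals=None, locals=None, fromlist=(), level=0):
        # Relative imports (level > 0) are skipped here; handling them
        # would require resolving the name against the calling package.
        if level == 0 and name in sys.modules:
            hits.append(name)
        return original_import(name, globals, locals, fromlist, level)

    builtins.__import__ = traced_import
    try:
        yield hits
    finally:
        # Always restore the real import function.
        builtins.__import__ = original_import


with trace_cached_imports() as hits:
    import sys  # already loaded, so this is recorded as a cached import
```

After the with block, hits holds the names of the imports that were served from the cache.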

Looking forward to your feedback,


I’ve had situations where this would have been very handy; given the implementation is quite simple, I’d support adding it.

Over on the mailing list @methane suggested -X importtime=2 to avoid needing a second -X flag; I like that a lot as well, since this is a minor extension of importtime; it’s like importtime verbosity level 2.

I’ve started drafting the PR and there are two small issues with this approach:

  1. Users might already be spelling the option -X importtime=true (or with any other string), since the value is currently unchecked. Restricting it would technically break backwards compatibility; we’d be retroactively constraining valid option usage.

  2. It’s unclear how to validate the corresponding environment variable PYTHONPROFILEIMPORTTIME: should we error if it isn’t “1” or “2”? Should we only enable cached import tracing if the value is “2”?

While I initially thought augmenting this flag was the best approach, I think adding a new one could markedly simplify these issues :frowning:

Since the docs don’t say that -X importtime=<anything> is valid, it seems that changing the semantics is somewhat defensible. To be concrete:

  • -X importtime or -X importtime=1 for the current behavior
  • -X importtime=2 for the new, “show already loaded” behavior
  • -X importtime=<anything else> is an error
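To make the three cases above concrete, here is a minimal Python sketch of the validation. It assumes the value arrives the way sys._xoptions reports it (True when the flag is given without a value, otherwise the string after the equals sign). The function name parse_importtime_option is hypothetical, and the real check would live in the interpreter's C configuration code.

```python
def parse_importtime_option(value):
    """Map a -X importtime value to a verbosity level (1 or 2).

    `value` is True for a bare `-X importtime`, or the string given
    after `=`. Hypothetical helper, for illustration only.
    """
    if value is True or value == "1":
        return 1  # current behavior
    if value == "2":
        return 2  # also show already-loaded modules
    raise ValueError(f"invalid value for -X importtime: {value!r}")


for candidate in (True, "1", "2", "true"):
    try:
        print(candidate, "->", parse_importtime_option(candidate))
    except ValueError as exc:
        print(candidate, "-> error:", exc)
```

Under these rules, -X importtime=true is rejected, which is exactly the backwards-compatibility concern raised earlier.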

I am less sure what would be reasonable for the environment variable, though. The docs currently say

If this environment variable is set to a non-empty string, Python will show how long each import takes.

Retroactively making this only support "1" or "2" seems aggressive, but maybe it’s fine? Other suggestions:

  • "2" is special, but any other value gives the old behavior
  • A separate environment variable

We can provide extended behavior for a special value (e.g. 2) without restricting the valid values to 1 and 2. Since this is a debug-only tool, I don’t think it’s a big deal if someone who for some reason is already setting it to 2 now gets some additional output. But I don’t think we should make values other than 1 or 2 an error. This may seem weird, but practically I think it’s fine and a reasonable concession to backward compatibility.
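A small Python sketch of that lenient rule, applied to the environment variable: "2" turns on the extended output, any other non-empty string keeps today's behavior, and an unset or empty variable leaves profiling off. The function name importtime_level_from_env and the 0/1/2 return convention are assumptions made for this example, not part of any existing API.

```python
def importtime_level_from_env(environ):
    """Return 0 (off), 1 (current output) or 2 (also show cached imports).

    Hypothetical helper; `environ` is any mapping, such as os.environ.
    """
    value = environ.get("PYTHONPROFILEIMPORTTIME", "")
    if not value:
        return 0  # unset or empty: profiling disabled
    if value == "2":
        return 2  # the one special value
    return 1      # any other non-empty string: old behavior, no error


print(importtime_level_from_env({}))
print(importtime_level_from_env({"PYTHONPROFILEIMPORTTIME": "yes"}))
print(importtime_level_from_env({"PYTHONPROFILEIMPORTTIME": "2"}))
```

The only users who would see a change are those already setting the variable to exactly "2".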