In the discussion around embedding a lockfile in a Python script using inline metadata, it was pointed out that some “sneaky” code could hide inside what could be many lines of mechanical metadata noise.
It only just occurred to me to double-check the spec, and it says this:
Every line between these two lines (# /// TYPE and # ///) MUST be a comment starting with #.
So, huzzah, sneaky code violates the spec, but… now what? The spec, read literally, tells us that “Unclosed blocks MUST be ignored.”, so I think right now it is saying “ignore the metadata”. That is all well and good, but I think it needs to be tightened now that we are seeing more examples of the danger/harm.
Putting on my archeological hat (it’s quite dusty), I see that the original PEP 723 acceptance thread mentions unclosed blocks, but I really couldn’t work out whether the result was intentional or not.
It looks like “error” was on people’s minds, but simplicity in the handling (using a regex) “won out”.
So, I think we’re still spiritually correct if we changed the spec to something like “tools SHOULD error on unclosed blocks, otherwise tools MUST ignore them”.
That’s totally spec compliant. Currently, the existence of non-standard block types is a warning flag, but if we add a block type that’s intended to contain large amounts of opaque data, making type_1 be that type will successfully hide the sneaky code (with no need to exploit edge cases around unclosed blocks).
This is no different than any other case of putting “sneaky” code in a huge comment block, but people currently see large opaque comment blocks as a red flag (look at the huge explanatory comment at the top of get-pip for an example), so the new aspect here is if we normalise large comments containing opaque data.
You’re misinterpreting the spec here, I think. Here’s an example of an unclosed block:
Note the missing # /// closing line. The print line is not part of the block, because it’s not a comment line. What the spec is saying is that the script data above must be ignored (i.e., must be assumed to just be “normal” comments) because it’s unclosed.
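As a rough illustration of that behaviour, here is a minimal Python sketch. The pattern is modelled on the regex in the spec's reference implementation as best I recall it, so treat it as an assumption rather than a quotation. The `find_blocks` helper and the two sample scripts are hypothetical, made up for this example.

```python
import re

# Assumption: modelled on the regex from the spec's reference implementation.
# A block only matches if it has an opening line, one or more comment lines,
# and a closing "# ///" line.
BLOCK_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)


def find_blocks(script: str) -> list[str]:
    """Return the TYPE of every complete metadata block found in the script."""
    return [m.group("type") for m in BLOCK_RE.finditer(script)]


# Hypothetical sample scripts: identical except for the closing "# ///" line.
CLOSED = '# /// script\n# dependencies = ["requests"]\n# ///\nprint("hello")\n'
UNCLOSED = '# /// script\n# dependencies = ["requests"]\nprint("hello")\n'

print(find_blocks(CLOSED))    # the closed block is recognised
print(find_blocks(UNCLOSED))  # the unclosed block is treated as plain comments
```

With a regex-based reader like this, the unclosed case needs no special handling: the comment lines simply never match, so they stay ordinary comments.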
The point here was (I believe) to keep the rules on what constitutes a metadata block tight, to avoid false positives. We’re reserving specific forms of comment for a standard purpose - that’s always dangerous because users are explicitly allowed to put whatever they want in their comments, so we make sure we only recognise the smallest possible set of formats, and we continue to treat everything else as normal comments (i.e., ignore them!).
I disagree. Not so much because I think it’s important to accept scripts with unclosed blocks, but simply because it doesn’t actually solve the issue, as I explained above. And given that it doesn’t fix anything, I see no reason to reject what is technically perfectly valid Python code.
I could accept a change to say that tools MAY warn if they ignore an unclosed block with a standardised type (on the assumption that the user intended to include a closing line but forgot). I could even be persuaded to change “MAY warn” to “MAY report an error”, if the case is strong enough.
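A tool that wanted to implement the “MAY warn” option could do something along these lines. This is only a sketch under stated assumptions: the block pattern is modelled on the spec's reference regex, `STANDARD_TYPES` is a hypothetical set containing just `script`, and the helper names are invented for illustration.

```python
import re
import warnings

# Assumption: modelled on the regex from the spec's reference implementation.
BLOCK_RE = re.compile(
    r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"
)
# Matches just an opening line, whether or not the block is ever closed.
OPENER_RE = re.compile(r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$")

# Assumption: the set of standardised block types a tool chooses to check.
STANDARD_TYPES = {"script"}


def unclosed_standard_types(script: str) -> list[str]:
    """List standardised TYPEs whose opening line never forms a complete block."""
    closed_spans = [m.span() for m in BLOCK_RE.finditer(script)]
    found = []
    for opener in OPENER_RE.finditer(script):
        inside_closed = any(
            start <= opener.start() < end for start, end in closed_spans
        )
        if not inside_closed and opener.group("type") in STANDARD_TYPES:
            found.append(opener.group("type"))
    return found


def warn_on_unclosed(script: str) -> None:
    """Warn, but do not fail, for each ignored unclosed standardised block."""
    for block_type in unclosed_standard_types(script):
        warnings.warn(f"ignoring unclosed '{block_type}' metadata block")
```

Openers with a non-standard type are deliberately left alone here, so they continue to be treated as normal Python comments.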
I’m against any form of error for a block that uses a metadata type that’s not standardised - we should assume they are just normal Python comments.