Request for feedback: module "pyv" and ".pyv" files

Here is a module for loading and running such files.

A “.pyv” file looks like this:

print(
$ version >= 3.11
"new_python"
$ _
"old_python"
$
)

The “$” directives control which code is seen by the Python interpreter.

The module can solve syntax incompatibilities between different Python versions. For example, the code below is invalid on Python below 3.14:

if sys.version_info >= (3, 14):
    username = t"{name}"
else:
    username = f"{name}"

Even though the condition is False at runtime, Python below 3.14 will still raise a SyntaxError, because the interpreter must compile the whole file, and older versions cannot parse t-string syntax.
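This compile-before-run behaviour is easy to demonstrate without t-strings: any syntax error in a dead branch fails at compile time.

```python
# Python compiles a whole file before executing any of it, so a syntax
# error inside a never-taken branch still breaks compilation.
source = """
if False:
    this line is not valid Python !!!
"""
try:
    compile(source, "<demo>", "exec")
    compiled = True
except SyntaxError:
    compiled = False
print(compiled)  # False: the dead branch still breaks compilation
```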

Now, in a “.pyv” file, you can write this instead:

$ version >= 3.14
username = t"{name}"
$ _
username = f"{name}"
$

Note

The “$” directives support “version” (major and minor only), “platform”, and “implement” (e.g. “cpython”, “pypy”).
“$ _” means: if the condition above is false, process the code below instead.
A bare “$” ends the conditional block.
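To make the behaviour concrete, here is a minimal sketch of how such a directive filter might work. This is my reading of the rules above, not pyv’s actual implementation: it handles only non-nested blocks and only the “version” condition, and all names are hypothetical.

```python
import re
import sys

def _cond_true(cond: str) -> bool:
    # Parse 'version >= 3.14' textually, so that '3.10' keeps its minor
    # digit (evaluating it as a float literal would turn 3.10 into 3.1).
    m = re.fullmatch(r"version\s*(>=|<=|==|>|<)\s*(\d+)\.(\d+)", cond)
    if not m:
        raise SyntaxError(f"unsupported directive: $ {cond}")
    target = (int(m.group(2)), int(m.group(3)))
    current = sys.version_info[:2]            # major and minor only
    return {"<": current < target, "<=": current <= target,
            "==": current == target, ">": current > target,
            ">=": current >= target}[m.group(1)]

def preprocess(source: str) -> str:
    out = []
    emitting = True   # are we keeping the current lines?
    taken = True      # has any branch of the current block matched?
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("$"):
            cond = stripped[1:].strip()
            if cond == "":                    # bare '$' ends the block
                emitting = taken = True
            elif cond == "_":                 # '$ _' is the else branch
                emitting, taken = not taken, True
            else:                             # e.g. '$ version >= 3.14'
                emitting = taken = _cond_true(cond)
        elif emitting:
            out.append(line)
    return "\n".join(out)
```

Running `preprocess` over the example above on a current interpreter keeps only the branch whose condition holds.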

2 Likes

What would you do down the line with username? The f-string syntax creates a str object, while the t-string syntax creates a template object. So you could use username only in contexts where both are allowed, but then you could just use an f-string.

So, I think there are only very rare use cases for this kind of preprocessing. In the rare cases where one really wants to use a new syntax feature and also provide a fallback, one can move the corresponding parts to different files and do conditional imports:

if sys.version_info >= (3, 14):
    import my314script.username as username
else:
    import my313script.username as username
1 Like

However, some modules insert their own import hooks into sys.meta_path, often with sys.meta_path.insert(0, hook). If a third-party hook finds the modules first, “my314script” and “my313script” will not be your scripts. So this solution is not stable.
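For illustration, this is how a hook at the front of sys.meta_path can intercept a module name before the normal import machinery sees it (toy code; “my314script” is just the hypothetical name from the snippet above):

```python
import sys
from importlib.machinery import ModuleSpec

class ShadowFinder:
    """Toy meta-path finder that claims the name 'my314script' for itself."""
    def find_spec(self, fullname, path=None, target=None):
        if fullname == "my314script":
            return ModuleSpec(fullname, self)   # this finder is also the loader
        return None
    def create_module(self, spec):
        return None                             # use the default module object
    def exec_module(self, module):
        module.username = "shadowed!"           # not your script's contents

sys.meta_path.insert(0, ShadowFinder())
import my314script                              # resolved by ShadowFinder
print(my314script.username)  # shadowed!
```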

This is ridiculous. Following your logic, no import would be stable. Even objects in __builtins__ can be overwritten, so using them would not be stable either.

But back to my point:

You did not answer this. If you do not have an answer, you have no use case. You asked for feedback. This is my feedback.

It is not only a question of which module gets imported: the increased code size and the difficulty of reviewing the code are also disadvantages of “creating too many modules”.

That doesn’t matter. Many syntax constructs can conflict across versions.

Interesting. I’d describe this as a pre-processor, not a compiler. Many people have tried to build macros into Python before this one.

I do find the C preprocessor’s #ifdefs useful. But even without such directives, writing cross-platform Python is much, much easier than writing cross-platform C. So I strongly prefer the pure Python syntax:

if sys.version_info >= (3, 14):
    username = t"{name}"
else:
    username = f"{name}"

to the $s (which look like Bash sessions to me), to the extent that I’ll happily live without t-strings when supporting 3.13 and below.

1 Like

If the new syntax brings a performance optimization, maybe the “pyv” way is better (new Python versions can run the final code faster).

If.

The less said about code that runs text using exec the better. But as I said, the interesting thing for me is the preprocessor:

Why don’t you add a tool (using pathlib.Path.write_text) that runs the pre-processor over a .pyv file and produces a .py file that can be imported, not only run as a script from the command line? That would remove the requirement for end users to install pyv themselves; even if it’s only a dependency, they must associate it with .pyv files or invoke it directly to launch their .pyv scripts.
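Such a build step could be as small as this sketch (all names here are hypothetical, and an identity transform stands in for the actual preprocessor):

```python
from pathlib import Path

def build_pyv(src: Path, preprocess) -> Path:
    """Hypothetical build step: run a .pyv file through a preprocessing
    callable and write a sibling .py that any interpreter can import."""
    dst = src.with_suffix(".py")
    dst.write_text(preprocess(src.read_text(encoding="utf-8")), encoding="utf-8")
    return dst

# Usage sketch: the lambda is a placeholder for the real preprocessor.
import tempfile
with tempfile.TemporaryDirectory() as d:
    pyv_file = Path(d) / "example.pyv"
    pyv_file.write_text("username = 'name'\n", encoding="utf-8")
    py_file = build_pyv(pyv_file, lambda s: s)
    print(py_file.name)  # example.py
```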

Avoid all that, and it could be a useful build tool for navigating platform-specific headaches, for projects that ship different wheels for different targets.

2 Likes

Of course it matters, since you only provide contrived code snippets. You outright refuse to provide even a single example where your preprocessor would make sense.

C and C++ have a preprocessor with the ability to conditionally include code. So some people might think that Python needs that too. But in C it is not possible to do the following things without the preprocessor:

  • (conditionally) include headers,
  • conditionally define or declare functions (this cannot occur within an if),
  • conditionally call functions depending on the libraries available (because name lookup happens even for functions in an if (false) { ... } block).

Incompatible syntax is rare; we hit this when transitioning to C++11, when old compilers could not yet understand move constructors. We only provided those for C++11-compatible compilers. Since there were copy constructors as well, our library also ran under C++98, just sometimes a bit slower.

Now back to Python: here, we can conditionally import, conditionally define functions and conditionally call functions using standard if constructs. Name lookup only happens for the parts that are actually executed. The only constructs that cannot be hidden in an if block are those that contain new, incompatible syntax. You say that many syntax constructs can conflict across versions, but this is generally not true in Python. The syntax does not change that much.

  • Python 3.10 introduced pattern matching. But if I wanted my code to also run under 3.9, I would do everything with if statements. I would never choose to maintain two version-dependent code paths that need to be kept in sync.
  • In Python 3.14 you can catch multiple exceptions without using parentheses. If I want the code to also run on 3.13, I just use parentheses.
  • Python 3.15 likely gets lazy imports. If I want my code to also run under 3.14 (but without lazy importing my libraries), I just write __lazy_modules__ = ["json"]; import json instead of lazy import json.
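The second bullet can be checked concretely: the parenthesized form works on every Python 3 version, so no preprocessing is needed.

```python
# The parenthesized except clause runs on every Python 3 version;
# only the parenthesis-free form (except ValueError, TypeError:) needs 3.14+.
def parse_int(value):
    try:
        return int(value)
    except (ValueError, TypeError):
        return None

print(parse_int("42"), parse_int("x"), parse_int(None))  # 42 None None
```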
1. Can you ensure that these code paths have similar performance?
2. This is wrong:

In 3.14:

While in 3.15:

Finally, to be clear: this is a tooling and distribution problem, not a Python language design proposal.

No, this is not wrong! It is exactly as I wrote: the code __lazy_modules__ = ["json"]; import json will do a lazy import in Python 3.15 (as specified in PEP 810) and a normal import in Python 3.14. This is the best one can get, since the new lazy imports are not present in 3.14; your preprocessor cannot change that either. And since there is __lazy_modules__, I do not have to use your preprocessor to write code like

$ version >= 3.15
lazy import json
$ _
import json
$

Yes, I know.

Still this question: can you ensure that the compatible way has performance similar to the new-syntax way?

If a module wants more performance on new Python (or the best performance on all supported versions), the “pyv” way can achieve it.

1 Like

OK. Added in 1.0.2:

pyv.witer_pyv_to_py(source: str, filename: str, env=None) -> None: ...
pyv.witer_pyv_file_to_py(path: str, file_name: str, env=None) -> None: ...

Good stuff. I think simply writing a .py file will prove valuable for you, once you add some more rigorous tests with assert statements (going beyond the current smoke test).

Regarding performance: I suppose I could make it lazy, but I just put all the platform- and version-specific patches, such as:

if sys.version_info >= (3, 14):

at the top of each module that needs them, or even group common ones in a special wrapper module. Certainly never in a hot loop. So the check only runs once, on import.
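A sketch of that pattern, using nothing beyond the standard library: the version check runs once at module load, and callers pay nothing per call.

```python
import sys

# The check runs once, when the module is imported; merge() itself
# contains no version test at all.
if sys.version_info >= (3, 9):
    def merge(a: dict, b: dict) -> dict:
        return a | b                  # dict union operator, new in 3.9
else:
    def merge(a: dict, b: dict) -> dict:
        out = dict(a)
        out.update(b)
        return out

print(merge({"x": 1}, {"y": 2}))      # {'x': 1, 'y': 2}
```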

Who’s writing code that checks the Python version, or whether the platform is Windows, thousands of times per second (is it going to change?! :wink: )? And for those who do, will pre-processing away those checks to remove an if statement really give much of a boost?

Improved performance is a non-goal. I wouldn’t make the pitch about that.

1 Like

It is not only about removing “if” checks; the new syntax itself can bring performance benefits.

Moreover, new syntax can fix problems that old syntax has:

def how_high(obj):
    if obj.height < 500:
        return "low"
    elif 500 <= obj.height <= 1000:
        return "middle"
    elif obj.height > 1000:
        return "high"
    else:
        return "invalid height" 

and

def how_high(obj):
    match obj.height:
        case h if h < 500:
            return "low"
        case h if 500 <= h <= 1000:
            return "middle"
        case h if h > 1000:
            return "high"
        case _:
            return "invalid height"

With “if-elif-else”, the value is re-read for every condition evaluated. If the value is not thread-safe (say it is 1000 while evaluating “obj.height < 500” but 400 while evaluating the next condition), the result can be “invalid height” even though the height was valid the whole time. In “match-case”, the value is read once and fixed. The same applies to many other objects (for example, objects with a complex __getattribute__ method). It means that sometimes the new syntax is not just syntactic sugar.

These days, new Python syntax needs many advantages to be accepted (easy to understand, faster to run than the original way), so the new syntax by itself can give much of a boost.

I’m pretty sure the CPython core devs have at the very least considered caching obj.height, and probably have implemented this already. But feel free to run dis on those examples, and prove me wrong.

[Edit] I suppose it could be an expensive @property, not a simple name reference. Nonetheless, I don’t believe claims about performance benefits without benchmarks.

Here is the dis.dis output for the two functions:

  1           RESUME                   0

  2           LOAD_FAST_BORROW         0 (obj)
              LOAD_ATTR                0 (height)
              LOAD_CONST               0 (500)
              COMPARE_OP              18 (bool(<))
              POP_JUMP_IF_FALSE        3 (to L1)
              NOT_TAKEN

  3           LOAD_CONST               1 ('low')
              RETURN_VALUE

  4   L1:     LOAD_CONST               0 (500)
              LOAD_FAST_BORROW         0 (obj)
              LOAD_ATTR                0 (height)
              SWAP                     2
              COPY                     2
              COMPARE_OP              58 (bool(<=))
              POP_JUMP_IF_FALSE       10 (to L2)
              NOT_TAKEN
              LOAD_CONST               2 (1000)
              COMPARE_OP              58 (bool(<=))
              POP_JUMP_IF_FALSE        5 (to L3)
              NOT_TAKEN
              NOP

  5           LOAD_CONST               3 ('middle')
              RETURN_VALUE

  4   L2:     POP_TOP

  6   L3:     LOAD_FAST_BORROW         0 (obj)
              LOAD_ATTR                0 (height)
              LOAD_CONST               2 (1000)
              COMPARE_OP             148 (bool(>))
              POP_JUMP_IF_FALSE        3 (to L4)
              NOT_TAKEN

  7           LOAD_CONST               4 ('high')
              RETURN_VALUE

  9   L4:     LOAD_CONST               5 ('invalid height')
              RETURN_VALUE

and

  1           RESUME                   0

  2           LOAD_FAST_BORROW         0 (obj)
              LOAD_ATTR                0 (height)

  3           COPY                     1
              STORE_FAST_LOAD_FAST    17 (h, h)
              LOAD_CONST               0 (500)
              COMPARE_OP              18 (bool(<))
              POP_JUMP_IF_FALSE        4 (to L1)
              NOT_TAKEN

  4           POP_TOP
              LOAD_CONST               1 ('low')
              RETURN_VALUE

  5   L1:     COPY                     1
              STORE_FAST               1 (h)
              LOAD_CONST               0 (500)
              LOAD_FAST                1 (h)
              SWAP                     2
              COPY                     2
              COMPARE_OP              58 (bool(<=))
              POP_JUMP_IF_FALSE       10 (to L2)
              NOT_TAKEN
              LOAD_CONST               2 (1000)
              COMPARE_OP              58 (bool(<=))
              POP_JUMP_IF_FALSE        5 (to L3)
              NOT_TAKEN
              POP_TOP

  6           LOAD_CONST               3 ('middle')
              RETURN_VALUE

  5   L2:     POP_TOP

  7   L3:     STORE_FAST_LOAD_FAST    17 (h, h)
              LOAD_CONST               2 (1000)
              COMPARE_OP             148 (bool(>))
              POP_JUMP_IF_FALSE        3 (to L4)
              NOT_TAKEN

  8           LOAD_CONST               4 ('high')
              RETURN_VALUE

  9   L4:     NOP

 10           LOAD_CONST               5 ('invalid height')
              RETURN_VALUE

In the “if-elif-else” implementation, LOAD_ATTR appears three times, while in the “match-case” implementation it appears only once.

In fact, the data in the syntax PEPs support this conclusion. I will try to analyse other third-party modules to work out whether pyv can provide performance (or security) benefits on higher versions.

Great stuff - thanks. I stand corrected.

The execution time of both versions should still be measured to prove your point, though. Until now, I didn’t consider increased speed to be among the benefits of match statements and all the pattern-matching wizardry they do under the hood.

You can simply assign the evaluated value to a name before reusing it repeatedly, avoiding that performance penalty while staying compatible with older Python versions:

def how_high(obj):
    h = obj.height
    if h < 500:
        return "low"
    elif 500 <= h <= 1000:
        return "middle"
    elif h > 1000:
        return "high"
    else:
        return "invalid height" 

With the preprocessor directives you propose, you’re writing the same logic twice in the same file, cluttering up the code and violating the DRY principle, making it harder both to read and to maintain.

I therefore don’t think such directives are worth the largely negligible performance gain, at the cost of readability and maintainability, when we can simply write equivalent code with the syntax supported by the minimum Python version we’re targeting.

2 Likes

In this “if-elif-else” code, “h” is only assigned once, which means that if another part of the function uses another thread to change “h”, it is still not safe. But in the “match-case” code, the “h” in each case is bound independently.

So, as I said, you can’t always write code with old syntax that has exactly the same effect as the new syntax. On new versions, new syntax can bring more benefit, and on old versions, old syntax keeps the code valid. That is what “pyv” does.