PEP 741: Python Configuration C API (second version)

Intuitively, if I’m configuring CPython myself, it’s unlikely that I would like to pass my own argc, argv to Python. After all, these are arguments to my application, not to the Python interpreter.

It’s theoretical in the sense that an embedded Python interpreter will typically not get its configuration from the user’s environment nor from command line arguments. It will get a more or less hard-coded configuration driven by the embedding application’s own semantics.


Here is a major update of PEP 741 to address the Steering Council’s review: “PEP 741: Address Steering Council's review” (python/peps pull request #3789, by vstinner).

  • Remove string types other than UTF-8.
  • Exclude the API from the limited C API.
  • Remove the explicit preconfiguration.
  • Remove the rationale about the limited C API / stable ABI.

It should also address most concerns of @steve.dower and @pitrou.


I merged my PR updating the PEP.

@steve.dower and @pitrou: Are you ok with the updated PEP?

The updated proposal is fine by me. I haven’t done a full review of the PEP text.

This might be a good opportunity to make some statement about the intent for the PEP 587 API. Currently, 741 just says that the 587 API is not changed, but doesn’t say whether future additions should/must be added to both. (Personally I think it’s fine to freeze the 587 API now, discourage its use, warn that behaviour may change in new releases[1], but don’t schedule removal.)


  1. e.g. if we one day make preinit not change global process settings, or make it change different global settings.

It sounds like the definition of soft deprecation: Glossary — Python 3.14.0a0 documentation

A soft deprecation can be used when using an API which should no longer be used to write new code, but it remains safe to continue using it in existing code.

I don’t expect additions to the PEP 587 PyConfig API. But I’m not sure the API needs to be strictly frozen. For example, the PyConfig structure will continue to evolve.

Are there Linux systems actually in use with latin1 locale encodings or such?

On Linux, there is precedent for treating locale encodings as just a mistake in the history of computing, ignoring them and just using UTF-8. For example, Rust will always decode argv in UTF-8:

$ cat argecho.py 
import sys

print([hex(b) for b in sys.argv[1].encode("utf8")])


$ # Terminal sends é in UTF-8 (0xC3 0xA9); Python decodes it as latin1, giving Ã© (0xC3 0x83 0xC2 0xA9 in UTF-8).
$ LC_ALL=sv_SE.iso88591 python argecho.py é
['0xc3', '0x83', '0xc2', '0xa9']


$ cat src/main.rs 
fn main() {
    let arg: String = std::env::args().nth(1).unwrap();
    println!("{:x?}", arg.as_bytes());
}


$ # Terminal sends é in UTF-8 (0xC3 0xA9), Rust reads it as UTF-8 in spite of the locale.
$ LC_ALL=sv_SE.iso88591 cargo run é
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.00s
     Running `target/debug/foo 'é'`
[c3, a9]

Likewise, I just tried creating a folder named é in the GNOME file manager launched in this latin1 locale (LC_ALL=sv_SE.iso88591 nautilus), and it was created with the name as UTF-8. The environment variable to ask for interpreting file names in the locale encoding is called G_BROKEN_FILENAMES 🙂
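For anyone who wants to reproduce the byte sequences from the Python transcript without switching locales, here is a minimal sketch. It assumes the terminal sends UTF-8 and the receiving side decodes the argument as Latin-1; the helper name `misdecode` is made up for illustration.

```python
def misdecode(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode text as UTF-8 (what the terminal sends), then decode the
    bytes with another encoding (what a latin1 locale asks Python to use)."""
    return text.encode("utf-8").decode(wrong_encoding)


sent = "\u00e9".encode("utf-8")   # the character é as the terminal sends it
seen = misdecode("\u00e9")        # what ends up in sys.argv under latin1

print([hex(b) for b in sent])                    # ['0xc3', '0xa9']
print([hex(b) for b in seen.encode("utf-8")])    # ['0xc3', '0x83', '0xc2', '0xa9']
```

The second list matches the output of argecho.py: one character became two, and re-encoding them as UTF-8 gives four bytes.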


Mostly by accident, but there are a number of ways you can end up in a process that isn’t correctly set up. The locale coercion we added some years ago deals with it, but in a way that assumes that Python controls the entire process (i.e. it changes process-global state).

If you have an app that already successfully parses the command line or environment, you won’t want Python coming in and changing your settings. This is why I’ve been arguing that the embedding API should leave them alone and assume the host app has sorted it out already.


I think the edits do a decent job of cutting the PEP down to just the bits that everyone has agreed are a good (or at least reasonable) idea, but I’m still not entirely clear on the way the new API is expected to interact with the PEP 587 preconfiguration API.

I don’t think the note in the abstract, “This PEP unifies also the configuration of the Python preinitialization and the Python initialization in a single API.”, is true anymore (from what I can see in the PEP, use_environment is the only preconfiguration field that can be set directly).

The values of the other preconfiguration fields are dictated by the choice of creating a Python configuration or an Isolated configuration. While the parallels with PEP 587 are useful, I do wonder if it might be better to give these names like PyInitConfig_AsPythonCLI and PyInitConfig_AsEmbeddedRuntime, to more clearly convey when an embedder will want one base config set over the other. Embedders aren’t likely to know off the top of their heads whether they want a “Python” config or an “Isolated” config, but they should know whether they’re emulating the CPython CLI or embedding a CPython runtime in a larger application without exposing its CLI features.

However, what happens if the embedding app calls Py_PreInitialize before calling either of the PyInitConfig_* creation functions? My view is that this should work, with those pre-init settings being retained for use in the PEP 741 configuration. The preinit struct is far more stable than the main config struct, and it doesn’t have any strings in it, so it’s just generally much less of a hassle to deal with. For most purposes “Python CLI or embedded runtime?” will cover any preinit configuration needed, and for those rare cases where more control is needed, the PEP 587 preinit API isn’t currently planned to go anywhere.

I also feel like “potentially compatible with the stable ABI” is worth mentioning in the rationale (the abstract still mentions it, after all). Having the option available is a genuine benefit of the updated design, even if we don’t think it’s worth actually following through on the possibility in the near term.

The omission of the Locale and WStr APIs feels like it should go in a “Deferred Ideas” section rather than being omitted entirely or outright rejected. Those functions have clear utility in some embedding use cases; the debate is about whether it’s CPython’s responsibility to offer solutions for dealing with them. PEP 587 was heavily influenced by CPython’s own needs, hence settling on wstr lists as the most straightforward way to build a system that could readily handle data being pulled either from *nix APIs with dubious encodings or from Windows wstr APIs. PEP 741 is more focused on embedding use cases where all of the configuration data comes directly from the embedding application itself.

It does raise the question that if the Locale and WStr APIs are being omitted, perhaps PyInitConfig_CreatePython should also be excluded? Emulating the CPython CLI is an intrinsically higher coupling activity than just embedding an isolated CPython runtime for use within the embedding application, so it actually seems reasonable to me to leave that use case to the lower level PEP 587 APIs (at least initially, anyway).

Yeah, PEP 538 (locale coercion) and PEP 540 (UTF-8 mode) are still our most comprehensive write-ups of how much of a mess locale-based systems can get into when their locale isn’t configured correctly (it can get especially fun when a client’s locale config gets incorrectly forwarded to a server via SSH environment forwarding).

Between those two PEPs, modern CPython is now one of the most capable runtimes in the world when it comes to handling locales correctly. Locales themselves are still fundamentally flawed, though (hence the problem still coming up in this discussion despite those PEPs).


Agreed. I’d like to see the CPython CLI separated into its own API set anyway that can update a standard config (and then that API set can just be implemented in python.c and copied from there if someone wants to emulate our CLI for their own app).


PyInitConfig_SetInt() can set all preconfiguration options. PyConfig_Set() cannot set them, but that API is different: it sets current options at runtime. Once Python is preinitialized, PyConfig_Set() cannot set preconfiguration options anymore.

If Python is already preconfigured, the PyInitConfig options related to the preconfiguration should be ignored. Python can be preconfigured exactly once. For example, you cannot preconfigure Python without changing the locale, and then ask Python to change the locale. The second attempt is silently ignored.
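To make the "preconfigured exactly once" rule concrete, here is a toy model in plain Python. It is not CPython code: `RuntimeModel` and its methods are hypothetical names, and only the option names `configure_locale` and `utf8_mode` are borrowed from the PEP 587 preconfiguration. It only illustrates the first-call-wins behaviour described above.

```python
# Options that belong to the preconfiguration in this toy model (assumption:
# a real runtime has more of them; two are enough to show the rule).
PRECONFIG_OPTIONS = {"configure_locale", "utf8_mode"}


class RuntimeModel:
    """Toy stand-in for an interpreter that can be preconfigured only once."""

    def __init__(self):
        self.preconfig = None
        self.config = None

    def preinitialize(self, **options):
        if self.preconfig is not None:
            return  # already preconfigured: the second attempt is silently ignored
        self.preconfig = dict(options)

    def initialize(self, **options):
        # Preconfiguration-related options only take effect if nothing
        # preconfigured the runtime earlier.
        pre = {k: v for k, v in options.items() if k in PRECONFIG_OPTIONS}
        self.preinitialize(**pre)
        self.config = {k: v for k, v in options.items() if k not in PRECONFIG_OPTIONS}


runtime = RuntimeModel()
runtime.preinitialize(configure_locale=0)           # early, explicit preconfiguration
runtime.initialize(configure_locale=1, verbose=1)   # later request to change the locale

print(runtime.preconfig)  # {'configure_locale': 0}
print(runtime.config)     # {'verbose': 1}
```

The later `configure_locale=1` is dropped without an error, which mirrors the "silently ignored" wording; whether silence is the right behaviour is a separate question.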

Ok, I will add it.

I agree, I will remove it.


I prepared a PR, “PEP 741: Update” (python/peps pull request #3800, by vstinner), to address @steve.dower’s and @ncoghlan’s review.


This also means that the name of PyInitConfig_CreateIsolated can be simplified to just PyInitConfig_Create. The fact that its default settings align with PEP 587’s isolated config can go in the function documentation.


That’s what I did in my PR.


I addressed most, if not all, concerns of the Steering Council, @steve.dower, @pitrou and @ncoghlan. I plan to submit the updated PEP 741 to the Steering Council next week, unless there are more concerns.

Summarising the general impact of the latest PEP update:

  • the new PEP 741 API becomes exclusively about the “embed a Python runtime in a larger application” use case. The only config preset offered uses the settings from the “isolated” config. Preconfiguration settings can still be adjusted via the new API, but preconfiguration and configuration all happen in one step (unless you drop down to the lower level PEP 587 APIs)
  • the lower level PEP 587 API remains the preferred way to handle the “emulate the full CPython CLI” use case, since it plays nicer when you’re actually having to pull info from operating system APIs before initializing the runtime. Using the PEP 587 API to trigger preconfiguration early locks the preconfig settings, even if you use the new PEP 741 API to finish the config process.
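As a rough illustration of what "isolated" settings mean in practice, the sketch below compares a default interpreter with one started with `-I`. This is an assumption-laden analogy: `-I` is the closest command-line counterpart I know of, not the PEP 741 API itself, and the embedded preset may differ in other fields. The `probe` helper is made up for this example.

```python
import os
import subprocess
import sys

# Code run in the child interpreter: report two isolation-related flags and
# whether the PYTHONDONTWRITEBYTECODE environment variable was honoured.
PROBE = (
    "import sys; "
    "print(sys.flags.isolated, sys.flags.ignore_environment, sys.dont_write_bytecode)"
)


def probe(*interpreter_options):
    """Start a child interpreter with the given options and return its report."""
    env = dict(os.environ, PYTHONDONTWRITEBYTECODE="1")
    result = subprocess.run(
        [sys.executable, *interpreter_options, "-c", PROBE],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.split()


default_mode = probe()        # reads PYTHON* environment variables
isolated_mode = probe("-I")   # ignores them

print("default: ", default_mode)
print("isolated:", isolated_mode)
```

In isolated mode the environment variable has no effect, which is the behaviour an embedding application usually wants: the user's environment does not leak into the embedded runtime.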

I submitted PEP 741 to the Steering Council.


The PEP says:

Previously, it was possible to set directly global configuration variables:

Py_OptimizeFlag
Py_VerboseFlag
Py_DebugFlag
Py_InspectFlag
Py_DontWriteBytecodeFlag

But these configuration flags were deprecated in Python 3.12 and are scheduled for removal in Python 3.14.

I see these were deprecated in #93103 (3.12) and the removal schedule was set in #106538 (3.13), but I could not find a reason to use the shortest deprecation period allowed by PEP 387.
Note PEP 387 says:

If the deprecated feature is replaced by a new one, it should generally be removed only after the last Python version without the new feature reaches end of support.

If these need a replacement but we’re only adding it now, shouldn’t the removals be postponed?


PEP 587 also allows these to be configured (and then queried via the Python APIs), so they’ve technically had a replacement available since Python 3.8.

PEP 741 provides a potentially simpler replacement than those existing replacement options, but not the only replacement option.

(I guess they haven’t previously had a replacement for changing them after initialization, but I’d consider that an argument in favour of PEP 741, rather than an argument to delay removing the public global configuration flags)


PEP 587 provides a new, complete solution for the Python initialization, and PyConfig also became the reference for the current runtime configuration. Setting these global configuration variables (such as Py_OptimizeFlag) no longer has any effect after the Python initialization (at “runtime”). The problem is that there is no convenient API to get the PyConfig at runtime. That’s why I wrote PEP 741: to fill this gap by adding PyConfig_Get().

The removal is yet another reminder that these global configuration variables are now ignored.

The whole problem is more complicated. To get the current runtime configuration, you already can, and should, read the “public Python API”. PEP 741 gives a mapping; see for example the “Configuration Options” table. Only the rare use case of setting the current configuration after the Python initialization is no longer supported. PEP 741 also adds PyConfig_Set() to set the current configuration, but only for the options listed in the “Public configuration options” table.
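For the five flags quoted from the PEP, reading the current values through the public Python API could look like the sketch below. The pairing of each C global with a `sys.flags` attribute is my reading of the documentation, so treat it as an assumption rather than the PEP's official table; the dictionary name `RUNTIME_VIEW` is made up.

```python
import sys

# Legacy C global variable -> value of the same setting as seen from Python.
RUNTIME_VIEW = {
    "Py_OptimizeFlag": sys.flags.optimize,
    "Py_VerboseFlag": sys.flags.verbose,
    "Py_DebugFlag": sys.flags.debug,
    "Py_InspectFlag": sys.flags.inspect,
    "Py_DontWriteBytecodeFlag": sys.flags.dont_write_bytecode,
}

for name, value in RUNTIME_VIEW.items():
    print(f"{name:26} -> {value}")

# sys.dont_write_bytecode is the attribute that Python code can also change
# after startup; sys.flags itself is read-only.
print("sys.dont_write_bytecode =", sys.dont_write_bytecode)
```

This covers reading only. Changing these values after initialization is the part that had no supported replacement, which is where PyConfig_Set() comes in.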

I’m not convinced that it’s a good idea to modify these “read-only” configuration options. As I wrote in past discussions, it’s better to set these options during the initialization. But well, it was possible previously, so PEP 741 adds the feature back.


Perhaps the removal schedule should be discussed elsewhere, but #93103 (comments) sent me here.

Currently, the PEP’s rationale says “these configuration flags were deprecated in Python 3.12 and are scheduled for removal in Python 3.14.” The Abstract says “This new API replaces the deprecated and incomplete legacy API which is scheduled for removal between Python 3.13 and Python 3.15.” That sounds like it’s a done deal. But if this topic is the place to discuss the schedule, I’d expect the schedule in the specification section.

IMO, deprecation should generally be done after a replacement is added. And if the added replacement doesn’t cover all that’s needed, the deprecation should be undone. Or restarted with a newer replacement.
