Amending PEP 517 with a sentence on handling duplicate config-settings keys

The config-settings section of PEP 517 has an example that suggests duplicate keys should be merged into a config_settings dict, so that this command:

pip install . --config-settings=key1=value1 --config-settings=key1=value2

results in a dict containing key1: [value1, value2]. The PEP 517 text doesn’t explicitly say this though. And it turns out pypa/build implements it this way, while pip does not: it silently drops the first value and passes key1: value2 to the backend.

After discussion in pypa/pip issue #11681 (“`--config-settings` recognizes only the last option”), the pip maintainers are happy to update pip to match what pypa/build does here, and consider the old behavior a bug. @pf_moore asked to update PEP 517 for this, and suggested posting here to ensure backend authors are happy with this. The proposed edit to PEP 517 is (the second sentence is the addition):

“Build frontends SHOULD provide some mechanism for users to specify arbitrary string-key/string-value pairs to be placed in this dictionary. In case a user provides duplicate string-key’s, build frontends SHOULD combine the corresponding string-value’s into a list of string-values. For example …”

Various other options were discussed in the pip issue. This one seems best because it’s unambiguous (trying to split key1='value1 value2' is not reliable), and backend authors don’t have to do anything if they don’t have any need for duplicate keys.
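To make the proposed behavior concrete, here is a minimal sketch of how a frontend could fold repeated KEY=VALUE options into the dict. The name `merge_config_settings` is made up for illustration; this is not how pip or pypa/build actually implement it.

```python
from typing import Dict, List, Union

ConfigSettings = Dict[str, Union[str, List[str]]]


def merge_config_settings(options: List[str]) -> ConfigSettings:
    """Fold repeated KEY=VALUE options into a config_settings dict.

    Hypothetical helper: a key seen once maps to a str, a key seen
    more than once maps to a list of str in command-line order.
    """
    settings: ConfigSettings = {}
    for option in options:
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = option.partition("=")
        if key not in settings:
            settings[key] = value
            continue
        existing = settings[key]
        if isinstance(existing, list):
            existing.append(value)
        else:
            settings[key] = [existing, value]
    return settings


print(merge_config_settings(["key1=value1", "key1=value2", "key2=value3"]))
# {'key1': ['value1', 'value2'], 'key2': 'value3'}
```

A backend that never expects duplicates keeps receiving plain strings, which is the "backend authors don’t have to do anything" part.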

Unless there’s an objection, I’ll open a PR with this one line amendment to PEP 517 soon (EDIT: done after 4 days: python/peps/pull/2996).


Another question: which side (frontend/backend) should .strip() the string-key?

This command has a space before =, and setuptools doesn’t recognize the key as --build-option:
pip wheel --config-settings="--build-option =--cffi" -v .

Neither, in the sense that the spec says nothing about that. IMO this is more of a UX issue (a trade-off between robustness in the face of an easy typo to make and the ability to support config options with leading/trailing whitespace).

I’d say just raise a feature request on pip, it’s not a matter for the standard to cover.
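For illustration, this is roughly what happens if a frontend splits the option on the first "=" and does nothing else. `split_option` and its `strip_key` flag are hypothetical, just to show the trade-off; no frontend is claimed to have such a switch.

```python
from typing import Tuple


def split_option(option: str, strip_key: bool = False) -> Tuple[str, str]:
    """Split KEY=VALUE on the first "="; optionally strip the key.

    Hypothetical helper, not any real frontend's parser.
    """
    key, _, value = option.partition("=")
    if strip_key:
        # Forgiving mode: tolerate the easy typo, but lose the ability
        # to pass keys with leading/trailing whitespace.
        key = key.strip()
    return key, value


print(split_option("--build-option =--cffi"))
# ('--build-option ', '--cffi')
print(split_option("--build-option =--cffi", strip_key=True))
# ('--build-option', '--cffi')
```

Without stripping, the backend sees the key "--build-option " (with a trailing space), which doesn’t match anything it knows.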

Do I understand this correctly that it basically says that if a key is used once, the value is placed in dict as a str, whereas if it occurs more than once it becomes a list[str]?

Yes, that’s correct. It’s a SHOULD, not a MUST, so it’s not required, but that’s the intended “good practice”.

In hindsight it would have been better if the interface was always list[str] even for a single entry, but we can’t do that for backwards compatibility reasons. So one value: str, >1 value: list[str] it is.
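On the backend side, coping with "str for one value, list[str] for several" can be a few lines. A sketch, assuming a backend that wants a list for a given key; `get_list_setting` is a made-up name, not part of any existing backend.

```python
from typing import List, Mapping, Optional, Union

ConfigSettings = Mapping[str, Union[str, List[str]]]


def get_list_setting(config_settings: Optional[ConfigSettings], key: str) -> List[str]:
    """Return the values for `key` as a list, whatever shape they arrived in.

    Hypothetical helper. PEP 517 hooks may receive config_settings=None,
    so that case is treated like an empty dict.
    """
    if not config_settings or key not in config_settings:
        return []
    value = config_settings[key]
    if isinstance(value, str):
        return [value]  # single occurrence: wrap the bare string
    return list(value)  # repeated occurrence: already a list


print(get_list_setting({"setup-args": "-Dblas=blas"}, "setup-args"))
# ['-Dblas=blas']
print(get_list_setting({"setup-args": ["-Dblas=blas", "-Dlapack=lapack"]}, "setup-args"))
# ['-Dblas=blas', '-Dlapack=lapack']
```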

Agreed, that is too detailed for the spec - in general, how to deal with user errors (whether typos or whatever else) is very hard to specify, because there are too many ways in which errors can be made.

Not just backward compatibility. Unless you want to change the actual spec for config_settings, requiring all backends to use a list even for a single-value setting, making it impossible to pass simple strings from the frontend is a big omission.

If that’s of any value, back when I wrote gpep517, I decided not to implement the PEP-like interface and instead support passing the config_settings dict as JSON.

The amended text only really makes sense for key-value pairs provided as the value of a CLI option. I’m not sure if JSON with duplicate keys is valid, but Python will discard duplicates and only keep the last entry (I assume by virtue of casting the object to a dict). The PEP also has the following example of a CLI option where the key is the option name itself:

--global-option="--some-global-option" \

This is placed in a single-item list:

"--global-option": ["--some-global-option"],

… which, although contrary to the recommendation of the PEP, is how multiple options are parsed by optparse and its successors.
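On the JSON point: Python’s standard `json` module keeps the last value for a repeated key by default, and its `object_pairs_hook` parameter lets a caller see the duplicates instead. A small sketch; `reject_duplicate_keys` is a hypothetical hook, and nothing here says how gpep517 handles this.

```python
import json


def reject_duplicate_keys(pairs):
    """object_pairs_hook that refuses JSON objects with repeated keys."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key in JSON object: {key!r}")
        seen[key] = value
    return seen


text = '{"key1": "value1", "key1": "value2"}'

# Default behavior: the last occurrence silently wins.
print(json.loads(text))
# {'key1': 'value2'}

# With the hook, the duplicate is reported instead of dropped.
try:
    json.loads(text, object_pairs_hook=reject_duplicate_keys)
except ValueError as exc:
    print(exc)
# duplicate key in JSON object: 'key1'
```

A JSON-based frontend doesn’t need duplicate keys at all, of course, since it can pass a list directly.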

That seems like a valid thing to do according to how the PEP is written, but it’s probably only useful for a particular combination of frontend and backend that you control?

The config_settings interface being so loosely defined is a bit problematic when one wants to combine arbitrary frontends and backends. There were reasons at the time though to write the PEP the way it was written - so my goal here was much more constrained: clarify an implicit recommendation - and then align what pypa/build and pip do, to make things less troublesome in practice with the two most popular frontends.

Same answer as above I think. It’s not clear to me if you are responding to @mgorny, pointing out a problem with the one added sentence, or are pointing out further issues in the PEP?

I can’t imagine how you’d be able to use config_settings more generically, given that there are no “standard” values and every backend can define their own (and throw an error on unsupported keys).


I’m pointing out that the added sentence is applicable to only one very specific input format. IMO, the PEP should recommend that frontends provide a way to specify values both as strings and as a multi-element list of strings and leave it there. Whether that involves duplication of keys or not is up to the frontend. A frontend which is neither pip nor build could use JSON, or it could define a config settings mini-DSL, or it could allow passing config settings as regular options - who knows?

But I don’t think it’s such a big deal and if we don’t wanna waste any more time on this that’s ok with me too :)

The hope when the PEP was written was that backends would agree between themselves on how they would use config_settings. In hindsight, I guess that was naïve :(

I think if we start to get too deep into recommending frontend UI details, we’ll never stop debating. While I see the benefit of having pip and build work the same, I’m honestly mildly uncomfortable over even what we have now. If there hadn’t already been an example in the PEP that didn’t work like pip ended up implementing things, I would have said we should just leave it out.

Yep, that’s what I think too. This is good enough, let’s not waste any more of people’s time on it. It’s only an example, after all.

Kinda, maybe. I agree with you that UX details are too low-level to exhaustively specify. However, it is really quite painful if every frontend implements things slightly differently, for no real good reason. I’d hope that we can at least articulate guidance that a common UX is valuable and that existing and new tools should aim for interface alignment wherever possible. Only deviate if it’s a substantial improvement. And at that point, the older tool(s) may want to take over that improvement over time.

It may still happen, but I think the bigger issue is that some config_settings are unlikely to apply to all packages involved in an installation. So you end up needing a syntax for “when installing package X, use this config, otherwise ignore it”.

We started approaching some agreement on cross-compiling settings, which ought to apply to all packages with native content, but then because pure-Python only backends would throw on them it wouldn’t actually help. Best to use the backend directly for that package to build a wheel, and then only install from wheels.

This sounds to me like a similar argument to the one that started this, which is that build and pip should have the same UI, because it helps users if things are consistent. Maybe backends should agree a consistent basis for handling config settings (even something as simple as “ignore stuff you don’t recognise” and/or “keys of the form backend:key_name are reserved for the named backend” might be enough to start with)?

Should there be a PEP stating how backends deal with config_settings? We’re back to the “should PEPs dictate UX?” question again. Personally, I’d rather tools sort it out between themselves without needing the PEP process - we have PEPs as a last resort, but let’s not over-use them in case tools start to notice that ignoring PEPs they don’t like is possible…

The argument against ignoring unrecognised keys is that you may produce something the user didn’t want, when they really needed to know that you don’t support that option. Which is fine if you assume that people only ever do pip wheel --no-deps ..., but in that case I typically advise using the backend’s own interface directly (in part because it’s possible to set configuration settings, so I guess that could change – but then, to know what settings you can pass means knowing which backend is being used, so it’s not like PEP 517 is hiding any details from you here anyway).

I don’t think that was explicitly recorded in the PEP, but I’m sure we removed wording about how to deal with unrecognised settings because we didn’t find a compromise. Perhaps with a bit more experience now, it’d be enough to amend PEP 517 to say (essentially) “use a backend-specific prefix for any new settings you create, and ignore settings that don’t use your prefix”?

Like with the current change, if that’s the consensus backend view, I’m fine with it being a text-only change.
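For what it’s worth, the "use your own prefix, ignore the rest" idea would be cheap for a backend to implement. A rough sketch, assuming the backend:key_name form floated earlier in the thread; the backend names and `select_backend_settings` are invented for the example, and no backend is known to do this today.

```python
from typing import Dict, List, Mapping, Optional, Union

Value = Union[str, List[str]]


def select_backend_settings(
    config_settings: Optional[Mapping[str, Value]], prefix: str
) -> Dict[str, Value]:
    """Keep only keys that start with `prefix`, with the prefix removed.

    Hypothetical helper: keys without this backend's prefix are ignored
    rather than treated as errors.
    """
    selected: Dict[str, Value] = {}
    for key, value in (config_settings or {}).items():
        if key.startswith(prefix):
            selected[key[len(prefix):]] = value
    return selected


settings = {
    "examplebackend:jobs": "4",          # addressed to this backend
    "otherbackend:debug": "true",        # addressed to another backend
    "setup-args": ["-Dblas=blas"],       # unprefixed, ignored here
}
print(select_backend_settings(settings, "examplebackend:"))
# {'jobs': '4'}
```

The downside raised above still applies: a misspelled or unsupported option is dropped silently instead of producing an error.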


I’m not sure this is a good thing to do, it’s going to make an already cumbersome UX a lot more verbose. Here is what my most-often used config-setting usage looks like now (after the change that just landed in PEP 517) for pip and build:

python -m build -Csetup-args=-Dblas=blas -Csetup-args=-Dlapack=lapack
python -m pip install . --config-settings=setup-args=-Dblas=blas --config-settings=setup-args=-Dlapack=lapack

Changing the setup-args there to meson-python.setup-args would look really bad. Also, these options are not backend-specific but project-specific, so another project which also uses meson-python would still choke on them.

I’ll note that meson-python does not have an interface aside from the pyproject.toml hooks. And using Meson’s interface to directly install your project doesn’t give you a wheel or dist-info. So in the (unusual) case that multiple wheels need to be built at once and there’s conflicting content in config-settings, I’d prefer telling folks to build wheels one by one instead, with --no-deps et al.

So for context, when I added config_settings to PEP 517, what I was imagining was:

  • Frontends would provide some relatively flexible way to stick data in there, maybe some convenient shorthands for individual keys + arbitrary JSON blobs, or whatever combination of things feels most ergonomic for the frontend’s context.
  • Backends would define whatever interpretations made sense… hard to predict what backends would need, but whatever it is can probably be expressed with simple data structures like strings and lists and dicts.
  • Setuptools would pick some arbitrary convention for how to decode legacy setup.py command line options in config_settings
  • Pip would declare that as part of its “convenient shorthands” for filling in config_settings, when running in PEP 517 mode, it would map the existing --global-option flag to the same config_settings value that setuptools picked, so --global-option continued to work seamlessly as we transition to PEP 517.

So that last thing is what the "--global-option": ["--some-global-option"] example was gesturing at: maybe setuptools declares that it will check the "--global-option" key for a list of options to pass to setup.py, and pip declares that its existing --global-option will fill in that field in the format setuptools expects. So it’s a list because setup.py takes a list of arguments, and --global-option lets you pass a list of arguments, so a list just seemed like the natural data structure there.

I don’t think it’s important for the PEP to legislate exactly how the frontend UI maps to config_settings, but lists are so useful for build configuration that frontends probably should allow some way to pass a list of strings, at the least.
