As much as I want to see this functionality, I’m not trying to propose it at this point. However, I think there are a couple of relevant points that should be considered if it is brought up again, so it made sense to me to tack this onto the end of this thread rather than starting a new one. (There’s already too much of a disconnect between Discourse and python-ideas.)
(looking back at the thread, I notice that @gwerbin made a number of similar points, but this is a higher-level summary in one place)
-
Why would we do a new API? Because if something is going to be in the stdlib, it makes more sense for it to have an API compatible with other similar functionality in the stdlib than some arbitrary other API. And the toml API is already (at the top level) almost the same as the json one, though under the hood it’s pretty different, which is a mistake if you ask me.
-
We should keep in mind that the stdlib json API was not designed for JSON – it is a mirror of the pre-existing pickle API. And I’m sure that was done very much on purpose. But in fact, the needs of pickle are different from those of JSON, which is probably why it didn’t have a “load from a path” function in the first place.
-
There are also implementation issues: the toml lib referenced above uses type checking to overload the load function. The json lib doesn’t overload, but it also doesn’t do any type checking – it uses simple duck typing instead. And it certainly could overload using duck typing: check for a read() method, and if accessing that raises an AttributeError, fall back to treating the argument as a path and opening the file. (In fact, I prototyped this a while ago, after the python-ideas thread, but didn’t finish it: cpython/Lib/json/__init__.py at json_file · PythonCHB/cpython · GitHub)
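To make the duck-typing approach concrete, here is a minimal sketch of what such an overloaded load could look like. This is a hypothetical illustration, not the stdlib implementation or my linked prototype; the function name and keyword passthrough are assumptions.

```python
import json


def load(fp_or_path, **kwargs):
    """Load JSON from an open file-like object or from a path.

    Hypothetical sketch of a duck-typed overload: if the argument
    has a read() method, treat it as a file-like object; otherwise,
    assume it is a path and open the file ourselves.
    """
    try:
        read = fp_or_path.read
    except AttributeError:
        # No read() method -- treat the argument as a path.
        with open(fp_or_path, "rb") as f:
            return json.loads(f.read(), **kwargs)
    else:
        # File-like object -- use it directly.
        return json.loads(read(), **kwargs)
```

Note that no isinstance checks are needed: anything with a read() method works (sockets wrapped in makefile(), io.StringIO, gzip.GzipFile, etc.), and everything else is handed to open().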
If this does get revived, maybe the way to go is a PEP formalizing a “serialization” API – which would then be used by pickle, json, and any other new format (e.g. toml), and optionally by third-party libs. That’s essentially what’s happening already, but formalizing it would be good, and that would be the time to make any changes, if any are needed.
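The de facto shared interface could be written down as a typing.Protocol. This is only a sketch of what such a formalization might look like – the name Serializer and the exact signatures are my assumptions, not anything from a PEP – but it shows that the stdlib modules already conform:

```python
import json
import pickle
from typing import IO, Any, Protocol, runtime_checkable


@runtime_checkable
class Serializer(Protocol):
    """Hypothetical formalization of the load/loads/dump/dumps
    interface that pickle, json, and marshal already share."""

    def load(self, fp: IO) -> Any: ...
    def loads(self, s) -> Any: ...
    def dump(self, obj: Any, fp: IO) -> None: ...
    def dumps(self, obj: Any): ...


# Both stdlib modules already satisfy this interface structurally.
assert isinstance(json, Serializer)
assert isinstance(pickle, Serializer)
```

Because the protocol is structural, third-party libraries (e.g. PyYAML, which already follows the same naming) would conform without inheriting from anything.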
Finally, something really struck me in reading the objections to adding the ability to read JSON directly from a path in one call. And that is that most of the objections came from the perspective of “systems programming” as opposed to “scripting” (to probably incorrectly use Ousterhout’s terms). Python is an excellent scripting language. But most of the changes in recent years (except f-strings) have been aimed at making it a better systems language, some at the expense of scripting. In this case:
“Developers need to understand encoding issues anyway” – well, yes, but do non-developers writing simple scripts need to? I don’t think so.
“We need to educate folks about the ‘best practice’”:

with open(path, "rb") as f:
    json.load(f)  # Implies encoding="utf-8".
Do we really think that folks writing scripts should have to learn to do that, when we could offer them:
json.load(path)
“From my experience, in most cases you load or save JSON not from file, but from network, or database, or GzipFile, or ZipFile, or TemporaryFile, etc”
That is VERY much a systems programmer’s experience – people writing scripts are most often going to load from files. And most importantly, no one is suggesting limiting in ANY WAY the ability to load from file-like objects rather than paths.
I really like the maxim: “the easy things should be easy, the hard things should be possible”
I know that, to core developers, the current situation isn’t “hard”, but it’s not so easy for non-software-developers writing scripts. Do we want Python to be an even better scripting language?