They are just combinations of two functions. In the best case they would save you one line of code, at the cost of a larger maintenance burden and a harder-to-learn API. In any case you could not use them until you drop support for 3.9.
Since they are combinations of two functions, they would have to support the union of all arguments of the underlying functions. open() and loads()/dumps() have too many parameters.
In my experience, in most cases you load or save JSON not from a file, but from the network, a database, a GzipFile, a ZipFile, a TemporaryFile, etc. These specialized functions would see too little use.
Since there are several stdlib and third-party serialization modules that support a similar interface (load()/loads()/dump()/dumps()), we would need to add the new functions to marshal, pickle, and plistlib as well, and place a burden on third-party libraries to follow suit.
I don’t feel as strongly as @storchaka (and I did in fact vote for a name) but I feel that the case for adding these functions is weak regardless of what they are named.
This post seems to assume that the case for having such functions is decided. Maybe it is, but a link to a clear statement of the consensus would be useful in that case (I’m not going to re-read the whole thread). Also, the proposed implementation
    with open(filename, "r") as fp:
        data = json.load(fp, *args, **kwargs)
ignores any question of encoding (are JSON files required to be UTF-8? Because the default encoding for open() isn’t necessarily UTF-8).
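To make that pitfall concrete, here is a minimal self-contained sketch (the temporary file is only scaffolding for the demo; whether a bare open(path) call misbehaves depends on the platform's locale encoding):

```python
import json
import os
import tempfile

# A JSON document containing non-ASCII text, written as UTF-8 bytes.
payload = '{"name": "café"}'.encode("utf-8")

fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(payload)

# Passing the encoding explicitly is always safe:
with open(path, encoding="utf-8") as f:
    data = json.load(f)
print(data["name"])  # café

# open(path, "r") instead uses the locale's preferred encoding, which is
# not UTF-8 on many Windows systems, so the very same code can raise
# UnicodeDecodeError or silently produce mojibake there.
os.unlink(path)
```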
So if there is a consensus that the functions should be added, and if the handling of encodings is properly defined, then my vote on what to call the functions stands. But IMO, at this point we’re a long way from the point where the name is the biggest outstanding question, here…
Please keep this thread to the naming vote only. If you have topics not covered in the previous discussion, please reply to that thread or create a new one.
I also concur with Serhiy’s points and am a strong -1 on this. Even though functionality like this keeps getting requested, it would see too little use in real-world applications.
I am so strongly against this because it lowers the bar for many similar propositions. Reading the content of a plain file given a file name, reading the content of a compressed file given a file name, loading JSON from a compressed file by file name, reading content downloaded from the Web, reading CSV from a compressed file downloaded from the Web, … It is the PHP way.
It is not hard to write two lines of code, and doing so is more explicit and flexible. You can easily modify the code to load JSON from a compressed file, from the network, from a database field, or to load multiple JSON documents from the same stream (newline-separated).
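For instance, switching the two-line pattern from a plain file to a gzip-compressed one is essentially a one-word change (the file path here is invented for the demo):

```python
import gzip
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.json.gz")

# Saving: the dump() call is unchanged; only the file object differs.
with gzip.open(path, "wt", encoding="utf-8") as f:
    json.dump({"answer": 42}, f)

# Loading: the usual two-line pattern, with gzip.open() in place of open().
with gzip.open(path, "rt", encoding="utf-8") as f:
    loaded = json.load(f)
print(loaded)  # {'answer': 42}
```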
“More flexible” also means “easier to get wrong”. In particular, since the default text encoding is not UTF-8, people will create bugs by omitting the encoding= parameter.
I think all modules where “UTF-8 should be used” should have similar functions (e.g. toml, yaml, xml, …).
And I think modules where “binary mode should be used” could have similar functions too, although I expect you would be a strong -1 on that as well.
If we don’t add such “easy” functions, I think we must hurry to change the default encoding of open(). It is too easy to make a mistake; see this issue for example.
There are other ways to “fix” encoding issues without introducing new functions. For example, json.load() could gain a new encoding="utf-8" argument that only works with binary streams, and this becomes a best-practice issue:
    # Don't do this.
    with open(path) as f:
        json.load(f)

    # Do this instead.
    with open(path, "rb") as f:
        json.load(f)  # Implies encoding="utf-8".
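One way such a parameter could behave is sketched below; load_with_encoding() is a hypothetical stand-in written for this illustration, since json.load() takes no encoding= argument today:

```python
import io
import json

def load_with_encoding(fp, encoding="utf-8"):
    """Hypothetical sketch: decode a *binary* stream with a fixed
    encoding and parse it as JSON, rejecting text-mode streams."""
    data = fp.read()
    if not isinstance(data, bytes):
        raise TypeError("encoding= only makes sense for binary streams")
    return json.loads(data.decode(encoding))

print(load_with_encoding(io.BytesIO(b'{"ok": true}')))  # {'ok': True}
```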
There is not much point in adding lots of new helpers which save you one or two lines to many different data format modules, just so that people don’t forget to specify an encoding in the open() call which is used for opening the file.
A solution such as the one mentioned by Greg Werbin on the ideas ML would be better:

Alternatively, the format modules could check the file object’s .encoding attribute and raise a warning if a non-standard encoding is found, e.g. the json module could check for “utf-8”.
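A rough sketch of what that check could look like (check_json_stream_encoding is invented for this illustration; no such function exists in the stdlib):

```python
import io
import warnings

def check_json_stream_encoding(fp):
    """Warn when a text stream was opened with a non-UTF-8 encoding.
    Binary streams, and StringIO (whose .encoding is None), pass silently."""
    enc = getattr(fp, "encoding", None)
    if enc is not None and enc.lower().replace("_", "-") not in ("utf-8", "utf-8-sig"):
        warnings.warn(f"JSON is UTF-8; stream was opened with encoding={enc!r}")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    check_json_stream_encoding(io.StringIO("{}"))  # no real encoding: silent
    fp = io.TextIOWrapper(io.BytesIO(b"{}"), encoding="latin-1")
    check_json_stream_encoding(fp)  # non-UTF-8 text stream: warns
print(len(caught))  # 1
```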
Regardless of what we do in Python to help users with file encodings, programmers will have to learn about these one way or another, since the world is not perfect and we’re still not quite where we’d like to be with text files - although things are already a lot better than 10 years ago. E.g. it’s still not uncommon to have CSV files encoded in Windows code page encodings.
And we need to first figure out why people don’t follow the best practice. Otherwise, even if a new function is introduced, it is very likely people will still ignore it and reach for the wrong solution.
One obvious reason is that the “best practice” is tightly coupled to its implementation; it all but leaks an implementation detail.
The current json library supports bytes input, so opening in binary mode is the best practice. But before json supported binary input, there was no single “best” practice: encoding="utf-8" or encoding="utf-8-sig" might be used.
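The difference is observable today: with bytes input, json detects a UTF-8 BOM by itself, while text decoded with plain "utf-8" keeps the BOM and parsing fails (BytesIO stands in for a real file here):

```python
import io
import json

# A JSON document written with a UTF-8 BOM, as some Windows editors produce.
bom_bytes = b'\xef\xbb\xbf{"a": 1}'

# Binary input: json.detect_encoding() recognizes the BOM ("utf-8-sig"),
# so loading just works.
ok = json.load(io.BytesIO(bom_bytes))

# Text decoded with plain "utf-8": the BOM survives as U+FEFF and json
# rejects it.
try:
    json.loads(bom_bytes.decode("utf-8"))
    err = ""
except json.JSONDecodeError as e:
    err = str(e)

print(ok, err)
```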
When users need to use JSON, YAML, TOML, csv, etc., they need to check: “Should I open the file in binary mode, or specify an encoding? If both are OK, which is more efficient?”
If all modules supported “module.load_path(path)”, it would provide the most efficient and recommended way, and modules could hide the implementation detail.
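For json, such a function could be as small as the following sketch (load_path() is hypothetical, not an existing API; binary mode is chosen so the module keeps the encoding decision to itself):

```python
import json
import os
import tempfile

def load_path(path, **kwargs):
    """Hypothetical helper: the module, not the caller, decides how
    to open the file (here: binary mode, letting json handle UTF-8)."""
    with open(path, "rb") as f:
        return json.load(f, **kwargs)

# Self-contained demo file.
path = os.path.join(tempfile.mkdtemp(), "conf.json")
with open(path, "wb") as f:
    f.write(b'{"debug": false}')
print(load_path(path))  # {'debug': False}
```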
It is not a better solution, because it doesn’t hide the implementation detail (use binary mode, or specify an encoding). module.load_path() can choose the most efficient way; load_file() cannot.