OT: Irrespective of f-strings, I really dislike the style of calling .join() on a string literal. I remember ages ago when we were debating how to add a string join function, there were arguments for making it either a built-in or a string method. String method won of course, and it does make sense, but I think it was @tim.one who suggested and preferred calling it on a variable instead of a string literal, i.e.
NL = '\n'
...
message = NL.join(bits_and_pieces)
So much more readable to my eyes, and even more so when the string literal has a semantic meaning which you want to convey in the code.
I totally get the convenience (and popularity) of using '\n'.join() but I still don't like it so I don't use it, which is probably why the "backslash restriction" in f-strings doesn't bother me in practice that often.
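For anyone skimming, here is a minimal sketch of how the named-constant style sidesteps the restriction. The contents of `bits_and_pieces` are made-up sample data. As far as I know, the restriction only applies before Python 3.12, where PEP 701 lifted it.

```python
NL = '\n'
bits_and_pieces = ['first line', 'second line', 'third line']  # made-up sample data

# No backslash appears inside the braces, so this works on any
# Python version that has f-strings at all.
message = f"Report:{NL}{NL.join(bits_and_pieces)}"

# By contrast, putting '\n'.join(...) directly inside the braces
# is a SyntaxError before Python 3.12.
print(message)
```

The same trick works for any escape sequence: bind it to a name outside the f-string and refer to the name inside the braces.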
FWIW, and I know this is going even further OT, the popularity/prevalence ends up mattering a lot when making small contributions on projects.
If I show up on a project with a couple dozen open PRs and want to contribute a change, I'm definitely going to write "\n".join()!
Even if there were a stdlib method or constant for me to use, like string.NL.join(), I don't want to bog down a contribution with extra discussion. The most common way of writing it becomes normative (except when the common way is incorrect, of course).
Hm. Before we decide to endow the str class with a bunch of random attributes let's think some more about whether that's the right place. And even with the names you suggest, if someone encounters str.SPACE.join(...) they'll probably have to look it up the first time to be sure what kind of magic it does.
FWIW, I personally prefer literals, e.g. ' '.join(...).
It might be better to split this discussion, but I think only Discourse admins can do that? @brettcannon ?
That said, I don't think str.SPACE.join() (and friends) would be all that confusing. Yeah, maybe they have to look it up the first time, but once you know that SPACE is just a string, it, and all other such constants, should be obvious.
The benefit of sticking them on str is that because it's a built-in, no imports are necessary. If they aren't put on str I'm not sure what would be better and more obvious.
While I would love to contribute to cpython, I don't feel strongly about the feature. I was just asking out of curiosity, thought it had been brought up before and I wanted to know the reason it was rejected...
I also looked at the str/unicode object source code and it looked very complex, especially for someone who's never written a python object in C before.
Or more realistically, float.PI, float.E, float.TAU, float.INF and float.NAN.
Even if there's no intention from the core devs to establish a principle that "common constants for a type should be attributes of that type", I fully expect that if we do this for str, we'll end up with a lot of energy spent on python-ideas arguing with people who feel that you can never have too much of a good thing.
Okay, maybe, but even if so: 1) is it a bad idea to add constants such as float.PI, and 2) even if there is some call for that, is that a reason not to do it for str constants?
The names for the str constants will generally be longer than the
literal, so this seems to be a foolish endeavor, taking up extra
characters in the code by spelling out the constant, and having
more names (should they be in English or Tamil?) to need to learn and
remember.
float.PI and float.E sound much more interesting than str.SP or str.NL,
although they can only be approximated, whereas str constants could be
exact.
Yes, most of the ASCII control characters have short abbreviations that
were standardized by ASCII, but when you have to prefix them with "str."
they are longer than the literals, even than the hex literals '\xA0' and
certainly longer than '\n' or ' '. Unicode characters have far longer
names, in general, so again the literal is simpler, shorter, and
doesn't require reference to a document to know what is meant. There
are a few characters with similar appearance, but my favorite text
editor will tell me the hex code and the Unicode name, if I'm uncertain.
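If your editor does not show this, a rough equivalent from the interpreter is sketched below, using the stdlib `unicodedata` module and the `\N{...}` named escape that Python string literals already support. The variable names are just for illustration.

```python
import unicodedata

nbsp = '\xa0'

# Look up the official Unicode name and the code point of a character.
print(unicodedata.name(nbsp))
print(f"U+{ord(nbsp):04X}")

# The same character written with a named escape instead of a hex escape.
named = '\N{NO-BREAK SPACE}'
print(named == nbsp)

# Control characters such as '\n' have no Unicode name, so name() raises
# ValueError unless a default is supplied as the second argument.
print(unicodedata.name('\n', '<no name>'))
```

So a symbolic spelling already exists inside literals for named characters, although it is longer than the hex escape and does not cover ASCII control characters like newline.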
That's fine. You can always use the literal if you're indexing on saving characters. I still think using symbolic names instead improves readability in many cases.
using symbolic names instead improves readability in many cases
and decreases it in others:
(trying, and probably failing, to pretend I don't have decades of experience with some of these forms...)
str.NL vs '\n' -- neutral (edge to '\n' for "already programmers" crowd)
str.NEWLINE vs '\n' -- edge to str.NEWLINE
str.SP vs ' ' -- edge to ' '
str.SPACE vs ' ' -- edge to str.SPACE
str.EMPTY vs '' -- neutral
str.COMMASPACE vs ', ' -- edge to ', '
Note that the (IMHO) more readable ones are harder to type; maybe a trade-off that's worth it.
So I don't think this is worth the complication.
The float ones, on the other hand, I think have some real merit, as there is no literal way to express those. But to waffle, it's also common to need the math module anyway if you are using those (or numpy).
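To make the "no literal" point concrete, here is a small sketch of how these values are spelled today. The `float.PI`-style names above are only proposals; the `math` module constants and the `float('inf')` / `float('nan')` constructor calls are what currently exist.

```python
import math

# The named constants live in the math module, not on float.
print(math.pi, math.e, math.tau)

# Infinity and NaN have no literal; the usual spellings are a
# constructor call on a string, or the math module constants.
inf = float('inf')
nan = float('nan')

print(inf == math.inf)      # both spellings give the same value
print(math.isnan(nan))      # NaN must be tested with isnan() ...
print(nan == nan)           # ... because it never compares equal to itself
print(math.tau == 2 * math.pi)
```

Either way you end up with an import or a function call, which is the gap the float constants would fill.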
These collections of little marks are often difficult to read and/or parse for some users. Also, think about how screen readers might handle them. Depending on the font, etc. how easy is it to visually distinguish '' from ' '? These are issues I consider when I prefer symbolic names to literals such as this.
Point of note: In C++, there is a very important difference between using a literal and using a named token when it comes to the standard streams:
std::cout << "Hello, " << name << "\n";
std::cout << "Hello, " << name << std::endl;
The std::endl token ends the line, and also flushes the buffer. With tokens like str.NL or str.SPACE, I would be wondering if they have additional meaning, too - for instance, some_string.split(str.SPACE) could conceivably mean "split on any whitespace" rather than being exactly equivalent to ' '. So IMO this can impair readability compared to the literal. There's no risk of mistyping it as there is with math.PI (would anyone spot the bug if you wrote 3.141592653589783 for pi?), and no additional meaning, so it's just a longhand way of writing a literal - unless you have some need to be able to shadow the name str and change all the literals.
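A quick sketch of both points, using a made-up sample string. In today's Python, splitting on the literal `' '` and splitting on "any whitespace" (no argument) really are different operations, and the mistyped pi from the parenthetical is close enough to slip past a default tolerance check.

```python
import math

text = "a  b\tc\n"  # made-up sample: two spaces, a tab, a trailing newline

# Splitting on the literal space splits on exactly that one character,
# so the doubled space yields an empty string and the tab is kept.
print(text.split(' '))

# Splitting with no argument means "any run of whitespace".
print(text.split())

# The mistyped value from the post differs from math.pi in one digit.
typo_pi = 3.141592653589783
print(typo_pi == math.pi)              # exact comparison catches it
print(math.isclose(typo_pi, math.pi))  # default tolerance does not
```

That is the ambiguity a reader would have to resolve for a hypothetical `str.SPACE`: is it the one-character literal, or shorthand for the whitespace-splitting behaviour?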