Could I put in a word for spaces as a grouping option in the format specification mini-language? Right now, comma and underscore are available – why not space as well? One doesn’t always want to change the locale to get the international standard for representation of numbers.
A space already has a meaning: prefix with a space if the number is zero or positive. If a number is negative there’ll be a minus sign, but, normally, nothing if it’s zero or positive:
>>> f'{0}'
'0'
>>> f'{1}'
'1'
>>> f'{-1}'
'-1'
but:
>>> f'{0: }'
' 0'
>>> f'{1: }'
' 1'
>>> f'{-1: }'
'-1'
Formatting numbers with a space as the thousands separator doesn’t require that the formatting code be a literal ' ' character. I’m not sure what else it could be, though, as a lot of characters are already taken by other codes.
You can use replace:
>>> f"{123_456:_}".replace("_", " ")
'123 456'
Quite true. Thanks for the reminder.
- Numbers consisting of long sequences of digits can be made more readable by separating them into groups, preferably groups of three, separated by a small space. For this reason, ISO 31-0 specifies that such groups of digits should never be separated by a comma or point, as these are reserved for use as the decimal sign.
For example, one million (1000000) may be written as 1 000 000.
Then we would likely need Unicode U+202F, NARROW NO-BREAK SPACE (NNBSP), as the separator.
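For what it's worth, the replace trick from earlier in the thread works just as well with U+202F. Here is a minimal sketch; group_digits is only a name made up for illustration, not something from the standard library:

```python
NNBSP = "\u202f"  # NARROW NO-BREAK SPACE


def group_digits(value, sep=NNBSP):
    """Hypothetical helper: group the integer digits of value in threes."""
    # The built-in "_" option does the grouping; we only swap the character.
    return format(value, "_").replace("_", sep)


print(group_digits(1000000, " "))   # 1 000 000
print(repr(group_digits(1000000)))  # '1\u202f000\u202f000'
```

Only the digits before the decimal point get grouped, because that is all the built-in "_" option does.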
As programming languages generally don’t like spaces in numbers, an underscore is a good compromise.
?? They typically don’t like commas either, but this is about displaying the numbers, not typing them.
I developed a PyPI package called sciform to address this and other scientific formatting requirements. Separators for the upper digits (digits in decimal places above the decimal symbol), separators for the lower digits (digits in decimal places below the decimal symbol) and the decimal separator can be controlled using the upper_separator, lower_separator and decimal_separator options respectively. Here’s code demonstrating one way to access these options:
from sciform import GlobalOptionsContext, SciNum

num = SciNum(123456.654321)

print("Only decimal separator:")
print(f"{num}")

print("Space separators:")
with GlobalOptionsContext(upper_separator=" ", lower_separator=" "):
    print(f"{num}")

print("Point decimal separator:")
with GlobalOptionsContext(upper_separator=",", lower_separator="_"):
    print(f"{num}")

print("Comma decimal separator:")
with GlobalOptionsContext(upper_separator=".", decimal_separator=",", lower_separator=" "):
    print(f"{num}")
gives results:
>>> Only decimal separator:
>>> 123456.654321
>>> Space separators:
>>> 123 456.654 321
>>> Point decimal separator:
>>> 123,456.654_321
>>> Comma decimal separator:
>>> 123.456,654 321
The SciNum object exposes a format specification mini-language (FSML) which is loosely an extension of the Python built-in FSML*. In previous versions of sciform I considered doing as you suggest and including control for the various separators in the FSML. However, there got to be way too many options, and the FSML had to become very complex to ensure an unambiguous reading of all specification strings (see the comments above about how the space character " " already plays a role in the FSML). In the end my conclusion was that the FSML was already very complicated and it wouldn’t be worth it to complicate it further with these options. Instead, it would be better to configure these options at some higher level of configuration than the FSML.
Here’s a github issue where I discussed this in some further detail.
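To illustrate what configuring the separators above the FSML could look like without any third-party package, here is a rough sketch. The Grouped class and its parameter names are hypothetical (the names are borrowed from the options above, but this is not how sciform is implemented). It assumes the spec passed in uses the built-in "," grouping option and simply translates the result:

```python
class Grouped:
    """Hypothetical wrapper: separators are set on the object, not in the spec."""

    def __init__(self, value, upper_separator=" ", decimal_separator="."):
        self.value = value
        # One-pass translation table, so swapping "," and "." cannot collide.
        self._table = str.maketrans(
            {",": upper_separator, ".": decimal_separator}
        )

    def __format__(self, spec):
        # The built-in mini-language still handles precision, sign, etc.
        return format(self.value, spec).translate(self._table)


print(f"{Grouped(1234567.891):,.2f}")            # 1 234 567.89
print(f"{Grouped(1234567.891, '.', ','):,.2f}")  # 1.234.567,89
```

The format spec stays exactly what the built-in FSML already accepts, so nothing new has to be parsed. One caveat of this shortcut: a "," or "." used as the fill character would get translated too.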
sciform was in part motivated by the discussion here on Python discourse: New format specifiers for string formatting of floats with SI and IEC prefixes. Near the end of that thread the conversation was moving in the direction: “This is a good idea for a pypi package, it doesn’t need to be built into python.” After having spent time thinking about this, I think that is how I lean on this idea as well.**
*It is not strictly an extension since there are some features in the Python FSML which are deliberately not available in the sciform FSML.
**I do want to emphasize that I still think some of the suggestions in the other thread (most importantly better controlled sig fig rounding) would be worth including in python. Just trying to say that this specific feature would make the FSML too hairy for what is gained.