Let int accept scientific notation strings, like float does (but with non-negative exponents only). This would allow scientific notation strings (e.g., command-line arguments specifying big integers) to be parsed directly without the loss of precision incurred by converting via float (e.g., int(float('1e23')) != 10**23).
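A quick way to see the precision loss, as a minimal sketch: the float nearest to 1e23 is not exactly 10**23, so the round trip through float lands on a different integer. The decimal.Decimal comparison is included only for illustration and is not part of the proposal.

```python
from decimal import Decimal

via_float = int(float("1e23"))      # goes through a 64-bit binary float
exact = 10**23                      # exact integer arithmetic
via_decimal = int(Decimal("1e23"))  # Decimal parses the string exactly

print(via_float == exact)    # False: the float is only the nearest representable value
print(via_decimal == exact)  # True
```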
I'm +1 on this. I just noticed this (probably not for the first time) when prototyping a significant figures function in another ideas thread.
Yes, it's not a common use, and maybe you are using the "wrong" type, or the wrong input method, but as ints can be passed to the "e" formatter, it would be nice to round trip.
And why not allow 10e-1?
More seriously, this feels like it might break consistency between the literals supported by the language (where 1e1 is a float) and those supported by int().
It also seems potentially confusing with hex numbers.
Maybe it should require a flag? Or be a different function altogether (maybe in math)?
Actually, I think "10e-1" would be fine, as long as the value is an integer, though maybe there is no other place where the value, rather than the form, of a numeric string would raise a ValueError. E.g., 1e4444 doesn't give you a ValueError; it gives a float with the value inf.
Yeah, this is of greater concern, though I'm having trouble coming up with any way this inconsistency would lead to actual confusion or incorrect behavior. Though my not thinking of it doesn't mean much…
I don't know that either of these would be worth it. I don't know about the OP, but I think for the most part, you wouldn't know that you're getting exponential form when you write the code.
Hmm, there is already some inconsistency in how literals and int string parsing are interpreted:
In [9]: int('012')
Out[9]: 12
In [10]: 012
  Cell In[10], line 1
    012
    ^
SyntaxError: leading zeros in decimal integer literals are not permitted; use an 0o prefix for octal integers
Granted, that's an error rather than a different interpretation, but IIRC, it wasn't always an error.
So maybe 1.2e3 being evaluated as 1200.0 and int('1.2e3') being evaluated as 1200 wouldn't be any more surprising, and they would have equal values, as long as the value was within the float range.
I am -1. It will break code like:
try:
    x = int(s)
except ValueError:
    x = float(s)
Note also that 1e23 != 10**23, so int('1e23') would not be equal to float('1e23').
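To make the compatibility concern concrete, here is the same fallback wrapped in a function (the name parse_number is only illustrative). Today a string such as '1e3' fails int() and falls through to float(); if int() started accepting it, the same code would silently return an int instead.

```python
def parse_number(s):
    """Return an int when int() accepts the string, otherwise a float."""
    try:
        return int(s)
    except ValueError:
        return float(s)

print(type(parse_number("12")))   # <class 'int'>
print(type(parse_number("1e3")))  # <class 'float'> today, because int("1e3") raises ValueError
print(1e23 == 10**23)             # False: the float and the exact integer differ
```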
Surely that doesn't matter? If you want to allow exponential notation, use the new function. If you don't, use int. What to allow should definitely be your choice.
Of course, once it's a separate function, we have the debate of why not write your own, why not publish it on PyPI rather than in the stdlib, etc.
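For a sense of what "write your own" might involve, here is a minimal sketch using only integer arithmetic. The name parse_sci_int and the accepted grammar (optional sign, digits, optional fraction, optional exponent) are assumptions made for illustration, not anything proposed for the stdlib. It accepts inputs like "10e-1" or "1.2e3" as long as the value is an integer.

```python
import re

# Optional sign, digits, optional fraction, optional exponent.
_SCI_INT = re.compile(r"([+-]?)(\d+)(?:\.(\d*))?(?:[eE]([+-]?\d+))?")

def parse_sci_int(s):
    """Parse a decimal string with an optional exponent into an exact int.

    Hypothetical helper. Raises ValueError if the string is malformed or
    if its value is not an integer.
    """
    m = _SCI_INT.fullmatch(s.strip())
    if m is None:
        raise ValueError(f"invalid literal: {s!r}")
    sign, whole, frac, exp = m.groups()
    frac = frac or ""
    digits = int(whole + frac)         # all significant digits as one int
    shift = int(exp or 0) - len(frac)  # net power of ten still to apply
    if shift >= 0:
        value = digits * 10**shift
    else:
        value, remainder = divmod(digits, 10**-shift)
        if remainder:
            raise ValueError(f"not an integer value: {s!r}")
    return -value if sign == "-" else value

print(parse_sci_int("1e23") == 10**23)  # True
print(parse_sci_int("1.2e3"))           # 1200
```

One caveat with any such helper: a very large exponent in untrusted input makes it build a very large integer, so real code would probably want an upper bound on the exponent.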
Currently, int(s) is int(s, base=10), and int('1e23', base=n) is invalid for n < 15 and valid, with e as the digit of value 14, for n >= 15. Both should remain true. A new flag sci=True, mutually exclusive with base, could work.
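The current behavior described here can be checked directly. This short sketch only exercises the existing int() and does not model the proposed sci flag.

```python
print(int("1e23", base=16))  # 7715, the same as 0x1e23
print(int("1e23", base=15))  # accepted: 'e' is a valid digit from base 15 upwards

try:
    int("1e23", base=14)     # digits in base 14 stop at 'd'
except ValueError as exc:
    print("ValueError:", exc)
```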
Well, this is horrible, but works:
>>> int(eval("1e1"))
10
but
>>> int(eval("1e23")) == 10**23
False
anyway.
Except that, no, it doesn't work. That's just int(float(x)) from the OP but in a slower and more risky way.
I agree a flag is a more practical choice to not break any previous existing code.
I really dislike the editing feature. I can't quote the entire message because it'll get removed, and if I quote less than all of it, nobody knows which version I replied to.
But I was actually responding to the post-edited version. You came up with something that's exactly as wrong as the original int(float(x)), but with a new set of problems since it uses eval. So you were half right. It is horrible. It just doesn't work.
########
EDIT2
########
Christopher has a more elegant solution for checking if the new flag is correct: if sci and base > 15: raise
########
EDIT
########
Anyway, maybe this flag is too specialized. Maybe a new math function is better, as suggested by the BDFL.
########
Original post:
########
As I said, I feel the idea is good, but if sci has to be mutually exclusive with base, does this mean that
int("1e23", base=16, sci=False)
int("1e23", base=10, sci=True)
are both illegal? Is it not enough to check if sci and base != 10?
Is there anything fundamentally wrong with using scientific notation with other bases? weird maybe, but wrong?
OK, base 15 and above use "e" as a digit, so not good. So yeah, disallow it. I suppose you could check for sci and base < 15.
NOTE: for those to whom it's not obvious, two reasons that int(float(a_string)) is not a good solution are:
truncation of non-integer values:
In [9]: int(float('1e-3'))
Out[9]: 0
overflow to inf for very large numbers:
In [10]: int(float('1e500'))
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
<ipython-input-10-540b4755c49c> in <module>
----> 1 int(float('1e500'))
OverflowError: cannot convert float infinity to integer
I think the truncation is worse, or at least would be hit more often.
-CHB
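One existing stdlib route that avoids both problems, sketched under the assumption that rejecting non-integer values is the desired behavior: fractions.Fraction parses decimal strings with exponents exactly, so the result can be checked for a denominator of 1. The helper name exact_int is hypothetical.

```python
from fractions import Fraction

def exact_int(s):
    """Convert a decimal/scientific string to int, refusing non-integers.

    Illustrative helper (hypothetical name). Note that Fraction also accepts
    strings such as "4/2", which may or may not be wanted.
    """
    value = Fraction(s)  # exact rational, no binary rounding
    if value.denominator != 1:
        raise ValueError(f"not an integer value: {s!r}")
    return value.numerator

print(exact_int("1e500") == 10**500)  # True, no OverflowError
print(exact_int("1e23") == 10**23)    # True, no precision loss
try:
    exact_int("1e-3")
except ValueError as exc:
    print(exc)                        # rejected instead of truncated to 0
```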
Guido van Rossum (guido, https://discuss.python.org/u/guido), CPython core developer, wrote on February 5:
Maybe it should require a flag? Or be a different function altogether
(maybe in math)?
I can see the point in having a better way to express very large
integers, but the E-notation is closely tied to floating point numbers
and so people who write 1e23 expect to get a float and not an integer.
As a result, having int("1e6") work and int("1000000.0") fail would be
inconsistent.
There doesn't appear to be a notation similar to the E-notation for
large integers and inventing one for Python (e.g. "1L23") would again
confuse people.
So why not simply use a helper function, e.g.
def largeint(x, e):
    return x * 10 ** e
>>> largeint(1, 23)
100000000000000000000000
For that matter, why not simply use an expression? It avoids the function call overhead, and as was pointed out earlier, the peephole optimiser even removes the overhead of doing the calculation at runtime where possible:
BIG_LIMIT = 10 ** 23
(Anyone claiming that 10 ** 23 is less readable than 1e23 is drifting very much into subjective opinion territory).
It gets a little harder when you're not working with a plain power of ten though.
MASS_OF_EARTH = 5_972 * 10 ** 21
MOLE_QUANTITY = 602_214_129 * 10 ** 15
def mole_of_moles():
    print("If one small furry animal weighs 75g...")
    mole_mass = MOLE_QUANTITY * 75 // 1000
    print(mole_mass, "kg of moles")
    print(MASS_OF_EARTH, "kg of earth")
    print("This planet weighs", MASS_OF_EARTH // mole_mass, "moles of moles.")
… if anyone asks, I did not tell you it was ok to write code like this.
How would it be easier with some sort of 5972e21 notation, though? Are you assuming that 5.972e24 should be treated as an int?
No, I'm not, because that includes a decimal point. (I suppose you could argue that, if the exponent exceeds the number of digits after the decimal, it could be stored as an int, but I'm not proposing that.) But if it were written as 5_972e21 then perhaps yes, it could be stored as an int.
So what's "a little harder" with the version using 5_972 * 10 ** 21 then? I feel like I'm missing your point.