PEP 791: imath --- module for integer-specific mathematics functions

There’s a reason my name isn’t on the PEP :frowning: - I don’t want to sabotage its slim chances by association.

Which is my standard answer for many things too. But it’s a balancing act, and “even Raymond” added itertools.pairwise() in 3.10. Before 3.10, it was just a recipe in the itertools docs. It’s that combination of “frequently requested” and “frequently screwed up” again. Which applies to ceildiv() too, whose C implementation would be the tiniest variant of int floor division.
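
To illustrate the “frequently screwed up” part, here is a minimal Python sketch. The name ceildiv() is the one used above, but the signature and the pure-Python body are assumptions for illustration, not anything the PEP specifies. It puts the floor-division trick next to the tempting float-based version, which can silently lose precision once an operand no longer fits exactly in a float.

```python
import math

def ceildiv(a: int, b: int) -> int:
    """Ceiling of a / b using only integer arithmetic (hypothetical helper)."""
    # Floor division rounds toward negative infinity, so negating twice
    # turns a floor into a ceiling without ever touching floats.
    return -(-a // b)

def ceildiv_via_float(a: int, b: int) -> int:
    """The common mistake: a / b is rounded to a float before ceil() sees it."""
    return math.ceil(a / b)

big = 2**53 + 1  # not exactly representable as a float
print(ceildiv(7, 2), ceildiv(-7, 2))     # small cases
print(ceildiv(big, 1) == big)            # integer-only version stays exact
print(ceildiv_via_float(big, 1) == big)  # float version has already rounded
```

The integer-only version works for ints of any size, which is presumably why it is described as a tiny variant of floor division.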

13 Likes

Replying to multiple comments:

@vstinner is right. Add math.integer for new functions, leave the existing ones in place and alias them in math.integer. Experienced developers understand that any long-lived language is going to have some historical artifacts. The disruption to existing code outweighs being consistent for consistency’s sake.

A foolish consistency is the hobgoblin of little minds, adored by little statesmen and philosophers and divines. -Ralph Waldo Emerson

The great unmentioned value of having functions in the stdlib vs PyPI is curation. Anything in the stdlib has no doubt been bikeshedded to death, er, thoroughly evaluated by the core developer community and can be assumed to be well done and highly likely to be bug free.

PyPI, on the other hand, lets people like me push packages:

I would never want to belong to any club that would have me as a member. -Groucho Marx

For a well-known module like requests or numpy with significant functionality, that’s fine. For a single math function, finding the best thing on PyPI is not simple. Searches will often find that the most logical name was a single upload from 2016. The user has to sort through multiple packages, check the GitHub stars on each, worry about malware, and do a test install to see if the package actually does what its documentation says it does.

Therefore, if a function:

  • is likely to be used by many people
  • has a logical, stable signature
  • will not require regular updating in the future

it’s of great benefit to the community to be in stdlib.

11 Likes

You can say that about most of the proposals on here that get the “just put it on PyPI” treatment. What makes these number theory functions [1] sufficiently core functionality to be put above all those other proposals? And what makes pip/PyPI so much less accessible to number theory users than it is to any other field of engineering that frequently uses Python but isn’t made up of software developers?


  1. most of which I’ve never even heard of, let alone use frequently

2 Likes

That’s a pretty bad solution IMO. To make use of those recipes, you need to know they’re in the docs (your IDE won’t help you) and copy them to your codebase. And if you just copy it, your linter may disagree with the style choices, so you need to fix the style or disable the linter for those functions. And then you might need to exclude it from code coverage, because your code does not hit all the edge cases in the function. Nowadays, the docs at least mention the existence of more-itertools on PyPI, but it would be much better to just make them part of the itertools module.

(As for the imath module, I don’t see the point in adding it.)

4 Likes

Half of them are one-liners. How much grief can a linter cause over one line of already fairly conventionally formatted code? Is deleting the one if statement that you don’t use such a terrible burden?

If you’re copying one function, it’s probably not a problem. Until you actually end up needing that edge case in the future. Copy-pasting is also not made easier by the use of from imports (including an implicit from itertools import *).

1 Like

It’s not a zero-sum game. There’s scant sense in which math functions “compete” with, say, new functions for new compression standards. Each proposal needs to stand or fall on its own merits.

@gerardw gave a fine list of criteria:

In that respect, I’d probably be much more willing than Raymond to add simple functions to itertools. But then I don’t feel compelled to write everything in C, and would be happy to leave many in plain Python. He already did the latter by including them as “recipe” examples in the docs. Why not stick them in a .py file too so end users can access them without repetitive copy/paste in project after project?

In part because Raymond is also an instructor, and includes some “recipes” purely for pedagogical purposes. There are always different goals in play.

Nothing, of course. I’ve never, e.g., used the csv module, but have no objection to including it for those who do want it. Quite the contrary.

2 Likes

Centuries of use. It’s not just another compression format that will eventually perish.

Regardless of the fate of this PEP, the math module will eventually include some new integer-specific functions, probably from among those proposed.

Have you read what the term “soft deprecation” means? I don’t see the difference from your proposal, unless you come up with more details. Did you plan, for example, to leave the current docs (I mean literally) for the old functions in the old place?

With soft deprecation we can point users to a canonical place for such stuff in the stdlib. No disruption has been planned so far.

And such functions will end up being reinvented again in user packages, as shown above for sympy/mpmath.

1 Like

Side note: if you do this, then please call it intmath. imath looks like a spelling error, or an Apple product.

11 Likes

Seems to me the recipes are meant to be learned from, not blindly copied and pasted.

2 Likes

What matters in the end is what people do, not what the author intended :slightly_smiling_face:.

This is nuanced, and the itertools docs are mostly very clear about this: most recipes are intended to be educational, but they’re also intended to be an “incubator”: candidates for “promotion” to advertised module functions. That process is ongoing. The current docs identify 3 former recipes that are now supported module functions, and 3 more that are being “tested” for promotion. What the docs aren’t clear about is what this testing consists of.

The docs also note that the PyPI more-itertools package bundles nearly all of our docs’ recipes, and more. If the PyPI stats are to be believed, more-itertools is downloaded millions of times per day, so it has quite a large audience.

sieve(n) is one of the recipes up for promotion, to generate the primes less than n. If we had an imath module years ago, I’m sure it would have already been added there. It really has nothing to do with “iterator algebra” (so doesn’t really belong in itertools), and the recipe computes all the primes < n before yielding any.

Sieving is fine, but a “serious” implementation would use less RAM, yield primes as soon as they’re identified, and ideally wouldn’t require the user to specify an upper bound. But an “industrial strength” implementation wouldn’t fit in 6 lines either. So that recipe (IMO) should remain purely pedagogical. Nevertheless, it’s much better (more efficient) than what most people come up with on their own (this has been reinvented countless times, and I’ve seen many dozens of them over the decades).

Ironically, more-itertools bundles an earlier version of sieve(), which is a bit harder to follow, but yields primes (in batches) as it goes along. They all suffer from twisting the problem so that iter_index() can be used in the implementation. “The obvious” way to do this would explicitly iterate in a loop, and yield primes one at a time ASAP. Which is something PyPy can optimize.
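
For concreteness, here is a minimal sketch of what “the obvious” loop-based approach could look like. This is not the itertools recipe nor the more-itertools code. The name primes_below() is made up for illustration, and it still needs an upper bound and RAM proportional to n, so it is not the “industrial strength” version either.

```python
def primes_below(n):
    """Yield the primes < n one at a time, as soon as each is identified.

    Hypothetical helper for illustration only.
    """
    if n <= 2:
        return
    # One flag per integer in [0, n); 1 means "still a prime candidate".
    is_candidate = bytearray([1]) * n
    is_candidate[0] = is_candidate[1] = 0
    for p in range(2, n):
        if is_candidate[p]:
            # Every smaller prime has already crossed off its multiples,
            # so p is known to be prime right now.
            yield p
            # Cross off multiples of p, starting at p*p (smaller multiples
            # were already handled by smaller primes).
            for multiple in range(p * p, n, p):
                is_candidate[multiple] = 0

print(list(primes_below(30)))
```

A caller can stop consuming early (say, after the first few primes) and the crossing-off work for larger primes is never done.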

2 Likes

So far we have the following variants:

  • imath
  • intmath
  • ntheory
  • zmath
  • integermath
  • imaths
  • dmath
  • math.ntheory
  • math.integer
  • math.int
  • math.discrete

Let me know if I missed something or you would like to add something else. I’m planning to start a poll with multiple choices (3 would be enough, or more?:)).

3 Likes

I’m not so sure about it myself, but maybe dmath (as in “discrete math”), or math.discrete, could also be worth considering.

1 Like

“The best” poll would use STAR voting :smiling_cat_with_heart_eyes:. Short of that, approval voting would be best: put in every suggestion ever made, and allow everyone to vote for all they could live with (in Discourse-speak, a poll with “multiple choices” and “max” set to the total number of candidates).

Approval voting aims at uncovering compromises people will make in the service of building a consensus. That’s why the PSF uses it.

For example, it makes little real difference to me whether, e.g., it’s imath or intmath. In a STAR poll I’d give the former 5 stars, and the latter 4 stars. In Approval, I’d approve of both. And more.
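
A rough sketch of how an approval tally works, in case it helps. The ballots below are invented purely for illustration and are not actual votes from this thread, and approval_tally() is a hypothetical helper.

```python
from collections import Counter

def approval_tally(ballots):
    """Count approvals: each ballot lists every name the voter can live with."""
    counts = Counter()
    for ballot in ballots:
        # set() guards against a name being counted twice on one ballot.
        counts.update(set(ballot))
    # Highest approval count first.
    return counts.most_common()

# Made-up ballots, only to show the mechanics.
ballots = [
    ["imath", "intmath"],
    ["intmath", "math.integer"],
    ["intmath"],
]
for name, approvals in approval_tally(ballots):
    print(name, approvals)
```

Nothing stops a voter from approving every candidate or just one, which is the point: there is no limit to game around.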

4 Likes

math aka the status quo is missing.

8 Likes

Sorry, this doesn’t work on d.p.o, unless I missed something new.

Yeah, it probably does make sense. I now see no good reason to restrict the number of choices.

Sorry, I don’t think it belongs on that list, for at least two reasons:

  1. it’s off-topic: the poll is about naming the new (sub?)module, not about the fate of the PEP.
  2. I don’t think that polling is a good tool for making technical decisions like accepting or rejecting a PEP. We have the SC to do this, for good reasons (perhaps to sometimes accept unpopular ideas, or reject popular ones).

(Anyway, feel free to start a different poll.)

IMO, this thread is to gather feedback, resolve open issues (hence, the poll), and collect arguments for and against the PEP. And so far there are not too many arguments in the second camp: “let’s not make things backwards incompatible with very little benefit”. But the PEP does not make things backwards incompatible! So this is rather a weak argument, isn’t it?

5 Likes

No, no support for STAR here. I could set up a poll on a STAR internet voting service, but don’t think it’s worth the bother - and would be more inconvenient for everyone.

Restricting the number of choices is generally a Bad Idea™. One of the joys of pure (no limits) approval voting is that there’s no such thing as a “wasted vote”. People are free (& encouraged) to express what they really believe. With limits, people can be pushed into betraying their true favorites via fear of “wasting a vote” on a favorite they don’t believe can win.

So the results tend to reflect consensus about what people think will win rather than what they really want. For example, I like names with ntheory too, and will approve of them, but won’t if I only get (say) two choices.

Limiting the number of choices has no upsides I know of, only downsides. In the limit, it reduces to plurality (“pick one!”) voting, one of the worst voting systems in existence for contests with more than two contenders.

Pure approval has the potential to show true levels of support. Although it doesn’t in reality - people are so conditioned, by plurality voting, to believe that they “must” try to game the system, and so “cleverly” cast insincere votes (for pure approval, usually via insincerely withholding approval from choices they’re actually fine with - gamers gonna game, but “honesty is the best policy” under approval).

8 Likes

My existing imath module won’t work after you apply this PEP, therefore it makes things backwards incompatible. The argument doesn’t have to be any more complex than that.

The PEP doesn’t show any arguments in the first (“for”) camp, either. Simply waving your hands and saying that it doesn’t break anything isn’t an argument for making changes, it’s actually an argument against making changes (because if they don’t change anything, where’s the benefit?).

At least make a good list of functions to add, so that it seems like we’re getting some value here. Or see my earlier posts for other, actionable suggestions on how to make this a better PEP without actually changing your designs at all. That’s all we’re asking - make the PEP sell the addition, not merely say “we can, therefore we must”. That’s a bit too authoritarian. Go reread the matmul PEP for a great example of doing it well. (Also the secrets module PEP is a good one for “it’s worth adding a module for a set of functions that could theoretically exist elsewhere”.)

5 Likes

This is a name conflict, for a specific name proposal. There are others. It’s not an argument against the PEP itself.

Some name conflicts are mentioned in the current PEP text; others will be in the poll. I realize that open source projects are just a part of the Python ecosystem. There will be name clashes with private and/or pet projects, especially for 5-letter names. But there is always an option for a new top-level stdlib module: pick another name.

There is the Motivation section, which lists them.

Do you have suggestions on how to fix the documentation issues mentioned, given the current module layout? It’s not just the preamble, e.g. should every integer-related function specify that it rejects non-integral arguments? Some do, some don’t.

I would like to avoid this (i.e. this shouldn’t be a part of the specification). The PEP will list some probable candidates, but I think this should be solved separately, on a case-by-case basis.

Serhiy did a good job of showing how fast this part of the math module grows. No doubt this will happen again.

PS:
Based on the existing issue, I would pick ilog2() (and maybe ilog10()) as a reasonable first addition for the new module, over the current undocumented features of math.log() and friends when processing integer arguments. (Which should be deprecated, IMO.)
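
To make the idea concrete, a minimal pure-Python sketch of what an ilog2() could do. The signature and error handling here are assumptions, since nothing in this thread specifies them. It relies only on int.bit_length(), so it never goes through a float.

```python
def ilog2(n: int) -> int:
    """floor(log2(n)) for a positive int, computed exactly.

    Hypothetical signature; the proposal does not pin one down.
    """
    if n <= 0:
        raise ValueError("ilog2() requires a positive integer")
    # bit_length() is the number of bits needed to represent n,
    # so it is exact for ints of any size.
    return n.bit_length() - 1

n = 2**64 - 1
print(ilog2(n))                   # exact integer result
print(float(n) == float(2**64))   # n itself is rounded when converted to float
```

The second print shows why a float-based route is fragile for large ints: once n has been rounded to the nearest float, the information needed for an exact floor is already gone.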

Um… change the documentation? Yes, each function should specify whether it works on floats or not. Yes, it may get repetitive, but that’s the point of reference documentation. Make new sections in the documentation for “functions that work on all numbers” vs “functions that always convert to float to calculate” vs “functions that require integers”. It’s very easy to fix documentation, no PEP required.
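
As a quick way to see which category a function falls into, here is a small sketch. It assumes CPython 3.8 or later (where math.isqrt() and math.comb() exist); rejects_float() is a made-up helper, not part of any module.

```python
import math

def rejects_float(func, *args) -> bool:
    """Return True if calling func(*args) raises TypeError."""
    try:
        func(*args)
    except TypeError:
        return True
    return False

# Integer-only functions refuse float arguments, even integral ones.
for name, args in [("gcd", (4.0, 2)), ("isqrt", (16.0,)), ("comb", (4.0, 2))]:
    print(name, rejects_float(getattr(math, name), *args))

# Float-oriented functions accept ints and convert them.
print("sqrt", rejects_float(math.sqrt, 16))
```

This only shows behavior, of course; the point above is that the reference docs should say the same thing for each function.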

Every other successful proposal like this one has included the API of the proposed module. If you don’t want a full list, start with a few categories that are under-served right now, have multiple uses (i.e. are better in the stdlib than a specialised library), and define a policy for the module that sets out when and how to decide when a function belongs in the new module rather than an existing module. It’ll make all those case-by-case decisions much easier if you set up the guidance now, and will make it less obvious that the PEP hasn’t specified a module API.

It’s not in the PEP. That’s the problem. I’m not doubting there is value[1], but I’m 100% certain it’s not in the PEP. Put it in the PEP and my complaint goes away.

Also not in the PEP.

I’m not trying to make things complicated here. I just want to make sure your PEP is as strong as it can be, so that it’s as persuasive as it can be, and so your proposal has its best chance of succeeding.[2] Bear in mind that the SC are not going to read this entire thread (especially once they see Tim start talking about voting :wink: ), and so if it’s not in the PEP, they are going to assume it doesn’t exist.


  1. I’m personally skeptical that it’s worth the churn, compared to a submodule or fixing the docs, but that’s beside the point I’m making here.

  2. Which I’ll then argue against on other grounds, since I don’t think it’s worth it overall. But I’ll happily save that for once you’ve got the “for” argument set out clearly.

16 Likes