It would be nice if the `next()` function had a `count` argument

Sure, I can provide further details but it’s pointless.

  • There are 5 types of row
  • 3 of them are of type ‘T’, ‘F’ or ‘C’
  • One of the tasks is to find all sequences of 3 rows which are a ‘T’ followed by a ‘F’ and then a ‘C’
  • For such rows, there are approx 10 other columns where we need to check for consistency, meaning the values should match
  • Except for the ‘T’ row, where not all data is present so not all of the columns are a match
  • Additionally, some columns are never a match
  • For one column, there are three values ‘A’, ‘B’ and ‘N’. For the ‘T’ row, the value should be ‘A’ if ‘F’ and ‘C’ rows are ‘B’, it should be ‘B’ if ‘F’ and ‘C’ rows are ‘A’, and if ‘T’ is ‘N’ then there’s some other special logic (I forget exactly what) which has to be run

Hopefully it is now obvious why I didn’t provide these details. It’s just a set of arbitary, complex but not complicated, rules.

This data comes out of the back of someone else’s order matching engine, so it’s really just a log of operations which needs special filtering logic to extract important features.

It is possible to process this data in a vectorized way. I implemented that first. It’s faster, but I abandoned it because it’s hard to debug, and hard to understand if it’s doing the correct thing. It also requires creating literally dozens of pseudo columns to hold intermediate results which results in problems trying to name all of those new columns. The names are important for implementing algorithms to operate on the columns in a systematic way. This kind of thing is also hard to write tests for, because the number of possible combinations is huge.

I must admit - in terms of testing, this is a weird situation. I essentially have the output from some system. I am processing this data to get the input. If I feed the input into a similar system, I should get the same output. That’s my testing.

We are firmly in “help” territory now. Moved from “Ideas”.

2 Likes

No it isn’t. At no point did I ask for help with this particular problem, which I have already solved.

Edit: posted this as reply to my previous comment, but it was meant to be a standalone comment.

To restore the momentum of the idea, the complications that IMO are significant include how the new parameter would interact with existing ones, and how would it possibly slow down the overall operation of next.

As additional support for the idea, clearly next doesn’t have a strict single responsibility of calling iterator.__next__— it may go on and return default. In this way, there is precedent to drop in tuning parameters.

Python Help is really also the general discussion category. Ideas is basically intended for potential PEPs and other changes, and at this point this thread is not going in that direction. In a sense you did ask for help for your problem: you asked the core devs to add something to Python that would help you[1].

The naming and intent of the discussion categories has been difficult to communicate and there have been various discussions about that problem.


  1. a solution which would solve your problem in a couple years ↩︎

1 Like