Syntax for Generator iterables using angle brackets

I don’t know if this has been discussed before (could not find anything), but I was wondering whether angle brackets (<>) could be used for defining a generator.

Would this syntax be completely anti-Pythonic?

queries = <"abc", expensive_func(), "something else">

My current use-case and workaround is something like this (not pretty, but it works):

queries: Iterator[str] = (lambda: (
    (yield "abc"),
    (yield t if (t := expensive_func()) else None),
    (yield "something else"),
))()

Have a lovely day folks.


For completeness, this is how it is currently done:

def queries() -> Iterator[str]:
    yield "abc"
    t = expensive_func()
    if t:
        yield t
    yield "something else"

I suspect it’s not possible because =< already means “lower than equal”

It would be nasty imo if a space between two symbols changed the meaning of a statement.

And honestly I like the proper solution in your second post. Very readable. Not too long. Not much boilerplate.


@elis.byberi Yes, thanks - nice to include that.

@petercordia You’re probably right. And do you prefer my lambda or Elis’ def queries()? :sweat_smile:


No, I actually meant @elis.byberi's solution. Guess I wasn’t very clear.

You still have to call the generator function, of course.

def queries():
    ...

queries = queries()

or using a decorator hack:

@lambda x: x()  # Or operator.call
def queries():
    ...

<= is less-than-or-equal. =< is two tokens with or without the space, which currently produces a syntax error.


This would make the following expression ambiguous:

a<<b>>c

Is it:

a < <b> > c

Or:

a << b >> c

I think it would still be a, <<, b, >>, c, using greedy tokenization just like 3.__add__ is a syntax error involving the tokens 3. and __add__, rather than a bound method involving the integer 3.

But it’s one thing to have ambiguous-looking constructs in the language, and another to intentionally introduce more.
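The tokenization claims in the last few posts can be checked with the standard library's tokenize module. This is only a small sketch: the helper name token_strings is made up for illustration, and it looks purely at how today's tokenizer splits the text, not at whether the result would parse.

```python
import io
import tokenize


def token_strings(source: str) -> list[str]:
    """Return the NAME and OP token strings that tokenize produces for source."""
    readline = io.StringIO(source + "\n").readline
    return [
        tok.string
        for tok in tokenize.generate_tokens(readline)
        if tok.type in (tokenize.NAME, tokenize.OP)
    ]


# Greedy matching picks the two-character shift operators.
print(token_strings("a<<b>>c"))

# "<=" is a single operator token, while "=<" is split into "=" and "<".
print(token_strings("x <= y"))
print(token_strings("x =< y"))
```

So the tokenizer itself would not be confused by a<<b>>c; the concern that remains is how a human reads it.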


It seems like the angle brackets syntax is out of the question due to its ambiguity.

Maybe the question should be whether there is a need to be able to define such a Generator iterable (without using function definitions)?

I don’t think you’ve presented a sufficiently compelling argument in favour yet. The use case is very limited, and the existing approaches (use a list if nothing’s expensive to compute, or a generator function if they are) seem sufficient to me.


Martin,

Others have pointed out that although your idea for a useful iterator is decent, using angle brackets is not a good alternative. Someday, Python may decide to open up and allow UNICODE symbols that are not in ASCII and we can choose from oodles of paired bracket-like things to the point where nobody can read our code.

But I suggest what you are asking for may already be in some collection such as itertools or easily be added if it makes sense into a module that can be used by anyone.

First, let me make sure of the requirements without writing any python code.

It sounds like you want to take a list that can contain any combination of objects, or perhaps only text strings, interspersed with variables that contain a function that takes no arguments. NOTE you may have made a mistake in attaching a pair of parentheses here:

queries = <"abc", expensive_func(), "something else">

You are asking python to evaluate expensive_func() before the iteration starts. Iterators are supposed to delay all kinds of evaluation until that item is reached, and can be abandoned before then. Just provide the name of the function with no following parentheses.

So here is a concept of a family of simple generators you can create. Some can be done as simply as putting an expression in parentheses that makes them into a generator expression, but perhaps creating an actual function containing a yield statement works better and simpler.

All the scenarios I mention below have much in common. A function that takes a first argument that is a list. Some may take second or additional arguments.

The simplest scenario has the iterator function loop on the contents of the list and check what the type of the current object is. If it is a function, invoke it and return whatever it returns. For anything else, just return it.

One variant is to skip items that return something like a NULL. Your variant seems to filter out anything that is not truthy and will ignore a function that returns False, or “” or perhaps 0. I am not sure this was your intent. And you replace it with a NULL.

If you want to be a bit more careful, you can allow optional arguments to tailor your need like providing a DEFAULT to return or even providing a function name that takes the result of the first function and decides whether it gave an appropriate result or not.

Yet other functionalities can be considered such as whether to do something recursive if your function itself returns a list.

There are many such changes and scenarios including the ability to call functions that take additional arguments and having your iterator take an argument or two that it passes along.
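A minimal sketch of the simplest scenario plus the skip variant might look like the following. The names lazy_items and skip_falsy are hypothetical (nothing like this is claimed to exist in itertools), and the sketch assumes that every callable in the sequence takes no arguments.

```python
from typing import Any, Iterable, Iterator


def lazy_items(items: Iterable[Any], skip_falsy: bool = False) -> Iterator[Any]:
    """Yield plain items as they are; call callables only when they are reached."""
    for item in items:
        if callable(item):
            result = item()  # evaluated now, not when the sequence was built
            if skip_falsy and not result:
                continue  # drop None, "", 0, False, ... if requested
            yield result
        else:
            yield item


calls = []


def expensive_func() -> str:
    calls.append("called")
    return "expensive result"


queries = lazy_items(["abc", expensive_func, "something else"])
first = next(queries)   # expensive_func has not run yet
second = next(queries)  # now it has
```

Because the function object is passed without parentheses, a consumer that stops after the first item never pays for expensive_func.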

Once you have one or more such generator functions, how hard would it be to use them like this:

todo = list("this", func1, "that", func2, "done", True)
my_list_plus_func_generator(todo, default=NULL)

This needs no brackets and can be easily enough to read if you choose names well.

Just a thought. I would not be shocked if something along these lines has been done in a place like itertools.

Avi

Except list doesn’t take an arbitrary number of list elements as arguments. You need to pass an iterable.


If you find yourself needing to regularly create complicated generator expressions, it may be worth considering whether designing a special purpose iterable/iterator class would make sense. E.g. creating a LazyConcatonatedQuery type that manages lazily processing and joining the queries, which you can reuse in multiple places with less boilerplate.

Also worth keeping in mind that generator expressions aren’t the only way to create custom iterators on demand. You can also build complex iterators by combining itertools functions, the builtins map, filter, etc.

Depending on your actual use case and what sort of lazy evaluation behavior you need, something like one of the following constructs might work as a good alternative:

import itertools
import operator  # operator.call requires Python 3.11+

queries = itertools.chain(
    ["abc"],
    filter(None, map(operator.call, [expensive_func])),
    ["something else"],
)

or

queries = [lambda: "abc", expensive_func, lambda: "something else"]
queries = map(operator.call, queries)
queries = filter(None, queries)

Hi Clint,

Could you please elaborate on your objection to what I considered just a back-of-the-envelope example, one that would be polished if it were actually built?

Yes, I used an example of placing the things wanted done in a list albeit a more general function would accept many more things including similar ones like tuples, and perhaps dictionaries and sets but also generators of many kinds.

If I understood correctly, the main concern of the OP was to avoid what he considered expensive function calls until, and unless, they are needed. Having the name of a function in a list or other construct in a way that does not invoke it seemed to be the point. And, as Brian pointed out, many times you can simply cobble together other tools, as in itertools, that get the job done.

What I am not clear on is what you mean about a list not taking arbitrary numbers of arguments. As far as I know, the specific list I was talking about was what happens if you want to design a tool that takes something list-like and returns one item at a time, including processing some differently and returning results.

A second issue I mentioned on the side was that the OP seemed to want functions with no arguments. I was not sure how useful such functions would be. I wondered if what might be wanted was a way to pass in additional info to the main function as in additional function arguments that would be passed into each and every function call on the list.

An example might be if you had a collection of numbers as in a list or numpy array and, if needed, wanted to produce an assortment of statistics. One design related to what I wrote might look like this, assuming you had a number of functions that all took the same argument. In the example below, I want to describe some data, such as a silly example of odd numbers below 100. I have various descriptive statistics interspersed with text that I want to produce on demand, perhaps by calling the iterator twice for each new item till you give up and leave the rest alone!

items_to_calculate_about = list(range(1, 100, 2))  # Don't ask why
list_of_things = [
    "Statistics:",
    "count:", len,
    "maximum:", max,
    "minimum:", min,
    # ...
]

result = my_generator(list_of_things, items_to_calculate_about)

In my example, there is a single common argument for all the functions called. Now, obviously, it could be expanded in many ways to handle multiple arguments or to take one list argument it presents to the function as multiple arguments and so on. There are lots of variations and I do not want to build any of them. I am just discussing the ideas of the OP and suggesting lots of interesting tools you can build without needing a new kind of bracket.
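For concreteness, here is one way the my_generator mentioned above could be written. It is only a sketch under the stated assumption that every callable in the list accepts the same single argument; the parameter names things and data are illustrative.

```python
from typing import Any, Iterable, Iterator


def my_generator(things: Iterable[Any], data: Any) -> Iterator[Any]:
    """Yield labels unchanged; apply each callable to data only when reached."""
    for thing in things:
        yield thing(data) if callable(thing) else thing


odd_numbers = list(range(1, 100, 2))
report = my_generator(["count:", len, "maximum:", max, "minimum:", min], odd_numbers)

label = next(report)  # "count:"
value = next(report)  # len(odd_numbers), computed only at this point
print(label, value)
```

Anything left unconsumed in report (here the maximum and minimum) is never computed.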

I think you misunderstood the OP. What the OP asks for is a language construct that, like a lambda or a generator expression, does not evaluate any of the comma-separated expressions when the construct is defined, but only when it’s an expression’s turn to be yielded.
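The deferred-evaluation semantics described here can be approximated today by wrapping each expression in a lambda. This is a rough sketch, not the proposed construct itself: the helper name lazily is invented, and the stand-in expensive_func with two arguments exists only to show that the arguments are not evaluated early either.

```python
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def lazily(*thunks: Callable[[], T]) -> Iterator[T]:
    """Yield each thunk's result, calling the thunk only when its turn comes."""
    for thunk in thunks:
        yield thunk()


evaluated = []


def expensive_func(arg1: str, arg2: str) -> str:
    # Stand-in for a slow call; records that it actually ran.
    evaluated.append((arg1, arg2))
    return f"{arg1}-{arg2}"


queries = lazily(
    lambda: "abc",
    lambda: expensive_func("x", "y"),
    lambda: "something else",
)

first = next(queries)   # nothing expensive has happened yet
second = next(queries)  # expensive_func runs here, arguments included
```

The cost is one lambda: per element, which is the boilerplate the proposed syntax would remove.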


Right, I was talking more about ambiguity for a human parser.

By the way, taking the ambiguity of the proposed syntax to the next level:

a<<<b>>>c

Is it:

a << <b> >> c

Or:

a < (< <b> >) > c # a generator that yields a generator that yields b

One possible workaround to disambiguate the syntax is to make the opening and closing markers 2-character tokens instead, e.g. (< and >), such that there is no possible overlap with the existing syntax:

queries = (<"abc", expensive_func(), "something else">)

I agree with @pf_moore that the OP should provide more compelling real-world use cases to show that this pattern is common enough to be worth a syntax change though, since I personally don’t have one.

If the OP can find such a pattern in the existing code base of reputable projects and show us how the code can be made more readable by adopting the new syntax, that would be a good start.

The most widely used iterables (lists, tuples, sets, dictionaries, strings) have alternative constructor syntax while generators do not. When instantiated alternatively, they are not iterators:

>>> a = [1,]
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not an iterator
>>> 
>>> a = (1,)
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object is not an iterator
>>> 
>>> a = {1,}
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object is not an iterator
>>> 
>>> a = {1:2}
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'dict' object is not an iterator
>>> 
>>> a = "1"
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object is not an iterator

We expect this, and we let iter or a loop construct deal with their __iter__ implementation. Whereas a generator type is treated as an iterator:

>>> # a generator
>>> def a():
...     for i in range(1):
...         yield i
... 
>>> type(a)
<class 'function'>
>>> 
>>> b = a()
>>> type(b)
<class 'generator'>
>>> next(b)
0

This is unintuitive because, in the definition linked above, a generator is a function that returns a generator iterator.
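The distinction drawn above between iterables and iterators can also be checked with the abstract base classes in collections.abc. A small sketch, reusing the generator function a from the session above in simplified form:

```python
from collections.abc import Iterable, Iterator


def a():
    yield 0


numbers = [1]
list_is_iterable = isinstance(numbers, Iterable)
list_is_iterator = isinstance(numbers, Iterator)

# iter() is the explicit step that a for-loop performs implicitly.
list_iterator = iter(numbers)
first = next(list_iterator)

gen = a()
gen_is_iterator = isinstance(gen, Iterator)
gen_is_its_own_iterator = iter(gen) is gen  # calling iter() returns the same object
```

A list needs iter() before next() works, whereas the object returned by calling a generator function already is the iterator.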

This observation is shared in support of the idea.

Thank you for explaining Ben. I seem to have assumed a much simpler request and assuming your interpretation is correct, this is a much deeper issue.

I apologize for the length of some of my points as I grapple with interesting ideas.

I have done lots of programming in another interpreted language called R and have noted many differences in the approach that language chose versus Python. In R, for comparison, just about everything is deferred by default. It is a very different mindset that can both be very useful and also produce some rather bizarre results.

As an example, if I write a function in R, the arguments to the function are generally not evaluated before calling the function. But, as soon as the code in the body of the function uses a variable, with exceptions, it gets evaluated quietly. But you can easily write code near the top of the function that can gently evaluate arguments without triggering an evaluation and store such information for later use, including after it has been evaluated. As an example, if I do a regression with a formula, or give commands to generate a graph, the actual text I typed is captured and can be included in the output. Many Python equivalents I have seen will need a workaround where an additional argument in text has to be supplied that may be the same as another argument, so it can thus be used.

But as Python was not built with the same philosophy, adding related functionality now might not be trivial, or perhaps not very wanted.

Other languages sometimes have special functions that mark things in a way that says they should not be interpreted, at least at the time. Things like quote(…) or I(…) may be used. In theory, Python could adopt many methods and I have not studied it but suspect some are already available.

Obviously a trivial, but perhaps dangerous, method is to provide some code as text or some symbolic notation or perhaps embedded in some object with a protocol that every time the object is evaluated, it decrements an internal counter and returns a diminished copy of itself until the counter hits zero and the next reference does something or causes an eval of something.

Clearly, at the language level, one method would be along the lines of what the OP proposed. If a bracket was truly special and not currently part of normal Python code, it could theoretically behave in some new way when the interpreter came along.

Consider something like:

【… 】

or

〘 … 〙

But once you open that door, it may get a bit too interesting.

Part of what misled me was the proposed method called a function with no argument and that is not really a very interesting case and often not something expected to do much processing. Had the example been more like:

queries = <"abc", expensive_func(arg1, arg2), "something else">

Then I might have seen the obvious as that would be far from trivial for other current methods to imitate. Presumably you want everything not evaluated yet including function arguments.

But any changes would need serious consideration to shoehorn into an existing language. What kind of environment will it be evaluated in compared to where it and parts of it have been defined? I have seen some rather convoluted code in R that plays with nested environments so a function makes changes in something like the parental environment or searches for variables in various environments. Do you need to implement closures holding some variables even when you are not sure which?

In my opinion, like any other language, python was built with relatively few ideas in mind and started fairly simple if powerful, and then people keep improving it to the point where few people use some features and are often puzzled if they read code containing it. Obviously the people who decide what to add or change look carefully at potential consequences.

So, are there modules, such as those doing symbolic manipulations, that can help in a situation like the one being asked about?

You’re spot on. My use-case was an iterable with queries to be emptied until one of them was sufficient.

Here is a slightly simplified real-world example, in which fetch_name() is an expensive and slow (rate-limited) function:

queries: Iterator[str] = (lambda: (
    (yield f"{ticker} site:yahoo.com"),
    (yield f"{n} site:yahoo.com" if (n := fetch_name(ticker)) else None),
    (yield f"{symbol} {country_name} site:yahoo.com"),
))()

for query in filter(None, queries):
    # code for fetching query and parsing results
    # return if sufficient results
    ...

But it seems that a complete syntax change is quite a farfetched proposal, and maybe rightly so.