Backquotes for deferred expression

Isn’t that because you have multiple arguments? If you have zero arguments, the value can simply be retrieved by calling it, assuming the function doesn’t have side effects.

That is not all.

Say:

a = lambda: sum([1])

How do you know which sum it is calling?

So either:
a) the namespace needs to be constructed by analysing the code and picking up the needed variables (complex and requires maintenance - see the sketch below)
b) the whole globals() / locals() gets captured (heavy on memory)
c) the namespace is constructed manually
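
As a rough illustration of how involved option (a) gets, here is a minimal sketch that only handles plain global and builtin names; closures, nested code objects and attribute chains would all need extra handling:

import builtins

def capture_namespace(func):
    # Collect only the global (or builtin) names the code object references.
    captured = {}
    for name in func.__code__.co_names:
        if name in func.__globals__:
            captured[name] = func.__globals__[name]
        elif hasattr(builtins, name):
            captured[name] = getattr(builtins, name)
    return captured

a = lambda: sum([1])
print(capture_namespace(a))    # {'sum': <built-in function sum>}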

Some libraries do that, but I doubt CPython will go this way any time soon.

Nor do I think it would be a very good idea.
It doesn’t completely make sense conceptually.

E.g. imagine this was done (say, a lambda could be pickled by capturing its namespace at dump time):

import pickle

var = 1
a = lambda: var

# hypothetical: the namespace is captured when the lambda is pickled
b = pickle.loads(pickle.dumps(a))
var = 2
print(a())    # 2 - late binding, sees the new value
print(b())    # 1 - frozen copy from the captured namespace

So it does not preserve the intended behaviour.
Thus, an ad-hoc lambda/def is simply an incompatible concept.
And IMO, things like this are best left alone.

Thus, I use partial for that, which obviously needs a serialisable callable (the same issue arises if the callable is an ad-hoc lambda), but that makes sense.
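
For example, a minimal sketch of what I mean; any importable callable works, while the lambda version fails precisely because of the namespace problem above:

import pickle
from functools import partial

deferred = partial(sum, [1, 2, 3])               # serialisable: sum is referenced by name
restored = pickle.loads(pickle.dumps(deferred))
print(restored())                                # 6

# pickle.dumps(lambda: sum([1, 2, 3]))           # would fail: lambdas cannot be pickled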

A question.

Do you inevitably intend to wrap lambda/def, or are you thinking of optimizations even beyond that?

In the latest version of the demo, you can already make a DeferExpr serializable. This is because DeferExprExposed.callable now accepts any Python object. Feel free to give it a partial, or any other callable object.

The reason I haven’t supported DeferExpr(fn, *args, **kw) is simply that I would need to import functools.partial from within the CPython core. I don’t know if it’s safe to assume that the stdlib functools will always be available, even for some weird builds.

You can even try to trick it by giving it a non-callable object:
Python 3.14.0a1+ (heads/feat/defer-expr-dirty:3a731f23e1, Nov 15 2024, 00:23:55) [Clang 16.0.0 (clang-1600.0.26.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x => a + b
>>> expose(x).callable = None
>>> x
Traceback (most recent call last):
  File "<python-input-2>", line 1, in <module>
    x
RuntimeError: Failed to observe DeferExpr: NoneType is not callable

To answer your question: No. I gave up the possibility of minor optimizations in favor of flexibility and ease of use :grinning:

1 Like

This is not what I meant - I don’t think I expressed it well. You would not need to import partial. If you gave it func, args, kwds, then on observation it would simply do func(*args, **kwds) instead of simply func().
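
In other words, a minimal pure-Python sketch of what I have in mind (the real thing would live in C; the names here are illustrative only):

class DeferExprSketch:
    def __init__(self, func, args=(), kwds=None):
        self.func = func
        self.args = args
        self.kwds = kwds if kwds is not None else {}

    def observe(self):
        # no functools.partial needed: just call func with the stored arguments
        return self.func(*self.args, **self.kwds)

print(DeferExprSketch(sum, ([1, 2, 3],)).observe())    # 6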

But these are details that can be addressed later. I am still not sure whether args/kwds are necessarily needed; that depends on a lot of other stuff I am not yet sure about.

The API of expose seems a bit too complex. Also, it might be better to leave DeferExpr immutable. Mutability would add performance in some edge cases, but it would make things much harder going forward.

Various functional requirements can be addressed by instantiating a new object as opposed to modifying an existing one.

My plan was to check the number of arguments passed to DeferExpr.tp_new(). For a single positional argument (i.e. nargs == 1 and kwargs is empty), we directly store it as “callable”. Otherwise, we internally construct a functools.partial object from the supplied args and kwargs and store the resulting object as “callable”.
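
A rough Python equivalent of that plan (purely illustrative; the actual dispatch would live in DeferExpr.tp_new()):

from functools import partial

def make_callable(*args, **kwargs):
    if len(args) == 1 and not kwargs:
        # single positional argument: store it directly as the callable
        return args[0]
    # otherwise bind the extra arguments up front
    return partial(*args, **kwargs)

print(make_callable(len)("abc"))          # 3  (len stored directly)
print(make_callable(sum, [1, 2, 3])())    # 6  (wrapped in a functools.partial)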

Agreed. For now I’ll keep the demo as is, but I will describe it as immutable in the final proposal.

1 Like

Storing a partial would be inefficient, as observing would be unnecessarily slow. It would result in an extra function call, and partial also has more internal complexity than is needed here.

If the user wants partial, they can just supply it manually.
Also, your point about functools becoming a dependency of this is very valid. If it can be independent, it definitely should be.

As I said, I am still not sure if allowing args/kwds is necessary, but if it is, simplified logic can be borrowed from partial.tp_call for use in observation.

Actually this is doable and will not introduce too much overhead. We can just short-circuit the logic and allow them to be NULL when they are both empty.
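
In pure-Python terms, the short circuit would look roughly like this (illustrative only; in the C implementation args/kwds would simply be NULL when empty):

def observe(func, args=None, kwds=None):
    if args is None and kwds is None:
        return func()                            # fast path: plain zero-argument call
    return func(*(args or ()), **(kwds or {}))   # general path with bound arguments

print(observe(list))                 # []
print(observe(sum, ([1, 2, 3],)))    # 6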

I can modify the code really quickly and make it work this way. You can expect a commit for this in the next few days.

1 Like

From a performance perspective this is pretty much free.

See partial.tp_call for some ideas - it has fast paths for various cases, such as a no-arg call, a 1-arg call, etc., at least for vectorcall.

1 Like

Leaving syntactic conveniences aside, my only doubt now is the signature of DeferExpr (apart from its name :slight_smile: ).

So ideally it should have functionality to cache the observed value and not re-evaluate, so it either needs args/kwds to be simple arguments as opposed to varargs/varkwds, or a separate class for the cached version.

Option 1.

class DeferExpr:
    def __init__(self, func, args=NULL, kwds=NULL, cached=False): ...

Option 2.

class DeferExpr:
    def __init__(self, func, *args, **kwds): ...

class CachedDeferExpr(DeferExpr):
    def __init__(self, func, *args, **kwds): ...
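
For reference, a minimal pure-Python sketch of the cached variant’s semantics under Option 2 (names illustrative only):

_MISSING = object()

class CachedDeferExprSketch:
    def __init__(self, func, *args, **kwds):
        self._func, self._args, self._kwds = func, args, kwds
        self._value = _MISSING

    def observe(self):
        # evaluate once, then keep returning the stored result
        if self._value is _MISSING:
            self._value = self._func(*self._args, **self._kwds)
        return self._value

calls = []
c = CachedDeferExprSketch(lambda: calls.append(1) or 42)
print(c.observe(), c.observe(), len(calls))    # 42 42 1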

There is one more piece of functionality that might be required in some rarer cases - replacing the value in the namespace.
But I don’t think this needs to be overcomplicated, as it should not be a recommended practice in the first place, and if someone really needs it, it can be done with a simple wrapper:

import importlib

def lazy_import(name, namespace=None):
    def import_func():
        mod = importlib.import_module(name)
        if namespace is not None:
            # replace the lazy proxy with the real module in the given namespace
            namespace[name] = mod
        return mod
    return CachedDeferExpr(import_func)

Or a more generic version of the wrapper above:

from importlib import import_module

def lazy_namespace_replace(func, namespace):
    def outer(name, *args, **kwds):
        def inner():
            # store the real value back into the namespace, replacing the proxy
            namespace[name] = value = func(name, *args, **kwds)
            return value
        return CachedDeferExpr(inner)
    return outer

import_lazy = lazy_namespace_replace(import_module, globals())
numpy = import_lazy('numpy')

Having said that, I would probably just use CachedDeferExpr, as the overhead of acting on a cached proxy object instead of the actual object should be pretty small.

numpy = CachedDeferExpr(import_module, 'numpy')

For now, I do not have an answer for how to simplify the syntax for declaring a collapsible DeferExpr vs. a non-collapsible one.

The current demo is functionally compatible with both. Collapsible will be set to False by default. It requires one additional line of code to configure its behavior.

x => 1 # Or x = DeferExpr(lambda: 1)
expose(x).collapsible = True

I admit this does not look good enough.

(BTW cached indeed looks better, I will rename it in the demo)

Would it help us sell the proposal if the following syntax were supported? I’m not sure if it’s feasible.

numpy => import numpy

or

defer import numpy

It “sort of” tackles the lazy import syntax in an unexpected way. AFAIK this is also a highly demanded feature.

I think this can be done in 2 parts:

  1. Design of DeferExpr
  2. Syntactic convenience for it

This is because syntax changes are difficult. Once you make an actual proposal, there will be a lot of people unhappy with the syntax. Many will not like =>, others will have their own “better” ideas, such as lazy or defer keywords, etc. In short, it will be a substantial amount of pain to push through, which is why I would suggest splitting it into parts.

Also, by the time DeferExpr is worked out and implemented, there will be new information available that might help clarify syntax decisions.

Also, DeferExpr can be used without the syntactic convenience, so it can be designed as DeferExpr(func, *args, **kwds); if the syntactic convenience does not make use of some functionality, it is not a big deal - the user can just use DeferExpr explicitly when needed.

Me neither. One thing that I have observed is that the => syntax would most likely encounter quite a lot of resistance. lazy or defer soft keywords seem to be preferred by the community (given the sentiment of past discussions). Personally, given the information I currently have, I would vote for lazy - a soft keyword would go over better than an operator syntax, and it is shorter and catchier than defer.

Also, a soft keyword has more opportunities for customization, e.g.:

a = lazy: 1 + 1
b = lazy c: 1 + 1    # cached flag
c = lazy cd: 1 + 1   # one more flag

truetype(a)    # DeferExpr
truetype(b)    # CachedDeferExpr

One could think of the flags in the same way as flags in re.
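
For comparison, re combines orthogonal behaviours as flags on a single call:

import re

pattern = re.compile(r"^value:", re.IGNORECASE | re.MULTILINE)
print(bool(pattern.search("line one\nValue: 42")))    # True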

Also, back-dooring would ideally be done with at least some consideration of a more general protocol.

E.g. maybe in the future someone wants a different type of proxy object, which also perfectly emulates some underlying value but has a different purpose.

So it would be good if the same backdoor mechanism were easy to adapt if needed.

My initial thought is that it could be simplified to 2 functions: truetype and proxyattrs.

a = DeferExpr(lambda: 1)
type(a)      # int
truetype(a)  # DeferExpr
# If `isinstance` is needed for subclasses of DeferExpr:
if issubclass(truetype(a), DeferExpr):
    ...

def proxyattrs(obj: ProxyObject) -> tuple:
    pass

attrs = proxyattrs(DeferExpr(lambda: 1))
print(attrs)
# (func, args, kwds)

Mutability should not be encouraged, but if it is needed very badly, it can be done by modifying the objects returned from proxyattrs.

In this case kwds, being a dictionary, will be mutable (the same as in the case of partial).

I am afraid that this looks too similar to a lambda function’s argument list; it might cause some confusion.

Overall I agree that it might be better to drop new syntax from this proposal. A soft keyword would be a better alternative if we have to provide a syntax along with it.

2 Likes

DeferExpr is pretty much the same concept as lazyobject in lazyasd, except it would fill gaps that are impossible to fill without messing with Python internals. Also, having a C implementation would be appropriately performant - pure-Python nested access is a bit expensive.

And I have been using lazyasd for lazy imports for quite a while now - to be more precise, my own object that I made after investigating lazyasd.

Maybe you could borrow the syntax for type aliases (they’re also lazy), and use a separate keyword for a cached deferred expression?

lazy   a = 1 + 1
cached a = 1 + 1 # might not be very clear
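
For reference, the type statement introduced in Python 3.12 (PEP 695) already evaluates its right-hand side lazily:

type Vector = list[float]    # right-hand side is not evaluated here
print(Vector.__value__)      # list[float] - evaluated only when accessed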
1 Like

One problem I found with the => operator is that you cannot make an in-place declaration in a function call:

x => a + b
add_one(x)
# The following will never be accepted:
add_one(=> a + b)

If we have to find a syntax for it, we need to make it an expression, not a statement.

P.S. Another arguable benefit of making it an expression is that we get out-of-the-box support for type annotations.

I guess we could find it a home inside the inspect module, e.g. inspect.reveal()?