Deferred Evaluation Initial Proof of Concept

Hi all,

Maybe some of you remember that I became quite interested in this idea some time ago. Maybe not.

So I took some time to think about it and did a bit of prototyping. Then I hesitated to put this out, but I am battling with lazy imports again, so I thought maybe it is time.

I still hold the opinion that, if done right, this has the potential to have a significant and positive impact.

I see it as a feature that would provide a big benefit in one area and smaller benefits in fairly many others.

Regards,
dgpb

1 Like

Just to be clear… if I understand you correctly, despite starting with a link to an article that opens with a discussion of late-bound argument defaults, this has nothing to do with that?

If you want to give meaning to backticks, you’re going to have an uphill battle. They previously had meaning, and a completely different meaning from this; and they were removed for a reason.

Namespaces and name references are crucial here. You mention two options, but there are actually three:

  1. Use name lookups at the point of instantiation, but take values at the time of evaluation
  2. Capture values at the time of instantiation and retain them
  3. Reference names at the point of evaluation.

What is your intention? Be sure to test all of your examples such that all three of these behave differently. Particularly, since you claim that implementation is not the main issue, show how you can achieve your preferred semantics using the string-based implementation you’re demonstrating.
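
For what it’s worth, here is one way to picture how those three could diverge, using plain lambdas and eval as stand-ins for the deferred object (this is only an illustration of the semantics, not the proposed mechanism):

def make_deferreds():
    x = 1
    opt1 = lambda: x + 1      # option 1: name bound in the creating scope, value read at evaluation time
    opt2 = lambda v=x: v + 1  # option 2: value snapshotted at instantiation
    opt3 = "x + 1"            # option 3: name resolved in whatever scope evaluates it
    x = 10
    return opt1, opt2, opt3

def evaluate_elsewhere(opt1, opt2, opt3):
    x = 100
    print(opt1())      # 11  -- the creator's x, at its current value
    print(opt2())      # 2   -- the snapshot taken at instantiation
    print(eval(opt3))  # 101 -- the evaluator's x

evaluate_elsewhere(*make_deferreds())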

“Why not just use lambdas?” needs a LOT more explanation. I have no idea what you mean by it being “difficult to determine which lambda is lazy and which is not”. All lambda functions have the same semantics.

1 Like

I’m not sure if there’s a specific proposal in that document. I struggled to find it if there is one - there’s a lot of discussion of multiple options, but I didn’t see one specific option being proposed. Also, the examples were almost unreadable to me, though I can’t tell whether that’s just because your L('string') prototype syntax obscured the point so much that I couldn’t imagine what the real proposal would look like.

As this stands, it feels like you’re just going to trigger a new round of the same discussion we had previously, with nothing particularly new added. If your proof of concept feels production-ready to you, I’d recommend getting it published on PyPI, with full standalone documentation, such that people can actually experiment with it. But if you think you need syntax to make the proposal usable, you need to specify the language change properly - again so that people can reason through examples without hitting unspecified or unclear behaviour.

Personally, I don’t have the appetite for another round of speculation or “design by mailing list”. I’d be willing to try out an actual implementation (either for real if it’s available on PyPI, or on paper if it’s a spec for a syntax change), but what you’ve posted isn’t enough for me to do that.

2 Likes

No, this has nothing to do with late-bound defaults. It is different in concept and would not be able to do what late-bound defaults are proposing.

In that discussion, there was a bit of a battle between those two. But by now I am pretty confident that these are best left orthogonal.

Namespaces & references.
I am sure there are even more options than that, just as there are in all of the other considerations. But I concentrated on those that seemed sensible to me.

Regarding your option 1: yes, it is possible, but to me this resembles a foot rocket launcher.

I am not diminishing the implementation part. It would be a challenge, and a fairly big one, I am sure. But from what I have gathered, the discussion prior to it could be just a bit more challenging still.

Lambdas: it is about automatically detecting whether an argument is lazy or not.
E.g.

def f(arg):
    if arg is lazy:  # pseudocode: "lazy" stands for whatever would mark a deferred argument
        return arg.__compute__()
    else:
        return arg

What do you do if the argument is actually a lambda? So there needs to be a way to have a lazy lambda. Essentially, a specialised lazy object is needed, so that everything else is not lazy.
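
A rough sketch of what I mean (the Lazy class below is purely illustrative, not the prototype): with a dedicated lazy type, detection becomes a plain isinstance check, and a lambda passed as an ordinary value is never evaluated by mistake.

class Lazy:
    # Illustrative wrapper marking a value as deferred.
    def __init__(self, func):
        self._func = func
    def __compute__(self):
        return self._func()

def f(arg):
    if isinstance(arg, Lazy):
        return arg.__compute__()
    return arg

print(f(42))                    # 42 -- plain values pass through untouched
print(f(lambda: 42))            # a lambda is just an ordinary value here; it is not called
print(f(Lazy(lambda: 40 + 2)))  # 42 -- only the explicit Lazy wrapper is computed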

That is the thing, there is no specific proposal. For now I have just narrowed things down and implemented a couple of simple variations.

For it to be useful for the things I need these features for most, it needs to be implemented in a low-level language. That is why I don’t want to invest too much time into the Python version.

But I hear you, I will publish something usable in due time if this goes the right way.

I think if there were anything resembling consensus on these options, then I could narrow things down to one path and then publish something on PyPI.

I am of the belief that “implicit evaluation” will never work because it can’t be specified well enough. You didn’t even try to do this, so you made no progress on that front. But if there is no implicit evaluation, then all this proposal amounts to is a slightly different syntax for lambda [1], as I proposed before, and then the question is what you really gain from this.

I think waiting for consensus is useless. There will never be one, especially not without people being able to try it out.


  1. minus name binding changes ↩︎

2 Likes

Every example in that text is done twice: once with the implicit-evaluation version and once with the implicit-propagation version. Both are functional.

The question now is: “Given that all variations are properly achievable, would it be nice to have and worth the effort?” If yes, then: “What sort of version would you like to have?”

I don’t want to continue this discussion. We appear to have such different definitions of “specification”, “feasible”/“achievable”, and “implicit”/“explicit” that I think any discussion would be pointless.

2 Likes

Closures are foot rocket launchers??

Simple handgun then.

I like everything containerised, with well-defined input/output structures.

Let me re-state: this option doesn’t align with the idea that I had in mind (picked up on), which is to fully pre-define execution and evaluate later. If there were a need to modify the payload, I would say it should be strictly explicit.

Please demonstrate this behaviour in your examples.

In [32]: a = 1
In [33]: la1 = ace.L('a + 1')
In [34]: a += 2
In [35]: la1
Out[35]: 2

However, you are making a good point. It does work this way for, say, an int. However, a mutable value can still be modified after the lazy definition:

In [2]: a = dict(a=1)
In [3]: la1 = ace.L('a | dict(b=2)')
In [4]: a['c'] = 3
In [5]: la1
Out[5]: {'a': 1, 'c': 3, 'b': 2}

I guess it is inevitable without a deep copy, as everywhere else, but worth noting.
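
To illustrate (a minimal stand-in only, not how ace.L is actually implemented): capturing the value at definition time still captures a reference, so mutations to a mutable object remain visible unless a deep copy is taken.

import copy

class LazyMerge:
    # Minimal stand-in for a capture-at-definition lazy dict merge.
    def __init__(self, mapping, extra, deep=False):
        # Keeping the reference lets later mutations show through;
        # a deep copy freezes the value at definition time, at a cost.
        self._mapping = copy.deepcopy(mapping) if deep else mapping
        self._extra = extra
    def compute(self):
        return self._mapping | self._extra

a = dict(a=1)
shallow = LazyMerge(a, dict(b=2))
frozen = LazyMerge(a, dict(b=2), deep=True)
a['c'] = 3
print(shallow.compute())  # {'a': 1, 'c': 3, 'b': 2} -- the mutation leaks through
print(frozen.compute())   # {'a': 1, 'b': 2} -- the deep copy froze it at definition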

You’re not making your proposal look good by showing yourself to be unaware of the fundamental semantics of Python… these are the basics of mutable objects. So, what are your intended semantics? Are you attempting to capture names, values, or something else?

1 Like

I like the idea of Path 1, which I believe is actually quite feasible.

If I understand it correctly, it can be implemented as:

  1. Create a new built-in type deferred as a subclass of str.
  2. Any expression in backticks is stored in the code object as a const deferred object.
  3. The bytecode LOAD_CONST would load the const deferred object onto the stack.
  4. All the other LOAD_* bytecodes, such as LOAD_FAST, LOAD_DEREF and LOAD_GLOBAL, etc., would have to perform an additional type check of the value loaded, so that if it’s a deferred object, the interpreter will evaluate the string as if eval is called with the string in the current scope, and load the value returned by eval onto the stack instead.

With this implementation, a deferred object can be transparently passed through any existing code base with no modifications needed.
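
A very rough pure-Python approximation of those semantics (the real feature needs interpreter support, so the explicit evaluate() helper below merely stands in for the type check the LOAD_* bytecodes would perform; both names are illustrative only):

import sys

class Deferred(str):
    # A str subclass marking an expression to be evaluated later.
    pass

def evaluate(value):
    # Evaluate a Deferred in the caller's scope; pass anything else through.
    if isinstance(value, Deferred):
        frame = sys._getframe(1)
        return eval(str(value), frame.f_globals, frame.f_locals)
    return value

a = 1
expr = Deferred("a + 1")
a += 2
print(evaluate(expr))  # 4 -- the name is resolved at evaluation time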

Note that it isn’t feasible to implement any method such as __compute__ or __graph__ for the deferred object, since the whole idea is to evaluate a deferred object when it appears anywhere outside of backticks, so there will be no deferred object left by the time the . operation is performed.

Also note that we can adopt a new string prefix such as d and call it a d-string, e.g. d"a + 1", d'a + 1', d"""a + 1""", etc., if people don’t like the idea of reusing Python 2-era backticks.

FWIW, the related discussion:

Thus, I propose one of the following as the new use for the backtick (`):

You’re missing one of the main reasons for removing the backtick
syntax in the first place: the character itself causes trouble by
looking too much like a regular quote (depending on your font), is
routinely mangled by typesetting software (as every Python book author
can testify), and requires a four-finger chord on Swiss keyboards. No
new uses for it will be accepted in Python 3000 no matter how good the
idea.

Ironically enough, I found this link via PEP 3099, which is supposedly about things not changing in 3.x. I guess the decision to remove backticks was made well before PEP 3099 was written, and the relevant section decided to focus on rejecting the idea of reintroducing them…

So OK, no backticks. Would d-strings as proposed by @blhsing be reasonable? Or would a keyword be better?

For now I am intending to find out if this is wanted. And if yes, in what form.

On second thought, this approach is actually rather pointless since a caller can’t bind values to names inside the expression of a deferred object, so it’s only good for a late-bound argument, for which @Rosuav already has a proposal with what I think is a cleaner syntax that was apparently deemed of too little benefit to be worth a change.

If you want a deferred object that can bind values from the caller and be evaluated on demand, that’s what lambda with a closure is for.
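
For completeness, the existing idiom being referred to, in its smallest form:

def make_deferred(base):
    offset = 10
    return lambda: base + offset  # closes over base and offset from this scope

deferred = make_deferred(1)
# ... arbitrary other work can happen here ...
print(deferred())  # 11 -- evaluated only when it is finally called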

So I don’t think the idea proposed by the OP brings anything meaningfully new after all.

This has some problems though, relating to closures. Consider:

x = "global x"

# Simple closures work correctly.
def outer_direct(x):
	def inner_direct():
		print(x)
	x += 2
	return inner_direct

# But eval does not.
def outer_eval(x):
	def inner_eval():
		eval("print(x)")
	x += 2
	return inner_eval

outer_direct(5)()  # prints 7 -- the closure sees the updated x
outer_eval(5)()    # prints "global x" -- the lookup falls back to the module-level x

When inner_direct is compiled, the fact that x is used from an outer scope changes how outer_direct is compiled. That doesn’t happen with *_eval, so by the time inner_eval() gets called, the value of x has been thrown away.
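
One way to see that compilation difference concretely, continuing the example above:

print(outer_direct(5).__closure__)  # a one-cell tuple -- x is kept alive for inner_direct
print(outer_eval(5).__closure__)    # None -- the compiler saw no use of x, so nothing is kept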

So how would this hypothetical deferred object behave? Would the mere presence of a single deferred object anywhere in a program force all functions everywhere to be compiled as if they’re closures referencing every variable in them? Because that’s going to be an insane cost.

But in my opinion, the performance question isn’t the biggest killer of this. It’s the possibility of spooky action at a distance, based entirely on the name of a variable. That doesn’t sound like anything I would want to have to deal with.

Yes, that’s what my second thought was alluding to, that a late evaluation of a string-based expression is useful only as a late-bound argument, since the names of the parameters are known to the caller.