How can async support dispatch between sync and async variants of the same code?

Problem

A common scenario for library authors is that they accept some callable as a callback for user-defined logic.

If the library author wants to add support for async methods, some high-level changes are usually needed, but there’s a problem which ends up percolating down through all sorts of utility functions.

Rather than toy examples, I’ll use some of the code I’ve been working on in a branch of the webargs library:

    def _load_location_data(self, *, schema, req, location):
        loader_func = self._get_loader(location)
        return loader_func(req, schema)

    async def _async_load_location_data(self, *, schema, req, location):
        loader_func = self._get_loader(location)
        if asyncio.iscoroutinefunction(loader_func):
            data = await loader_func(req, schema)
        else:
            data = loader_func(req, schema)
        return data

Both of these functions are “the same”, but we need them both. This isn’t so bad for a single function, but stack up a few distinct hooks and methods, and you end up effectively doubling the size of a lot of the plumbing in the project to allow a completely async call path alongside the sync one.

Existing solution for decorators

The interface provided to users sometimes needs to show the difference between the two versions of the same code, e.g. Parser.parse is sync, Parser.async_parse is async. That happens anywhere that the library exposes a bare function call which must become async-capable.
But we can hide it a lot of the time using a decorator and a quick check:

def decorator(func):
    if asyncio.iscoroutinefunction(func):

        @functools.wraps(func)
        async def wrapper(*args, **kwargs): ...

    else:

        @functools.wraps(func)
        def wrapper(*args, **kwargs): ...

    return wrapper
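For reference, a filled-in version of that pattern might look like the sketch below. The decorator name `log_calls` and its `calls` attribute are hypothetical, chosen only to give the wrappers something to do, and `inspect.iscoroutinefunction` is used as the standard-library equivalent of the check above.

```python
import asyncio
import functools
import inspect


def log_calls(func):
    """Hypothetical decorator: record each call, keeping func sync or async."""
    calls = []

    if inspect.iscoroutinefunction(func):

        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            calls.append(func.__name__)
            return await func(*args, **kwargs)

    else:

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            calls.append(func.__name__)
            return func(*args, **kwargs)

    # Expose the call log so callers can inspect it.
    wrapper.calls = calls
    return wrapper


@log_calls
def sync_view(x):
    return x + 1


@log_calls
async def async_view(x):
    await asyncio.sleep(0)
    return x + 1
```

The decorated function stays a coroutine function exactly when the original was one, so a framework that inspects the view sees the same kind of callable it was given.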

This works great for cases like

import flask
from webargs.flaskparser import parser

app = flask.Flask(__name__)

@app.route("/foo")
@parser.use_args(...)
async def foo(...): ...

I am therefore not super interested in trying to find a better way of presenting an interface for users to call sync or async variants of library code. Decorators solve this pretty well where we can use them. And it’s okay to have to support foo() and async_foo() as different entry points into library code where necessary. The problem is that it’s not just a matter of having foo() and async_foo() at the top level, but a “shadow copy” of your code inside the library to keep the sync and async paths separate.

Past discussions

This issue has been discussed before, in particular

both seem relevant.

However, I don’t see anyone asking for what – as a library author – seems like the best solution:
Is there a way in which the language could be changed such that building the async and non-async variants of the same function could be automated or simplified?

If there’s another past thread I should read, please let me know.

Ideal solution

Today I have this:

class Parser:
    def _load_location_data(self, *, schema, req, location):
        loader_func = self._get_loader(location)
        return loader_func(req, schema)

    async def _async_load_location_data(self, *, schema, req, location):
        loader_func = self._get_loader(location)
        if asyncio.iscoroutinefunction(loader_func):
            data = await loader_func(req, schema)
        else:
            data = loader_func(req, schema)
        return data

    async def _async_other_helper_func(self, ...):
        return await self._async_load_location_data(...)

    def _other_helper_func(self, ...):
        return self._load_location_data(...)

    def public_func(self, ...):
        return self._other_helper_func(...)

    async def async_public_func(self, ...):
        return await self._async_other_helper_func(...)

and what I want to write instead is this:

class Parser:
    maybe_async def _load_location_data(self, *, schema, req, location):
        loader_func = self._get_loader(location)
        if (
            asyncio.iscoroutinefunction(loader_func) and
            MAGIC_is_currently_async
        ):
            data = await loader_func(req, schema)
        else:
            data = loader_func(req, schema)
        return data

    maybe_async def _other_helper_func(self, ...):
        # other magic -- strip the await in synchronous calls
        return await self._load_location_data(...)

    def public_func(self, ...):
        return self._other_helper_func(...)

    async def async_public_func(self, ...):
        return await self._other_helper_func.call_async(...)

    # why limit 'maybe_async' to internal methods?
    # if it's part of the language, we also get to avoid the split in public
    maybe_async def alternative_public_func(self, ...): ...

I’m aware that some of this could be done with code generation. However, maintaining maybe_async codegen would be quite difficult for any individual library maintainer. Certainly harder than finding ways of sharing code between my own internal sync and async variants of the same set of functions.

Conclusion and final question

Is there a solution which can be written to do the above (obviously with less syntactic sugar) in the language today? Or would this require language changes as I think it would?

The goal is to improve library maintenance, so adding runtime dependencies on other PyPI packages, or adopting very complex solutions, doesn’t really solve it.

Are there known techniques for sharing code between the two paths which make this problem less severe? Perhaps some clever method of passing around and chaining calls on objects which may be awaitable?

Not to imply any opinion on the proposal as I’m not well-informed on the topic, but you might want to at least consider moving this to the async category, which might reach more experienced async-using devs, before proposing here. But that’s your call, ultimately.

Minor sidenote, but it looks like both of your branches appear to be identical. Did you mean to omit the async keyword in one, or make some other change?

I wasn’t really sure. async-sig seems very quiet, relative to the higher-traffic Ideas forum. Maybe that’s a positive reason to use async-sig? I’m happy to move this, if that’s possible on discourse.

Exactly that, thank you for the catch. I’ve adjusted the example to drop async in one branch.

It’s up to you; if so, you can use the Edit (pencil) button next to your post title, and then change the category in the dropdown to the left. It should be possible for regular users on their own posts, but I can do it for you if you’d like, just in case it’s not.

I think the reason this hasn’t been solved in its full generality is that there’s no perfect solution. Adding another keyword to the language (maybe_async) just isn’t in the cards.

Library and framework authors are usually best off having an opinionated convention aided by a decorator or metaclass fitted to the needs of the library or framework. (I believe I’ve seen a metaclass that looked for methods named async_spam and added a synchronous version named spam for each such.)
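As an illustration only, a metaclass along those lines could be sketched as follows. This is not the metaclass mentioned above; the names `AddSyncMethods` and `Client` are hypothetical, and the sketch assumes that blocking on `asyncio.run` is acceptable for the library in question.

```python
import asyncio
import inspect


class AddSyncMethods(type):
    """Hypothetical metaclass: for each coroutine method ``async_<name>``,
    add a blocking ``<name>`` unless the class body already defines one."""

    def __new__(mcls, name, bases, namespace):
        for attr, value in list(namespace.items()):
            if attr.startswith("async_") and inspect.iscoroutinefunction(value):
                sync_name = attr[len("async_"):]
                if sync_name not in namespace:
                    namespace[sync_name] = mcls._make_sync(value, sync_name)
        return super().__new__(mcls, name, bases, namespace)

    @staticmethod
    def _make_sync(async_method, sync_name):
        def sync_method(self, *args, **kwargs):
            # asyncio.run raises RuntimeError if an event loop is already
            # running in this thread, so this is only safe from sync code.
            return asyncio.run(async_method(self, *args, **kwargs))

        sync_method.__name__ = sync_name
        return sync_method


class Client(metaclass=AddSyncMethods):
    async def async_spam(self, n):
        await asyncio.sleep(0)
        return n * 2
```

With this in place, `Client().spam(3)` blocks until `async_spam` finishes, while async callers keep using `await Client().async_spam(3)`.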

I’m not surprised; if it were as “easy” as adding a keyword, it would probably have been part of the original design of async. But if people are finding their own ways of achieving this today, is there any possibility of getting one of those conventions + helpers into the stdlib?

The following is as good a solution as any other if it works:

async def async_spam(): ...
spam = create_sync_variant(async_spam)

But I don’t know of a way to do that. Generating a sync variant from an async function is a good step, but it’s only part of the problem. If we have

async def async_spam():
    return await async_eggs()

and create_sync_variant renames and strips the await, we’d get

# the generated function from create_sync_variant(async_spam)
def spam():
    return async_eggs()  # <-- but we wanted eggs() !
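One partial workaround, sketched here under the assumption that the sync path only ever calls plain (non-async) hooks, is to keep a single `async def` implementation and drive it by hand instead of stripping the `await`. The helper names `run_without_loop` and `call_hook` are hypothetical.

```python
import inspect


def run_without_loop(coro):
    """Drive a coroutine to completion without an event loop.

    This only works if the coroutine never actually suspends, i.e.
    everything it awaits finishes immediately.
    """
    try:
        coro.send(None)
    except StopIteration as stop:
        return stop.value
    # The coroutine yielded, so it is waiting on something that needs a loop.
    coro.close()
    raise RuntimeError("coroutine suspended; it needs a real event loop")


async def call_hook(hook, *args):
    """Call a user hook that may be sync or async."""
    result = hook(*args)
    if inspect.isawaitable(result):
        result = await result
    return result


async def async_spam(hook):
    # Single implementation shared by both entry points.
    return await call_hook(hook, "eggs")


def spam(hook):
    # Sync entry point: valid only when `hook` does not need an event loop.
    return run_without_loop(async_spam(hook))
```

The trade-off is that `spam` fails at runtime with a `RuntimeError` if a hook genuinely suspends, so this removes the duplicated plumbing without making async hooks usable from sync callers.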

I would be tremendously thankful for links to any existing tooling which does this, just for the purpose of learning. I’ve already looked at asgiref a bit, but it seems to mostly wrap calls in background threads.

Maybe there are not that many people who would benefit from the addition of create_sync_variant? It seems that a lot of people struggle when joining together async and sync code, but perhaps not in this particular way.


On the other topic, I’m not able to move this to async-sig. I get “You are not permitted to view the requested resource.” Perhaps I’m not allowed to post there?

I moved this thread to Async-SIG.

On the main problem, I feel like I have to repeat myself – you’re better off inventing your own solution that works right for the framework.

One trick I’ve seen is a decorator that takes an async function, and adds a function attribute (e.g. named ‘sync’) that is a wrapper that calls the async version and waits for the result. E.g.

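import asyncio

def add_sync_version(func):
  # Expect an `async def` function, not a coroutine object.
  assert asyncio.iscoroutinefunction(func)
  def wrapper(*args, **kwds):
    # Run the coroutine to completion on a new event loop.
    return asyncio.run(func(*args, **kwds))
  func.sync = wrapper
  return func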

I haven’t tested this version and there are dangers associated with creating a new event loop for this purpose, but you get the idea.

I should probably add how this is used.

For the library developer, you just add @add_sync_version to those (public) async functions and methods for which you want to add a sync version. E.g.

@add_sync_version
async def spam(): ...

For the user of the library, if they want the async version they can just write

    x = await spam()

If they want the sync version they can write

    x = spam.sync()

Thanks for moving this to async-sig!

I don’t want to give the impression that I’m not listening, and I apologize if I said anything to suggest that.
I’m trying to understand what the best way of handling this scenario is. And I’d like to codify that – perhaps in asyncio docs or somewhere else appropriate – so that anyone else trying to do similar things has that same best practice available.

If the solution were as simple as a 6 line function, there would be no reason not to add it to asyncio. So those hidden dangers are actually the hard part. Am I at least following the situation correctly up to this point?

If nothing else, we have to be concerned about the caller already having a running loop. asgiref’s AsyncToSync uses a background thread to run a separate loop and clocks in at around 200 LOC. Has that team found a safe and reliable workaround for most cases? The underlying question is: is it impossible to hope for AsyncToSync to make it into asyncio?
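To make the running-loop concern concrete, here is a minimal sketch of the background-thread idea. It is not asgiref’s implementation and ignores much of what a production version has to handle (context variables, thread-local state, cancellation, thread reuse); `run_in_background_loop`, `fetch`, and `sync_fetch` are hypothetical names.

```python
import asyncio
import concurrent.futures


def run_in_background_loop(coro):
    """Run `coro` on a fresh event loop in a worker thread and block
    the calling thread until the result is available."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def fetch():
    await asyncio.sleep(0)
    return "data"


def sync_fetch():
    return run_in_background_loop(fetch())


async def caller_with_running_loop():
    # Calling asyncio.run(fetch()) here would raise RuntimeError because a
    # loop is already running in this thread. The worker thread sidesteps
    # that, at the cost of blocking this loop until the result arrives.
    return sync_fetch()
```

Note that this approach depends on threads being available, which is exactly the constraint that rules it out in some deployments.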

Yeah, you’re following; the reality is messy, and that’s why we don’t want to put a solution in the stdlib – there are different compromises possible and you will have to choose based on the characteristics of your library. Indeed, you may have to offer a less than perfect solution and warn your users about possible downsides. Ultimately it’s better to wean your users off synchronous calls altogether.

Thanks for pointing out asgiref @sirosen. I had come up with something similar while exploring how to support both sync and async classes for MongoDB and Jupyter Client: Wrap an Asynchronous Class · GitHub

I want to support sync and async usage without introducing potential fragility just to save me lines of source. I would have thought this shows up for all sorts of use-cases. If any HTTP-based client lib (elasticsearch comes to mind) wanted to support use of aiohttp, they face the same issue.

Maybe my case is unusually bad for sync vs async. I don’t think I can use the background thread strategy because one of the contexts in which I want the library to work is under uwsgi, where threading is often disabled. I could say “you need to set --enable-threads or make your application async”, but I don’t feel that I can justify that demand of users.

If the goal is to be “async capable” and play nicely with whatever stack users are already using then I don’t think it’s wise to make a previously synchronous library fully async.

However, it seems like the background thread strategy is a common one. At least, it has been invented independently twice! Is there room for documenting this strategy as part of the asyncio docs, in narrative doc like the logging howto?

You may also be interested in some of the writeup at Network protocols, sans I/O — Sans I/O 1.0.0 documentation
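In a similar spirit, though not taken from that writeup, the shared logic can be written as a generator that performs no calls itself and is driven by a thin sync or async wrapper. The sketch below reuses the `loader_func(req, schema)` shape from earlier in the thread; `drive_sync`, `drive_async`, and the returned dictionary are hypothetical.

```python
import asyncio
import inspect


def load_location_data(loader_func, req, schema):
    """Shared logic as a generator: it yields a description of the call it
    needs and receives the result back, without caring how it was made."""
    data = yield (loader_func, (req, schema))
    return {"location_data": data}


def drive_sync(gen):
    """Sync driver: execute each requested call directly."""
    try:
        func, args = next(gen)
        while True:
            func, args = gen.send(func(*args))
    except StopIteration as stop:
        return stop.value


async def drive_async(gen):
    """Async driver: same loop, but await any result that is awaitable."""
    try:
        func, args = next(gen)
        while True:
            result = func(*args)
            if inspect.isawaitable(result):
                result = await result
            func, args = gen.send(result)
    except StopIteration as stop:
        return stop.value
```

Only the two small drivers are duplicated; helpers can chain with `yield from`, so the body of each helper is written once.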

Do you see a future where most (all) of Python moves towards async/await as a general model? Or is this just a recommendation to decide upfront whether to use plain python or the async python version?

I do not foresee such a future. The synchronous model is here to stay. Python is not JavaScript.

Async has a place, and sometimes you need to convert from sync APIs to async APIs for a particular scenario. But trying to offer both at the same time is fraught with difficulties and is at best seen as a transitional approach.