Expected behavior for unsupported stdlib modules in the browser

A number of stdlib modules don’t work in the browser, as outlined in “Compile Python to WebAssembly (WASM)” in Victor's notes (Unofficial Python Development documentation): namely multiprocessing, sockets, and threading in Pyodide (but not in the CPython Emscripten node build). Some of these might work one day via some browser-specific implementation; others will never work.

The question is what should happen when users try to use these modules. For instance, let’s take the example of multiprocessing. In @tiran's wasm-python terminal, importing multiprocessing raises an ImportError. As far as I understand, that’s the standard behavior for unsupported functionality in a given runtime, and it allows easy feature detection with try: ... except ImportError:.

The problem is that a lot of these modules were assumed to be non-optional parts of the stdlib for a long time (because all mainstream Python distributions came with them). A lot of packages use them without handling the case where they are missing. If they are missing, the unhandled ImportError from top-level imports will likely prevent the import of the package altogether, even if e.g. multiprocessing is really not critical for core functionality.

And so in Pyodide what we have been doing lately is to include these modules even if they don’t work. So for instance, one can import multiprocessing, but trying to create a new process produces an exception. This allows a large part of the package ecosystem to work without changes (experiments with exposing multiprocessing.dummy as multiprocessing weren’t conclusive). But then it raises the question of what the proper way to detect missing functionality is (most recently in dask/dask pull request #9053, “Dask in pyodide”, by ian-r-rose).
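
When a module imports fine but does not work, one possible way to detect the missing functionality is to probe for it once at startup. The sketch below uses threading as a stand-in because it is cheap to probe. The helper name `can_start_threads` is hypothetical, and the sketch assumes the platform reports the failure as the RuntimeError that CPython raises when a thread cannot be started.

```python
import threading


def can_start_threads() -> bool:
    """Return True if this runtime can actually start a thread.

    Assumption: a runtime without thread support raises RuntimeError
    from Thread.start(), as CPython does when thread creation fails.
    """
    probe = threading.Thread(target=lambda: None)
    try:
        probe.start()
    except RuntimeError:
        return False
    probe.join()
    return True


# Probe once and reuse the result instead of guarding every call site.
THREADS_AVAILABLE = can_start_threads()
```

A package could consult `THREADS_AVAILABLE` to pick a serial code path, which keeps the check in one place even though the import itself never fails.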

So overall this thread aims to:

  1. get feedback on what the correct behavior should be for these stdlib modules when used in the browser or Node.js
  2. work out how to adjust the package ecosystem to handle that behavior (if necessary)
  3. indirectly, make sure that the fixes people contribute to make packages work with Pyodide also benefit other CPython WASM builds.

Also @tiran (or someone else) could you please confirm that,

import platform

if platform.system() == 'Emscripten':
    ...

is a recommended way to detect running in Emscripten and will work in the future (it does in your REPL now). Though there should probably be a way to differentiate between browser and non-browser runtimes? We should adapt the Pyodide FAQ accordingly. Thanks!

Thanks for starting the thread, Roman.

The stdlib bundle for upstream’s WASM terminal excludes most Python packages and C extensions that do not work on Emscripten. IIRC I only kept socket, subprocess and threading. Too many stdlib modules depend on the presence of these modules.

try: ... except ImportError: would be the logical way to detect missing modules. But as you pointed out, it would break a fair amount of 3rd party packages or require developers to modify their code. Try/except doesn’t work for threading or subprocess in 3.11. The modules are there but don’t work. Platform checks like if sys.platform == "emscripten": are flawed, too. Emscripten has support for threading, but we don’t use it for side modules and in the browser. Should we introduce a feature detection flag like sys._emscripten_info so 3rd party packages can detect whether a platform supports subprocesses, threads, and similar features?
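
A feature flag of this kind could be consumed roughly as follows. This is only a sketch: the post above floats `sys._emscripten_info` as an idea, so the `pthreads` field name and the helper `threads_expected` are assumptions, and the code falls back safely when the attribute is absent.

```python
import sys


def threads_expected() -> bool:
    """Guess whether threads are usable, preferring a feature flag."""
    if sys.platform not in ("emscripten", "wasi"):
        # Assumption: other supported platforms ship working threads.
        return True
    # Hypothetical flag layout: an object with a boolean `pthreads` field.
    info = getattr(sys, "_emscripten_info", None)
    return bool(getattr(info, "pthreads", False))


if __name__ == "__main__":
    print("threads expected:", threads_expected())
```

Compared with a bare platform check, this keeps working if a future Emscripten build enables threading, because the decision follows the flag and not the platform name.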

The README.md in cpython/Tools/wasm (python/cpython on GitHub, at v3.11.0b1) lists ways to detect WASM builds and the Emscripten platform from Python and C code. I recommend using sys.platform instead of the platform module, which is a bit heavier to use than the sys module. The sys.platform values "emscripten" and "wasi" are listed in the 3.11 docs, too.

My recommended approach is:

import sys

if sys.platform == "emscripten":
    ...

(Fixed typo in the title of the thread)

You say that these modules were assumed to be “non-optional” parts of the stdlib. Assumed by whom? The stdlib maintainers? Third-party developers?

I’m guessing you meant third-party devs.

If that’s the case, then I’m going to suggest that if they want to support Python in the browser, they will have to learn otherwise.

And I think it will be far, far preferable to have a nice clean ImportError if a module isn’t available, than to have a fake module that doesn’t fail until some arbitrary and unexpected operation.

To give your multiprocessing example, if multiprocessing isn’t supported at all, I would much prefer to handle it in one place:

try:
    import multiprocessing
except ImportError:
    multiprocessing = None  # fallback, or re-raise to fail

rather than to have the import succeed but then have to guard every attempt to use it.

import multiprocessing  # Always succeeds, but sometimes it is a lie.
...
# much later on, after creating numerous classes
...
manager = MyManager()  # inherits from multiprocessing.managers.BaseManager
try:
    manager.start()
except SomeError:
    # Arrgh, now how do I recover from this???
    raise
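
To make the contrast concrete, here is a minimal sketch of the "handle it in one place" style, assuming the runtime raises ImportError when multiprocessing is unsupported. The names `map_maybe_parallel` and `square` are hypothetical and exist only for illustration.

```python
try:
    import multiprocessing
except ImportError:
    # Single decision point: everything below checks this one name.
    multiprocessing = None


def square(x: int) -> int:
    return x * x


def map_maybe_parallel(func, items, processes: int = 2) -> list:
    """Map func over items, in parallel when possible, serially otherwise."""
    if multiprocessing is None or processes < 2:
        return [func(item) for item in items]
    with multiprocessing.Pool(processes) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    print(map_maybe_parallel(square, [1, 2, 3]))
```

Callers never need their own try/except; the fallback is chosen once at import time.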

You suggested that ImportError is problematic because it

“will likely prevent the import of the package altogether even if e.g. multiprocessing is really not critical for core functionality.”

Do you have practical examples of this?

How do users of that third-party module know which functions and classes are “core functionality” which can be safely used, even if multiprocessing is fake, and which require a working multiprocessing implementation and so cannot be safely used?

I would much prefer to have a third-party library fail cleanly (until such time as it supports wasm) rather than unexpectedly fail at some unpredictable function call.

If non-working modules are being included to avoid immediate failure on import, it would be good if importing them showed a warning.

Honestly, I’d say “by anyone who read the Python documentation and assumed it was accurate” :slight_smile: The Python backward compatibility policy (PEP 387) is pretty clear that if the docs say that importing and using the multiprocessing module works on all supported platforms, it must continue working on all supported platforms, subject to the provisions for breaking changes in that policy.

While I agree with your point here, it’s more a matter of setting expectations. To use an example I’m familiar with, pip supports the CPython and PyPy implementations, but not Jython. If “Python in the browser” is a separate implementation, we’d decide whether to support it or not, and there’s no issue with that. But if WebAssembly is presented as “just another platform that CPython runs on”, we would typically assume that it works just like any other CPython platform. We would likely run the test suite on it, just like we test on Windows as well as Linux, to catch platform-specific differences, but in broad terms, we expect what the Python docs say to just work.

To give a couple of examples, when Debian debundle bits of the stdlib, so we can’t rely on them being present, we take that as an issue for Debian, not for us. Conversely, we only just added some use of threading into pip, because we finally desupported the last version of Python that stated that the threading module was optional[1].

From pip’s point of view, then, if WebAssembly is to be considered as “just another platform CPython runs on”, we’d expect any modules that don’t work on that platform to be documented as such, and that means there’s a bunch of backward incompatibility implications to be worked through.

I honestly don’t know whether we’d try to support pip on WebAssembly. All of the above is very theoretical at this point.

But in practical terms, you’re completely right. Having Python run in the browser is fantastic, and projects that want to support it will do whatever’s necessary to handle the limitations of that platform. Policies shouldn’t affect that (except, hopefully, to make things easier).


[1] And yes, we did get issues from people on platforms that still had no threading support :slightly_frowning_face:

I wouldn’t expect that Python-in-a-browser is “just another platform” for CPython to support.

WebAssembly is a virtual machine; so Python in wasm is like Python in CLR (IronPython) or the JVM (Jython), or for those with a good memory, Python in the Parrot VM (Pynie).

Threading is the least of all problems for pip. Pip does not work in the browser because urllib and requests are not available due to lack of blocking socket support. Pyodide works around the problem by providing a micropip installer that offloads HTTPS to the browser’s Fetch API.

Nitpick: WebAssembly is not a platform. WASM is an ISA (instruction set architecture) just like x86_64 or aarch64. The platforms (operating systems) are Emscripten and WASI. Simply speaking, Emscripten is WASM in the browser / Node.js with a glue layer in JavaScript. WASI is the WebAssembly System Interface and is designed for sandboxed server processes. WASM provides low-level instructions much like Python’s bytecode. In theory you could implement subprocess support in WASM. In practice neither Emscripten nor WASI runtimes provide a way to execute a new process.

Emscripten has a pthread emulation layer that offloads threads to web workers. Pyodide and CPython upstream builds disable threading because it’s slow (Spectre mitigations) and because it causes problems with dynamic loading (WASM side modules).

Everybody.

The modules are not documented to be platform dependent or missing on some platforms. Ergo everybody assumed that the modules are available and working on all supported platforms. While wasm32-emscripten and wasm32-wasi are Unix-like platforms (os.name == "posix"), they are far more restricted and limited than typical Unix-like operating systems.

“While I agree with your point here, it’s more a matter of setting expectations.”

Yes, a lot of it is about managing expectations. Users should be aware that some modules won’t work in the browser due to environment constraints. Indeed all the socket/network-oriented things (http.client & requests) are the most common issue. At the same time, current Pyodide users expect things that worked before to continue working, and we can’t break those.

Also, unfortunately, users mostly want something working now, not 1 year later, once a fix is applied to all dependencies including those that have a long release cycle or are no longer maintained.

“If non-working modules are being included to avoid immediate failure on import, it would be good if importing them showed a warning.”

I agree in principle, but if the warning is raised on module import, and the fix is to add a try-except clause to handle the case where the module is missing, it’s unclear how you make the warning go away if the module is still importable.

“Should we introduce feature detection flag like sys._emscripten_info”

Good to know that it exists; maybe having such feature flags could indeed be a solution, even if it’s a bit less nice than catching ImportError.

“WASM is an ISA […] The platforms (operating systems) are Emscripten and WASI.”

Yes, and on top of that there is the runtime (browser or Node.js for Emscripten) which can also impact the modules that work or don’t.

Perhaps we need a linter (extension?) that knows about the missing modules? I believe some already check for modules that are not cross-platform, so it might be a straightforward addition.
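
As a rough illustration of what such a check might do, the sketch below flags module-level imports of modules that are unavailable in the browser. The module set `UNSUPPORTED` and the function name `find_unguarded_imports` are assumptions made for this example and are not taken from any existing linter; imports nested inside try blocks are skipped because only top-level statements are inspected.

```python
import ast

# Assumed list for illustration, based on the modules named in this thread.
UNSUPPORTED = {"multiprocessing", "socket", "subprocess", "threading"}


def find_unguarded_imports(source: str) -> list[tuple[int, str]]:
    """Return (line, module) pairs for unguarded top-level imports."""
    hits = []
    for node in ast.parse(source).body:  # top-level statements only
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # e.g. a try: block that guards its imports
        for name in names:
            if name.split(".")[0] in UNSUPPORTED:
                hits.append((node.lineno, name))
    return hits
```

Run over a package's source files, this would list the imports that need a guard before the package can be imported where those modules are missing.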

Feature flags are great, and we have quite a few. I wish we had more :wink: Though in a sense, except ImportError is a feature flag, it’s just that we didn’t all realise that and now we need to treat it as one.

There is already a mechanism for silencing warnings within a block of code: with warnings.catch_warnings(): import ... (see “Warning control” for the warnings module in the Python 3.10.4 documentation). That way, you’re explicitly acknowledging “there is a warning, and I’m going to do this anyway (or can’t do anything about it)”.
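
Spelled out, that pattern might look like the sketch below. Note that `catch_warnings()` only saves and restores the filter state, so an "ignore" filter still has to be installed inside the block. The function `import_shim` is a hypothetical stand-in for importing a non-working module that warns, and the choice of RuntimeWarning is an assumption.

```python
import warnings


def import_shim() -> str:
    """Stand-in for importing a stdlib shim that warns on import."""
    warnings.warn(
        "multiprocessing is not functional on this platform",
        RuntimeWarning,  # assumed category, purely for illustration
        stacklevel=2,
    )
    return "shim"


def quiet_import_shim() -> str:
    """Import the shim while explicitly acknowledging the warning."""
    with warnings.catch_warnings():
        # The filter applies only inside this block; it is undone on exit.
        warnings.simplefilter("ignore", RuntimeWarning)
        return import_shim()
```

Code that calls `quiet_import_shim()` sees no warning, while a plain `import_shim()` elsewhere still warns as usual.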

I think the question is if the code assumed those modules existed, will they work even with shims? I don’t consider the stdlib “normal”, so the fact it was necessary for the test suite isn’t necessarily indicative of a wider problem.

The situation where the shims are fine is in a large package that happens to support using e.g. threads but it isn’t required for all useful functionality (e.g. some helper function is what you’re really after).

Docs can be changed. :wink: I don’t think anyone should be upset when multiprocessing doesn’t work somewhere that doesn’t allow spawning another subprocess. We can document in the appropriate modules that something isn’t supported in WebAssembly, just like how we document when things are supported on macOS, Windows, and/or Unix.

We could also look at how Debian has historically done this, as they have “led the way” in slicing up the stdlib in ways that people don’t expect.

+1, as Mypy has special recognition only for sys.platform, not for other checks such as os.name or the platform module.
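
For example, a type checker that understands sys.platform comparisons can analyse just the branch matching the target platform in code like the following. The function name `default_worker_count` is hypothetical, and returning 1 on Emscripten is an assumption made only to illustrate the branch.

```python
import os
import sys

if sys.platform == "emscripten":
    def default_worker_count() -> int:
        # Assumption for illustration: no extra workers in the browser.
        return 1
else:
    def default_worker_count() -> int:
        # os.cpu_count() may return None, so fall back to one worker.
        return os.cpu_count() or 1
```

The same split written with `platform.system()` would work at runtime, but the checker would treat both branches as reachable.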

Very strongly agree here - having a clean compatibility break is by far the nicest option here. And for what it’s worth, I expect that packages for which this is at all feasible will be happy to do so! e.g. to build Ghostwriter Demo I also patched Black, and can get back on upstream from their next release.

What would help: for each stdlib module that doesn’t work on Emscripten (or WASI), document that this is the case and link to suggested workarounds, e.g. a new “Python in the Browser” page.


My short summary of the PEP 387 compatibility argument is that the PEP requires you to avoid removing multiprocessing on e.g. Linux, but (IMO) does not require that you support multiprocessing on a new platform that doesn’t even have a process model. Nonetheless, “The steering council may grant exceptions to this policy.” and for the avoidance of doubt we could propose that they endorse a documentation-based approach as above.

To be clear, I agree here - documenting that things don’t work on a new platform, as part of adding that platform, is perfectly fine (even to be expected). My point was simply that currently, everyone is entitled to assume that modules like multiprocessing are present and work, and in particular framing certain modules like multiprocessing as somehow more likely to be platform dependent than others is incorrect (in principle - in practice people’s instincts won’t be that far wrong, although I was surprised to discover that sockets aren’t necessarily available in the browser, and no-one has yet said anything about whether tkinter and turtle are available in the browser :slightly_smiling_face:).

Going back to the original point of this thread:

  1. I think they should simply not be available (ImportError) and the documentation should be changed to explicitly call out that they aren’t available on specific platforms (we already have this with “Availability: Unix” notations)
  2. This is difficult - at the moment “pure Python” modules are typically built without platform tags, and it’s the responsibility of the library author and users to ensure it doesn’t get used in a context where required modules aren’t available. Initially, the same principle would work for WASM, but it’s possible that we’d need to add metadata to allow installers to flag that a particular dependency isn’t available on WASM. Doing that without disrupting the simplicity of the basic case where code is all Python and should be assumed to work “everywhere” is a non-trivial problem. For the short term, I’d just leave it as something for package maintainers to be aware of and rely on documentation to say whether they are WASM-compatible.
  3. I have no good answer here, other than to point out that as an “outsider”, I don’t actually understand the problem - the distinctions between “emscripten” and “wasi” are far less clear to me than the distinctions between “Windows”, “macOS” and “Linux”. Hopefully the docs and publicity around “Python in the browser” will start to make that less of an issue over time, though.

Once we reach tier 3, we can update the docs for the appropriate modules.

It’s effectively the same thing. Think of WebAssembly/WASM as the CPU and Emscripten and WASI as the platforms/OSs.
