Re-use of standard library across implementations

Okay, I would like to submit a PEP for the idea of categorizing the standard library into subfolders. Is there a core developer who would like to sponsor me on this PEP?

First, hi all, and pardon me if I mess up the posting rules, since I’m new to this. I’m also a portability “extremist”, so I don’t expect people to even like my position.

You can even go a bit further and ask: does the Python standard library belong to the CPython implementation for supported platforms?

CPython itself can’t even load its own library on Android or WASM. An obvious example is “asyncio” pulling in threading and multiprocessing, which are forbidden on WASI/WASM.
It can be hard not only to get a stdlib for a new VM but to port CPython itself too…

Thanks, then, on behalf of all the past and future platforms that will never get decent CPython support. (Pardon me, but I will really laugh out loud if a MicroPython node makes it into Visual Code before CPython.)

Indeed, that’s exactly why I love the MicroPython VM attitude: C modules are simply forbidden from entering core or platform support (called ports) until it has FIRST been proven that a pure Python counterpart, even a very slow one, really is not possible.

The stdlib really bored me, especially asyncio, both on Android KitKat and asm.js (and it still does on WASM), and that’s why I embraced MicroPython and its asyncio “light”. For writing Python tools on win32 I use MSYS2 cpython3 and can’t wait for midipix to be released, since I don’t even want to hear about msvc stdlib “features” anymore when I just want to open… a UART in 2018!

I don’t have time to lose on that. I want to have fun, not go back to the (useless) porting nightmares of my old day job.

Alan, since you want my opinion, I’d say any stdlib that cannot run on a WASI polyfill or strict POSIX (including aio but without pthread) is of very low interest for the future (yes, I’m looking at you, win32).

aio:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/aio.h.html
wasi:

pure python stdlibs are wanted:

https://github.com/pmp-p/pydk/tree/master/sources.em uses a pure python stdlib and threadless asyncio.

We decided some time ago that CPython would require threading support from the OS. This eases maintenance a lot for us. You may take a different stance based on which platforms you are interested in, and it’s fine to use (or even write) a different implementation for that purpose. But I think we’re unlikely to switch back to make threading support optional.

Hmm, so what? Personally I don’t care whether Visual Code (I assume you’re talking about the text editor) has a “MicroPython node” or whatever else. I don’t think you’ll find any CPython core developer who gets frustrated at the idea that other implementations may be selected for specific use cases…

Well, that’s your opinion about the future :wink: Other opinions are valid. For example, personally, I think a platform that doesn’t offer any kind of threading support isn’t very interesting as a general-purpose computing platform.

Well, a bunch of people have been saying that for many years, and it still seems to have trouble happening. Perhaps there’s a reason why. Perhaps it’s not that easy, and there’s a scarcity of people willing to do the necessary hard work. But pressuring CPython developers to do it for you won’t work either.

Sometimes we may rewrite C code as Python code when it helps make the code better-architected, more reasonable to maintain, evolve and think about (e.g. importlib or zipimport). However, in many cases, there is no such concern, and rewriting C code that works fine and that’s performant into pure Python is not our priority.

You may find other projects that have priorities more aligned with yours. Good, so why not use those projects and contribute to them? At the moment, you sound like someone who complains to the Linux developers that they didn’t choose the same priorities as NetBSD…

Yeah, as I said at the top, “I don’t expect people to even like my position”.

But just to be clear, the OP is talking about a stdlib written in Python, not about CPython written in C and the C-dependent bits of its stdlib.

Sorry, but Python 3 is Python 3 to me. I don’t care about the VM implementation; I leave that to the core devs of Rust / Java / JavaScript / C / C++ / whatever the VM is written in.

For my few needs, I can (try to) write my own preemptible stackless VM instead of complaining on bpo/ideas/elsewhere (but I love to rant loudly on IRC; some French people know that already).
As far as I’m concerned, I use CPython 3.7+ when I need it for Panda3D, and it does its job to perfection.*

helloworld.c gives the same result on GNU/Linux, any BSD flavour, or a WASI platform / internet browser, but yeah, even with my poor C skills I know that’s more about stdio and not the C stdlib.

So, and I mean no disrespect, it’s just light-hearted criticism: please, let’s build more CPython toolchains and transpilers for each available platform, to have the best optimized low-level stdlibs ever, with handcrafted bleeding-edge Python opcodes making direct syscalls that no other VM can afford to pay developers to implement.

Indeed, I’m aligned with people who want to write and SHARE portable Python code, not put non-portable CPython code in the freezer.**

I love all Python VMs equally, but it would be really nice if they could live in the same stdlib.

* on supported platforms.
** except for supported platforms.

@pmp-p I edited your post to remove some inappropriate comparisons, without changing the meaning. Please keep in mind that this forum has a diverse international audience, and keep things professional. Thanks!

@njs, thanks for the clarification. I’ll try to keep that in mind :slight_smile:

Usually, how is this writing effort coordinated? Do you recommend some online tool?

And: is it OK to start writing, or is it advisable to wait for a core developer’s sponsorship before starting a draft?

Personally, I’d like to see some more detail before sponsoring this. I think writing up a draft so people can understand your approach and its tradeoffs would be important in this case.

People typically coordinate on GitHub via a fork of the peps repo.

Depends on what it takes to get a sponsor. :wink: Do make sure to read PEP 1 (PEP Purpose and Guidelines), which outlines the PEP process.

No one has spoken up to say that they would use a pure Python _codecs module in their VM, which was my key question, so for that module it doesn’t sound like it’s worth the effort, especially as an initial effort to fill in more of the stdlib with pure Python code.

You tell me. :wink: This is more of a question to help scope the amount of work left to potentially go back and fill in the gaps of the stdlib where there’s no comparable pure Python code but there functionally could be. Since I’m not going to be doing that work, it really isn’t for me to answer what’s needed (I already wrote PEP 399 and importlib, so I did my bit in trying to make this sort of stuff easier :smile:).

Thanks a lot for that.

Over the weekend I started hacking on something to separate the modules that are pure from the ones that are not. However, given a module, I am still wondering how to force it to skip the C-based version and use the pure one only. Some tips on this would be appreciated :wink:

I would use it in Grumpy, at least: this and everything pure I could find, at least until there is enough compatibility to get really useful stuff running, even if very, very slowly.

And @freakboy3742 said they would use it as a compliance test for optimized versions, IIUC.

My question is: if someone (like me) contributes a pure subset covering what Guido outlined, would it be included in the stdlib and maintained as PEP 399 describes?

If you want to force imports of C code to fail then you can put None in sys.modules for a module and it will cause ModuleNotFoundError to be raised.
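A minimal sketch of that trick, assuming CPython. The choice of `_bisect` is only an illustration (it is not a module singled out in this thread); any accelerator module name would do.

```python
import sys

# Remember whatever is cached so the interpreter state can be restored afterwards.
saved = sys.modules.get("_bisect")
blocked_message = None

# A None entry tells the import system to refuse this name.
sys.modules["_bisect"] = None
try:
    import _bisect  # fails because of the None entry above
except ModuleNotFoundError as exc:
    blocked_message = str(exc)
finally:
    # Undo the block: restore the cached module, or drop the None entry.
    if saved is None:
        del sys.modules["_bisect"]
    else:
        sys.modules["_bisect"] = saved

print(blocked_message)
```

Restoring `sys.modules` in the `finally` block keeps the experiment from leaking into the rest of the process.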

That’s probably a steering council question unless you can get consensus around the idea from the other core devs, as it is signing the core devs up for more work to maintain that new code.

But this would allow a Pure version of the same module to be loaded?

In the stdlib there isn’t a name clash for any of the modules that have both Python and C code, so no, it won’t prevent it from happening (for instance, see Lib/datetime.py in python/cpython at commit cf9360e524acafdce99a8a1e48947fd7da06f3d4).
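To illustrate why blocking the C module still leaves the pure version loadable, here is a rough sketch using `heapq`, which on CPython defines its functions in Python and then tries to overwrite them from `_heapq`. The helper name `import_pure` is hypothetical; CPython's own test suite has a similar utility in `test.support.import_helper.import_fresh_module`.

```python
import importlib
import sys
import types


def import_pure(name, blocked):
    """Import a fresh copy of `name` while the modules in `blocked` are unimportable.

    Hypothetical helper for illustration; sys.modules is restored on exit.
    """
    saved = {n: sys.modules.get(n) for n in (name, *blocked)}
    try:
        sys.modules.pop(name, None)      # force the Python source to be re-executed
        for n in blocked:
            sys.modules[n] = None        # make `import n` raise ModuleNotFoundError
        return importlib.import_module(name)
    finally:
        for n, module in saved.items():
            if module is None:
                sys.modules.pop(n, None)
            else:
                sys.modules[n] = module


py_heapq = import_pure("heapq", ["_heapq"])

# With the accelerator blocked, heappush is an ordinary Python function.
print(type(py_heapq.heappush))
print(isinstance(py_heapq.heappush, types.FunctionType))
```

The returned module object is independent of whatever `heapq` the rest of the program already imported, so it can be used side by side with the accelerated one, for example in a compliance test.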

I see. Then I am inclined to produce that list of C-only modules and post it here, to understand which ones there is consensus to include, and start with those.

Ok… There is no _datetime.py when a _datetime.so is produced. Thanks for the info.

I think the order of preference is set in importlib._bootstrap_external._get_supported_file_loaders. If both a source loader and an extension loader are available, the extension loader will be preferred.
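One way to check that order on a given interpreter is to call the function directly. This is only a sketch: `_get_supported_file_loaders` is a private CPython helper, not a stable API, and the exact list can differ between versions and platforms.

```python
import importlib._bootstrap_external as bootstrap_external

# Private CPython helper named in the post above; it returns (loader, suffixes) pairs
# in the order the path-based finder tries them.
loaders = bootstrap_external._get_supported_file_loaders()

for position, (loader, suffixes) in enumerate(loaders):
    print(position, loader.__name__, suffixes)

# Collected for inspection: earlier entries win when several files could match.
loader_names = [loader.__name__ for loader, _ in loaders]
```

If the extension loader appears before the source loader in the printed list, an extension file with the same module name in the same directory would be picked first.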

There are two shortcomings for a pure-Python _codecs module, and I think these principles generalize to many of the C-extension stdlib modules. At some point it has to make calls to unicodedb and locale-based encode/decode routines. And I think it also requires access to implementation-specific low-level routines to manipulate partially constructed PyUnicodeObjects (or equivalent).

Preliminary summary of module counts by loader, after importing all the stdlib modules listed by stdlib_list, without filtering:

Modules by loader:
Items: 30 		Loader: <class '_frozen_importlib.BuiltinImporter'>
Items: 6 		Loader: <class '_frozen_importlib.FrozenImporter'>
Items: 411 		Loader: <class '_frozen_importlib_external.SourceFileLoader'>
Items: 51 		Loader: <class '_frozen_importlib_external.ExtensionFileLoader'>
Items: 8 		Loader: <class 'AttributeError'>

That adds up to 30 + 6 + 51 + 8 = 95 non-pure modules out of 506 imported (minus the stdlib_list ones).
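For anyone who wants to reproduce a tally like this, here is a rough sketch. It is not the script used for the numbers above: it assumes `sys.stdlib_module_names` (Python 3.10+) instead of stdlib_list, and it looks up import specs rather than importing every module, so the counts will differ by version and platform. The function name `loader_name` is made up for the example.

```python
import sys
from collections import Counter
from importlib.util import find_spec


def loader_name(module_name):
    """Return the name of the loader that would handle `module_name`."""
    try:
        spec = find_spec(module_name)
    except (ImportError, ValueError):
        return "unresolved"          # e.g. a cached module without a usable __spec__
    if spec is None:
        return "not available"       # listed, but not present in this build/platform
    loader = spec.loader
    if loader is None:
        return "no loader"
    # BuiltinImporter and FrozenImporter are used as classes; file loaders are instances.
    return loader.__name__ if isinstance(loader, type) else type(loader).__name__


# Fall back to the built-in module names on interpreters older than 3.10.
names = getattr(sys, "stdlib_module_names", sys.builtin_module_names)
counts = Counter(loader_name(name) for name in sorted(names))

print("Modules by loader:")
for name, count in counts.most_common():
    print(f"Items: {count}\t\tLoader: {name}")
```

Grouping by `BuiltinImporter` and `ExtensionFileLoader` then gives a first approximation of the C-based modules on that interpreter.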

Excluding the ones that raise AttributeError when trying to get their loader, and the FrozenImporter ones, which seem to be about the import system, we have 30 + 51 = 81 C-based modules:

Loader: <class '_frozen_importlib.BuiltinImporter'>

	_abc
	_ast
	_codecs
	_collections
	_functools
	_imp
	_io
	_locale
	_operator
	_signal
	_sre
	_stat
	_string
	_symtable
	_thread
	_tracemalloc
	_warnings
	_weakref
	atexit
	builtins
	errno
	faulthandler
	gc
	itertools
	marshal
	posix
	pwd
	sys
	time
	zipimport

Loader: <class '_frozen_importlib_external.ExtensionFileLoader'>

	_asyncio
	_bisect
	_blake2
	_bz2
	_contextvars
	_crypt
	_csv
	_ctypes
	_curses
	_curses_panel
	_datetime
	_dbm
	_decimal
	_elementtree
	_hashlib
	_heapq
	_json
	_lsprof
	_lzma
	_multiprocessing
	_opcode
	_pickle
	_posixsubprocess
	_queue
	_random
	_scproxy
	_sha3
	_socket
	_sqlite3
	_ssl
	_struct
	_tkinter
	_uuid
	array
	audioop
	binascii
	cmath
	fcntl
	grp
	math
	mmap
	nis
	parser
	pyexpat
	readline
	resource
	select
	syslog
	termios
	unicodedata
	zlib

I can start digging into which ones already have a pure version somewhere, which are OK to have a pure version, and which shouldn’t/can’t have one.

We had a discussion at the core dev sprint and I think we would be willing to entertain a PR that implements a pure Python implementation of a module to start. So if you want to propose parts of _codecs in pure Python we can have a look and see how this goes (if you want to propose a different module to start then let us know and we can discuss if that module makes sense). Obviously we are after something that is actually being used and thus of high quality. Tests are also important. And another part of “being used” is to do it for things that will actually have use, e.g. implementing a UTF-8 encoder is a bit silly since basically every platform has that natively implemented in some fashion.
