The solution you suggest does not work well for sharing objects between libraries; here is an example:
Python packaging works really well for the use case of allowing third-party libraries (like numpy) to provide a common foundation and act as the lingua franca for other third-party libraries; installing multiple versions of the same library would break that.
Classical nay-saying, and not at all pertinent. In my suggestion, the default mechanism is unchanged; what I suggest is an overloading mechanism. I do agree that with such mechanisms you can always shoot yourself in the foot. But at least you have full control, and all the dependency conflicts can be resolved if you assume that any released version had a working dependency set. My solution removes the blocking points of the current one. If you do something nonsensical and use a set of dependencies that has no way to work, then it doesn’t work; but that is tautological. The only real drawback of giving this full power to the user is that it may be used by crackers to do whatever they want once they have set foot on someone else’s PC. But usually they can do whatever they want with or without this option. It’s just another thing to know about your system.
Sorry for the harsh tone, but davidism moved my topic from Ideas to Python Help, which is inappropriate and looks like a form of censorship/downgrading hiding behind good-looking noob management.
To give a full answer to your remark on data structures: your code is your responsibility. When you use library A and library B and their data structures are not compatible, you write more or less boilerplate code to modify the data structures so that they communicate through your code. Now, it is exactly the same problem if you have to make data structures of A 1.1.1 communicate with A 2.2.2: you just write the same kind of boilerplate code. It may be inefficient, but it works. In Python, it may be just looping on the elements of the “in” structure to build a new “out” structure. Once everything is dealt with on the levels below, you do the required “interfacing” work (not to be confused with the restricted meaning of an interface in OOP).
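For example, a tiny adapter of this kind might look like the following sketch (the library, field names and shapes are entirely made up for illustration):

```python
# Hypothetical boilerplate: adapt a structure produced by "A 1.1.1"
# so that it can be fed to code written against "A 2.2.2".
def adapt_records_v1_to_v2(old_records):
    """Turn [(id, value), ...] (old style) into [{"id": ..., "value": ...}, ...] (new style)."""
    return [{"id": rec_id, "value": value} for rec_id, value in old_records]

v1_output = [(1, "alpha"), (2, "beta")]        # what the old API hands back
v2_input = adapt_records_v1_to_v2(v1_output)   # what the new API expects
```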
And nested imports already exist in Python. The code behind them, in Python itself, would have to be modified, but the result would be quite close to what we already have.
Go ahead, implement this. Nothing is stopping you from writing a custom loader for your personal use.
Reminder:
requesting that projects change the way their code works is not going to fly
telling users how they should structure their code is not going to fly
If you want this to be a true solution, it needs to work without any performance penalty (ideally without any memory penalty either) and without any code changes, except maybe invoking your tool.
You are going to notice that it doesn’t work for anything but toy projects. But if you don’t believe others’ claims in this regard, based on years of experience and previous discussions, then the only way you are going to convince yourself is to try it.
And if against all odds you manage to create a perfectly functional system, great! That would be something worth considering as an idea. A half-baked rant that retreads all the same problems and ideas that have been mentioned dozens of times before is not.
That libraries with conflicting requirements should get their own copies of the offending dependencies is the obvious bit. Nobody is too dumb to figure that one out. The question that nobody has been able to come up with a remotely non-naive answer to is how.
When you say that it is not going to fly, that is ambiguous. It may mean “you can always ask, but nobody will make the change”, which is probable, whereas the claim that it would not work is false.
For the users having to structure their code, that is only if they need to handle such interfacing problems. So nothing is taken away from users; they don’t lose anything. They just get new solutions: for all dependency conflicts below them, they have a simple solution; for dependency conflicts in their own code, they have a way to handle it through boilerplate code. It brings solutions to the table, where previously they had none other than looking for other packages as dependencies or recoding part of a dependency. Nothing is perfect, nothing is free of some burden, and nothing justifies that, on the Internet, the majority of people who react are nay-sayers.
A solution with a performance penalty noticeable only when the feature is used is always better than no solution. Your argument is a classic fake argument of nay-sayers, and it doesn’t mean that they themselves always write very optimized code. Moreover, when the Python interpreter resolves an import, it must load the function tokens and their addresses to use them afterwards; resolving two imports for two distinct versions of the same library would similarly prepare what is needed to use the right address when some function token is parsed, without conflict, thanks to the distinct contexts (or the distinct tokens in the user code). I see no reason the interpreter would be slowed down, apart from the additional tokens to keep in memory; the performance and memory penalty would be limited. In my opinion it is mainly the import mechanism that could be slowed down, and only very slightly; once the import is done, I see no reason for a substantial penalty. If you think otherwise, please explain your reasoning.
Fine. When you say that, you know perfectly well that the entry cost on a project like Python is high, and that almost nobody has the time to understand the internals of a project like Python just to make a proof of concept and experience further bashing afterwards. I’m already struggling to redo most of my own Free Software work because of sabotage. Weeks or months of work that vanish into thin air. I’m always struggling because of crackers and intelligence services and I don’t know who, messing with my code to enshittify my life. If I now take two weeks of work to make a proof of concept, I am almost certain that they will not let me succeed, or will find a way to screw up another of my projects in the meantime. I have not yet decided whether to do a proof of concept, but you can’t imagine how mad I am at people who steal the lives of others and nobody helps.
You do not need to create something deeply embedded into Python. You just need to modify the import mechanism, which is well documented, probably by overriding the builtins.__import__ hook and possibly adding a custom loader. This can be done in pure Python. I have previously implemented something similar for the purpose of running regression tests by importing two versions of the same library: it worked somewhat, but it had too many surprises for me to ever polish it up.
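For illustration only, a minimal sketch of that approach could look like this (the package name and path are made up; submodules, reloads and packages importing each other are deliberately ignored):

```python
import builtins
import importlib.util
import sys

# Hypothetical mapping: top-level package name -> directory holding a specific version.
VERSIONED_ROOTS = {
    "mypkg": "/opt/versioned/mypkg/2.2.2",
}

_original_import = builtins.__import__

def _versioned_import(name, globals=None, locals=None, fromlist=(), level=0):
    root = VERSIONED_ROOTS.get(name)
    if root is not None and name not in sys.modules:
        # Load this one package from the chosen directory instead of sys.path.
        spec = importlib.util.spec_from_file_location(name, f"{root}/{name}/__init__.py")
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module          # register before executing, as usual
        spec.loader.exec_module(module)
    # Everything else falls back to the default mechanism.
    return _original_import(name, globals, locals, fromlist, level)

builtins.__import__ = _versioned_import
```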
how mad I am at people who steal the lives of others and nobody helps.
I believe you that you are mad, but that doesn’t make you entitled to other people’s time and attention, especially in a tech-discussion focused online forum.
Genuinely, it sounds like you should take a break from Free Software work. You are too aggressive for a tech-discussion focused forum and are interpreting stuff as a personal attack when it wasn’t meant that way. I don’t know your life situation, but I hope you have people in your life who you can talk to away from the internet.
Would you care to explain what the identified blocking points are?
From my point of view, you need the following ingredients:
given the current state of the interpreter, knowing which package (and version) the currently executing code belongs to
thus any new import there is resolved by consulting the overloading file for dependencies, if any, and falling back on the default mechanism (the resolver can move inside the tree of nested dependencies of the overloading file in parallel with the moves inside the code; when it is outside this tree, it only needs to check again when it is back at the root, in case further executed code is in an existing branch)
when an import is processed, the current context gets the right addresses for its tokens,
tokens are resolved with the right addresses in the current context, as usual.
I don’t see a conceptual blocking point. If there is a relevant ingredient of an interpreter that I am not seeing, please point it out.
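To make the first ingredient a bit more concrete, here is a tiny sketch, under the assumption that the overlay lives in an import hook rather than in the interpreter core (the helper name is purely illustrative): the namespace that builtins.__import__ receives already identifies the module performing the import.

```python
def importing_context(import_globals):
    """Return the name/package of the module that is performing the import,
    as seen from the `globals` argument passed to builtins.__import__."""
    if not import_globals:
        return None
    # Mapping this module name to an installed package and version would be
    # the job of the overloading configuration, not of the interpreter itself.
    return import_globals.get("__package__") or import_globals.get("__name__")
```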
They’re already enumerated in the thread that Damian linked.
The only blocking point I would describe as completely unsolvable is libraries with compiled binaries in them. Loading two copies of a C library into one process will usually cause it to crash. On macOS, the OS just SIGKILLs the process.
But there are plenty of other issues that would require either breaking or massively degrading a lot of Python to overcome. As it is at the moment, Python itself has very little awareness of packaging. When you import PIL, it has no idea that PIL/__init__.py came from a PyPI package called pillow or what its version is.[1] This is actually a good thing since it allows Python packaging to evolve at a different cadence to Python itself as well as allowing for alternative packaging tools and all kinds of deployments to exist.[2] Packaging-aware imports would throw all that away.
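A small illustration of this point (assuming Python 3.10+ and an installed pillow; the printed values are indicative only): the link between the module and the distribution only exists in the separate packaging metadata, not in the import system itself.

```python
import importlib.metadata as md
import PIL

print(PIL.__file__)                         # .../site-packages/PIL/__init__.py, no mention of pillow
print(md.packages_distributions()["PIL"])   # ['pillow'], recovered from installed metadata
print(md.version("pillow"))                 # the installed distribution version
```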
You’d also have to rip up every assumption that modules can be keyed based on their names. That would include most of the import system, anything that touches sys.modules, pickling, multiprocessing (which uses pickle) as well as anyone else’s extensions or alternative providers for the import system. It isn’t just Python that would have to change.
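For instance, the whole import cache is keyed purely by name, with no room for a version, which is exactly the assumption a versioned import system would have to break:

```python
import sys
import json
import json as json_again  # a second import of the "same" name

assert json is json_again              # both bindings come from the single cache entry
assert sys.modules["json"] is json     # the key is just the string "json"
```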
And then there’s all the costs of people actually using it if it did work. We’d be bringing the joys of sticky bugs and sticky security vulnerabilities, as well as increased footprint, memory usage and initialisation time, to Python.
Personally, I consider this to all be an XY problem. If I see that one of my dependencies is unstable or has an upper-bound version constraint, I throw it out. With that done, I never get version conflicts, rarely need virtual environments and get to ignore all the fancy tooling for these things. Everything is wonderful from then on.
Not perfect, but it really shows that a first step (the most important step in terms of the needed feature) could technically be done.
I downgraded my spec for this step. But clearly, it is useless to fight for the full feature if this first step cannot be accepted. Let me know what you think of it, please. If you have any constructive comments, I may take some time to improve the POC if it seems worth it. What is clear is that a working PR on CPython by people who know the internals of CPython well is work counted in weeks or months, not years.
My experience is that saying “XY problem” is most of the time a façade for “I don’t care about your problem. It is your problem.”. You seem to be in a position to decide on all the software you need to use. Unfortunately, for all those who have to eat dog food at work, it works differently. When you have to do a simultaneous upgrade of many dependencies, it becomes a huge problem, because the CTO or some intermediate level above you never wants to take responsibility for the risks of regressions. You may prepare a branch for upgrading dependencies, but it never gets merged into the production branch, because the number of modifications and impacted files is too high. You’re stuck with lagging features and missing bug fixes, etc. And you patch and redo code that will become useless sooner or later. My POC has pros and cons, but at the very least it may give a solution to people who need to upgrade their dependencies in small steps, and then remove my POC from their code base once the upgrades are complete. Not perfect, and it would be better if some architects of Python made a polished version of it. But not an XY problem.
It may often be used that way, to dismiss concerns, but that doesn’t mean that XY problems aren’t real.
Dependency conflicts are a problem. Presupposing a specific solution, and not being receptive to the possibility that this solution doesn’t work, does sound an awful lot like XY.
I’ve worked on inherited projects stuck in dependency hell, at various levels of severity. And I’ve wished for “versioned imports” as a solution. So have other people. So much so that it’s been built at least a couple of times. But then those maintainers have consistently found that it doesn’t work. So now I wish for a better vendoring toolchain, since that’s a solution which I have used successfully several times, and requires no changes to the import system.
I’m receptive in principle to some out-of-the-box thinking about ways to improve the situation. But I confess I can’t make head or tail of your proposed solution.
Hello :), I understood all your arguments and from having read the cited discussions and from a quick look at
I believe my solution brings new elements to the table, for the following reasons:
no need to modify the Python code in the dependencies; the goal is to overload the given dependencies’ requirements;
no need to lose anything, since packages with a single version may coexist with packages with multiple installed versions; if something worked before, it works after;
I’ve focused on “configuration data structures” to make that overloading transparent to the package developer’s code; I believe this is truly the right approach. It would just need some ENV variables to tell Python where the configuration files are located for the packages that are installed in multiple versions. Filling these configuration files requires simple work for small overloadings, hard work for complex configurations, and would benefit from tooling in pip or elsewhere;
this is a gain compared to most of the previous solutions, which required specifying which version to import in each Python code file; that was not an overloading possibility for the final integrator, but a possibility for the code developer to pin its top-level dependency. My goal was full flexibility for the last person/user in the chain, and my solution achieves that.
Even setuptools, which tried hard to make multi-version installs work, had only one active version of a package at a time. My POC easily handles distinct compatible versions at the same time.
Clearly, doing the work to clean up the details would solve the problem for all the packages that don’t directly use sys.modules, a shared object library, or global variables in incompatible flavours between versions. I think those are the technical limitations that cannot be solved without modifying the packages. If you see others, let me know. And the fact is that most packages are not concerned by these limitations. Since my solution is totally opt-in, you don’t lose anything: if it doesn’t work with some of your dependencies, you don’t use it with those dependencies.
I’m curious whether point 3), the needed “configuration data structures”, has been well studied, since in my opinion this is the crucial point that makes everything else possible. If you have references, I’ll be happy to look at the previous solutions from this point of view.
Maybe the README in the POC GitHub repository isn’t clear enough, so I’ll try to be clearer here. I call a package “versioned” when it has multiple versions installed on the system. A system may mix “versioned” packages and “normal” packages with only one installed version.
Step zero: you have some packages with multiple installed versions, but you want to use only one of them, as is already done today, while keeping the files of the other versions on your system:
first configuration data structure/file, called versioned_packages.json → This is a dict that takes a package name as key and gives the default version to use (an illustrative file is shown after this step).
with this step zero, the modifications to the import mechanism are very small, and nothing breaks
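For concreteness, such a versioned_packages.json could look like this (package names and version numbers are purely illustrative):

```json
{
  "requests": "2.31.0",
  "urllib3": "2.2.1"
}
```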
Step one: you want the version of the “versioned” packages to be able to depend on the current package where the import is done:
second configuration data structure/file, called default_dependencies_versions.json → This is a dict. Its goal is to give you the relation “immediate context of import” → “which version to import”. The keys of this dict are the name of the package if the package is “normal”, or the name of the package followed by its version if the package is “versioned”; these keys correspond to the package where the import is done. The values of the dict are dicts whose keys are “versioned” package names and whose values are the version to use when the import is done in the context given by the top-level key (an illustrative file is shown after this step). This second configuration file could easily be filled by pip, choosing the latest installed compatible version as the import version.
third configuration data structure/file, called custom_dependencies_versions.json → This is a dict. Its goal is to let you overload the relation “immediate context of import” → “which version to import”. It is the responsibility of the integrator to fill this file.
Note that the normal dependency relationship works with a level of precision similar to that of the second configuration data structure/file.
The POC corresponds to this step one.
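An illustrative default_dependencies_versions.json for step one could be the following (all names and versions are made up, and the exact key format for a versioned context is only a guess here; custom_dependencies_versions.json has the same shape):

```json
{
  "myapp": { "requests": "2.31.0" },
  "legacy_plugin": { "requests": "2.25.1" },
  "requests 2.25.1": { "urllib3": "1.26.18" }
}
```

Read it as: an import of requests performed inside myapp resolves to 2.31.0, the same import inside legacy_plugin resolves to 2.25.1, and imports of urllib3 done inside requests 2.25.1 itself resolve to 1.26.18.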
Step two: handle a larger context than the immediate context to know which version to import.
It was the subject of my initial rant with nested dicts of arbitrary depths.
Not implemented yet.
May be harder to implement without performance penalty.
May bring a mess beyond what most human minds can follow.
It only enhances the possibilities in the third configuration data structure/file called custom_dependencies_versions.json.
Cascade mechanism:
If custom_dependencies_versions.json tells you which version to import, use it.
Otherwise, if default_dependencies_versions.json tells you which version to import, use it.
Otherwise, versioned_packages.json tells you which version to import; use it.
Very simple, if you think in terms of needed “directives”. The rest is technical details.
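A minimal sketch of this cascade (assuming the three configuration files have already been loaded into plain dicts; the function name is illustrative, not the POC’s API):

```python
def resolve_version(importing_context, package,
                    custom_deps, default_deps, versioned_packages):
    """Return the version of `package` to import from `importing_context`,
    or None if the package is not a "versioned" package at all."""
    for table in (custom_deps, default_deps):   # the custom overload wins over defaults
        overrides = table.get(importing_context, {})
        if package in overrides:
            return overrides[package]
    return versioned_packages.get(package)      # system-wide default version
```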
I really think that most people who have had to deal with dependency conflicts would agree that step one is a very good option.
P.-S.: To expand a bit on “May be harder to implement without performance penalty”: the performance penalty is negligible for a compiled language (with tables of offsets). It is also negligible in Python if most of the work is done in actual computations. But Python code is littered with imports, and you have to compute the correct version before searching for the module in sys.modules. (Not exactly true: you can still search for the module in an “unversioned” way before computing the possible version, since either the package is “normal” or it is “versioned” and present without conflict in sys.modules. Looking first in sys.modules in the “normal” way yields almost no performance penalty for someone not using the feature.) If the depth of the nested dicts in custom_dependencies_versions.json is not bounded, you may see a noticeable performance penalty when you have many imports of versioned packages compared to the actual computations, since the version computation happens before the “versioned” sys.modules lookup.
P.-P.-S.: Note that you can merge default_dependencies_versions.json and custom_dependencies_versions.json, as a configuration step, into a file precomputed_dependencies_versions.json, in order to decrease the number of dict lookups needed to find the correct version later on during Python execution.
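For illustration, that precomputation step could be a small one-off merge along these lines (the helper name is made up; entries from the custom file win over the default ones):

```python
import json

def precompute(default_path, custom_path, out_path):
    """Merge the default and custom version tables into one file,
    so that resolution needs a single lookup per importing context."""
    with open(default_path) as f:
        merged = json.load(f)
    with open(custom_path) as f:
        for context, overrides in json.load(f).items():
            merged.setdefault(context, {}).update(overrides)
    with open(out_path, "w") as f:
        json.dump(merged, f, indent=2)

# precompute("default_dependencies_versions.json",
#            "custom_dependencies_versions.json",
#            "precomputed_dependencies_versions.json")
```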