Built-in security subsystem

From my point of view, you are not describing anything specific. You are just posting an abstract wish list and stating that it should be doable without providing any specifics on how.

One core dev said that it was already tried, unsuccessfully, and that the core devs stopped trying. He also said that to change their minds you would need to provide an implementation.

Another person linked to a message from a core dev stating the very specific reasons he now believes it to be impossible. In that message, he says that the best way to secure Python is to run it in an external sandbox.

It is my understanding that using an external sandbox would fulfill your wish list, with the added benefit of not requiring work from the core dev team. Why should they work on a solved problem?

Similar disputes already took place in the previous topic. Well, if you think so, that's your right. I'm interested in discussing something else entirely.
In any case, it is surprising that the event monitoring subsystem appeared at all. And now it's clear why its immediate developers don't want to discuss it. After all, in the opinion of many, it should not exist at all.

So I'm in favor of it myself. And so far, I'm giving specific counterarguments to all specific arguments.
But they tell me, no, it's not enough, give us a ready-made implementation, and then maybe we'll discuss it. Maybe :slight_smile:

Why, and more importantly how, do you think the monitoring subsystem would help you? Be specific and I'm sure people are going to be more specific in their replies.

Neither in this thread nor in the old one were you specific enough to warrant any specific response.

This is the ideas section. Where did I say that something should be implemented?

No, it just said that past attempts were unsuccessful, regardless of their details. But the details are exactly what matters in the context of my specific ideas. Yes, they don't have an implementation yet, but look through the other topics in this subsection. People discuss unimplemented things there too.

But there are also my specific counterarguments on this list from ten years ago. I suggest we discuss them anyway :wink:

I didn't say anything like that. You came up with this statement yourself.
Moreover, in the original message, I also made a footnote about this, anticipating such attempts to attribute it to me.

Not at all; it was made because someone put in the work to make it.

No one has said this. It is entirely your own, incorrect, conclusion.

Okay, you're blurring the subject. And then they will say again that there are no specifics and the topic should be closed.
There are specific arguments above and my specific counterarguments with examples. If you really want to discuss, you will discuss them.

And there were even quite a few preliminary discussions about this :wink:
But yes, Guido was also involved in those discussions.
Unfortunately, those days are over.

So be it. There is a discussion above, where everyone can see who said what.
And I've already talked enough in general terms about nothing. There are discussion threads above where specific things and issues are discussed.
Anyone who really wants to discuss the issue will discuss it there, rather than going around in circles.

As someone who has followed many such "securing Python" initiatives (and even found exploitable issues in a couple), I'd like to ask you to ponder whether you find this idea completely feasible out of an abundance or a lack of information.

If the former, please share more details about how it could be made to work, because the community and core devs do not believe it to be sound so far. If the latter, please read not only about other such proposals (and especially how they fail), but also about how "creating readonly anything" in Python has failed in many ways.

Regarding specifics and immutable objects, please read There is a way to access an underlying mapping in MappingProxyType · Issue #88004 · python/cpython · GitHub. Is it enough for you to believe that not even MappingProxy is safe from Python's mutability shenanigans?
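For readers following along, here is a minimal sketch of what `MappingProxyType` does and does not promise. It is not the code from the linked issue, and the variable names are only illustrative. It shows that the proxy rejects writes made through it, but remains a live view of a dict that stays fully mutable for anyone who can reach it:

```python
from types import MappingProxyType

orig = {1: 2}
mp = MappingProxyType(orig)

# Writing through the proxy is rejected with a TypeError...
try:
    mp[1] = 3
    write_rejected = False
except TypeError:
    write_rejected = True

# ...but the proxy is only a view: whoever holds `orig` can still change it,
# and the change is immediately visible through the proxy.
orig[1] = 3
value_seen_through_proxy = mp[1]
```

So the read-only guarantee holds only as long as nothing else can get a reference to the underlying mapping.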

Regarding the audit hooks as a security feature, I'll quote the docs:

Note that audit hooks are primarily for collecting information about internal or otherwise unobservable actions, whether by Python or libraries written in Python. They are not suitable for implementing a "sandbox". In particular, malicious code can trivially disable or bypass hooks added using this function. At a minimum, any security-sensitive hooks must be added using the C API PySys_AddAuditHook() before initialising the runtime, and any modules allowing arbitrary memory modification (such as ctypes) should be completely removed or closely monitored.
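As a rough illustration of the "collecting information" use the docs describe, the sketch below registers a hook with `sys.addaudithook()` and records the paths passed to `open()`. The list name `opened` and the hook name are made up for this example; a real tool would presumably ship the events somewhere else.

```python
import os
import sys

opened = []  # illustrative in-memory log, not part of any real API

def record_open(event, args):
    # The hook is called for every audit event; keep only "open" events.
    # For "open", the first argument is the path that was requested.
    if event == "open":
        opened.append(args[0])

sys.addaudithook(record_open)

# Opening any file now goes through the hook first.
with open(os.devnull, "rb"):
    pass
```

Note that this only observes what happened. Nothing here stops code in the same interpreter from tampering with `opened` or with anything else the hook relies on.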

Also, some expected audit hooks are missing; finding similar gaps would offer opportunities to break your proposed security system:

So you'd be talking about plugging holes and disabling insecure features… which has been shown not to work in general, using a feature that is explicitly listed as not suitable for security.

Even currently maintained projects like RestrictedPython, which removes all sorts of dangerous constructs and features (leaving a very limited Python that isn't that useful), face their share of security escapes.

I hope this helps.

This is exactly the kind of discussion I came here for. Thanks.

Interestingly, this is probably still a bug in the Python implementation. But let's quickly improve the implementation a little. This is only a superficial improvement; much better implementations are possible.

from types import MappingProxyType

orig = {1: 2}
mp = MappingProxyType(orig)

class MP:
    # Wrapper that never hands the mappingproxy itself to other code
    def __getitem__(self, name):
        return mp[name]

    def __setitem__(self, name, value):
        raise AttributeError()

proxy = MP()

class X:
    def __eq__(self, other):
        # `other` is the MP wrapper here, so the write is rejected
        other[1] = 3

assert proxy[1] == 2
proxy == X() # AttributeError

Yes, there is a separate item specifically about C code execution in the original message.

Again, it seems to be about executing third-party C code. Its execution will have to be prohibited if security is needed only specifically for third-party python modules. We are talking about security only at the level of the python interpreter, which means that C code execution will have to be prohibited.

But can we still have an example where the bypass occurs only through python code?
It's another matter if we come to the conclusion that Python is thoroughly saturated with bugs and vulnerabilities. Is there really a lack of specific examples of problems here, so as not to give away how bad everything is with the internal implementation? :wink:
But so far I have seen only one example of a bug specifically when executing python code. And it turned out to be quite fixable.

Are you familiar with its insides?

And I hope that we will have a lot more discussions in the same style :slight_smile:
By the way, here's another thought. In fact, I'm acting as a Python defender here. And everyone else seems to want to prove how bad Python is and that it is teeming with problems. :slight_smile:
Although it would seem that it should have been exactly the opposite.

This was specifically about the stdlib, not third party.

Not so; everyone is trying to point out to you that Python is teeming with problems if you want to implement a sandbox. If you're not trying to create a sandbox, everything is fine and useful, and it's the useful things that cause problems for sandboxing attempts.

Maybe one could eventually create a sandbox within Python, but it's unlikely that it would still be usable as Python. I'm afraid the only way to prove this wrong is to try it, but I personally believe it would be a waste of your time.

I'm sure you can. Many have. The issue is, all of them gave up after studying the problem space a bit. As an example, I'd like to offer a bit of code showing trivial ways to bypass proxies that worked against other systems:

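class X:
    def __eq__(self, other: MP):
        # 1. Reach the "hidden" dict through the method's module globals
        other.__class__.__setitem__.__globals__["orig"][1] = 3
        try:
            other.__setitem__("a", "b")
        except AttributeError as e:
            # 2. The traceback exposes a frame, and the frame exposes the builtins
            e.__traceback__.tb_frame.f_builtins["__import__"]("os").system("echo Python finds a way")
            # 3. Classes are mutable, so the guard method can simply be replaced
            other.__class__.__setitem__ = lambda self, a, b: print("A way")
            other[1] = "not important"
        try:
            other[2]
        except KeyError as e:
            # 4. Walk back to the caller's frame, grab `mp`, and ask the GC what it wraps
            mp = e.__traceback__.tb_frame.f_back.f_locals["mp"]
            import gc
            refs = gc.get_referents(mp)
            refs[0][1] = 5

assert proxy[1] == 2
proxy == X()
print(proxy[1])
# assert proxy[1] == 2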
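
To isolate just the last trick in a form that runs on its own, here is a small sketch. It assumes CPython, where `gc.get_referents()` reports the mapping wrapped by a `mappingproxy`; that is an implementation detail rather than a documented guarantee, and the function and variable names are made up for illustration.

```python
import gc
from types import MappingProxyType

def make_readonly_view():
    # The dict never leaves this function; only the proxy is returned.
    hidden = {1: 2}
    return MappingProxyType(hidden)

view = make_readonly_view()

# No reference to `hidden` was ever handed out, yet the garbage collector's
# introspection API returns the objects the proxy refers to.
underlying = gc.get_referents(view)[0]
underlying[1] = 5
```

After this, `view[1]` reads back as 5, even though the caller was only ever given the "read-only" view.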

Yes, and there is a separate item about frames in the original message.
I'm sure that it's possible to find an implementation where the frames will also be immutable.
But this requires the help of developers who probably understand the possible variants for such an implementation. Perhaps such variants have even already been implemented in someone's projects.

Why do you KEEP ON assuming that someone else will do all the work for you? Get out there and actually write it if you think it's possible. Otherwise, stop repeatedly posting that you think it ought to be possible, and expecting other people to do the work of implementing it.

So do you think it's normal that in the example above (There is a way to access an underlying mapping in MappingProxyType · Issue #88004 · python/cpython · GitHub) we can go and change the value inside MappingProxy?
This is clearly an internal implementation bug.

I also have doubts about the possibility. Otherwise, I wouldnā€™t be asking questions.
But every time there are examples of problems at the python level itself, and so far I am constantly finding possible solutions to them.
And here there are two options: either the problems are still solvable or the other real problems are carefully hidden :wink:

Why is there always an equal sign between the discussion of possible implementation variants and the implementation itself?
Has no implementation inside python ever been accompanied by lengthy discussions?
And if at least some were, stop accusing me of things I'm not proposing.
If you don't want to discuss it, don't discuss it. But stop bullying people just for ideas and their discussion.
Half of the comments in this topic could also be deleted, because they relate to things I did not suggest or to some third-party matters. If I had reacted to every such meaningless remark, this topic would already have been closed.
But now there are people who understand the problems in essence and discuss them in essence. And I hope these are not the last such experts here.

This just fundamentally doesn't make sense to exist at the interpreter level for a general-purpose programming and scripting language. One person's intentional modification is another's vulnerability. I can use Python to rewrite firewall rules. That's not inherently malicious, and there is no way to detect whether the intent is malicious. I could remove firewall configuration entirely. That might be fully intended as part of deployment via ansible to switch to new tools or new versions of the same tools that handle their configuration differently.

You can already sandbox python from outside of python rather successfully with tools actually built for this, from anything to a full sandbox, to capability restrictions placed on the process/service, to process namespacing and resource limitations, and blends of those.
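
As one small illustration of the "resource limitations" part of that list, here is a sketch that runs code in a separate interpreter with a kernel-enforced CPU-time cap. It assumes a POSIX system (the `resource` module and `preexec_fn` are not available on Windows), the helper name `run_limited` is hypothetical, and on its own this is nowhere near a full sandbox: it restricts CPU time and nothing else.

```python
import resource
import subprocess
import sys

def run_limited(code, cpu_seconds=1, wall_seconds=30):
    """Run `code` in a child interpreter with a CPU-time cap (POSIX only)."""
    def apply_limits():
        # Runs in the child just before exec; the kernel enforces the cap,
        # so nothing the child's Python code does can lift it.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    return subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode
        capture_output=True,
        text=True,
        timeout=wall_seconds,                # wall-clock backstop
        preexec_fn=apply_limits,
    )

ok = run_limited("print(2 + 2)")
runaway = run_limited("while True: pass")  # killed by the kernel, not by Python
```

The point of the design is where the enforcement lives: the limit is applied by the operating system to the whole process, so it does not depend on any Python-level object that the untrusted code could reach and modify.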

You can also ensure Python isn't available for use by users that shouldn't be writing or executing arbitrary code, but restricting what users can do is still better handled at another layer here. If a user launches a Python process and doesn't have permission to modify the previously mentioned firewall rules, Python can't do it for them either.

You're getting a lot of pushback here because you are continuously suggesting what people perceive as the wrong tool for the job, without any demonstration that decades of diverse expert opinion on the matter, all reaching the same conclusion, are somehow wrong.

By the same logic, there is no point in implementing event auditing at the interpreter level for a general-purpose programming and scripting language, because the behavior of the code can be analyzed in exactly the same way through independent third-party tools.
But it exists, contrary to the opinion of many. Its implementation can no longer be removed, and nobody can say that it is impossible. There were simply people who believed, discussed, and then implemented.
And in addition it also appeared

Which seems to be a very promising tool for some of the checks described above.

Therefore, such a system must be configurable and possible to deactivate, the way event auditing is implemented.

I don't mind criticism at all. But it's one thing when criticism is accompanied by concrete examples, and I give specific counterexamples.
And it's another thing when criticism is just for the sake of criticism. Who exactly is this newcomer and what ideas does he even allow himself :slight_smile:

Above is an article by Victor Stinner. I took it apart. And has anyone commented on my analysis? Instead, they commented: now go and implement it, and then come discuss it. I'm not offended, I understand that there are those here who are ready to just argue for the sake of arguing. Therefore, I will patiently wait for other comments with Python specifics and discuss them.

Should slow mode be activated? The OP has made 17/38 (45%) posts in this topic.

I'm only going to add a few things and then bow out. I don't see this going anywhere productive, but I do hope that you can come to appreciate why.

This is one of the first things you've said in this thread I've agreed with, but not in the way you probably think. I don't think the sys audit functions should exist because they exist at a level where they can't effectively audit. By being in the interpreter, they are subject to modification by the interpreter itself. Even if these events are useful they should not be presented as preventative security tools. We're stuck with them as is for backwards compatibility reasons, and it seems that people are confused by what they are actually capable of.
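
A minimal sketch of that point, assuming a hook written in Python whose policy lives in an ordinary set. The event name `demo.dangerous_action` and the names `blocked`, `policy_hook` and `attempt` are all hypothetical, chosen so the example does nothing harmful:

```python
import sys

blocked = {"demo.dangerous_action"}  # hypothetical policy: events to refuse

def policy_hook(event, args):
    # An exception raised by a hook propagates out of the audited call.
    if event in blocked:
        raise PermissionError(f"audit policy blocks {event}")

sys.addaudithook(policy_hook)

def attempt():
    try:
        sys.audit("demo.dangerous_action")  # stand-in for a sensitive operation
        return "allowed"
    except PermissionError:
        return "blocked"

before = attempt()

# Code running in the same interpreter can reach the policy's state.
# The hook itself is still installed, but it no longer blocks anything.
blocked.clear()

after = attempt()
```

Here `before` is "blocked" and `after` is "allowed": the hook cannot be removed, but everything it consults is an ordinary mutable Python object living in the same process as the code it is supposed to restrain.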

How is this better than using existing tools that are not subject to issues of existing at the same process scope as the things they are intended to restrict? You'll find that the ability to attach a debugger means that one Python process can disable another Python process's protections if these are controlled by the same code meant to also run untrusted code.

No, you did not take it apart. Many people have commented on various aspects of it, but many others have probably avoided engaging with it until now because so much of it doesn't seem to acknowledge that better tools exist and that there are reasons to do this at a layer above the interpreter. You have provided nothing that shows a benefit to placing it in the interpreter itself.
