PEP 768 – Safe external debugger interface for CPython

That’s an excellent idea, I’ll go and add that to PyPy, sending the code as an argument. Any ideas for the name of the event? remote_exec maybe?

I’d pick one of the names already being used, so perhaps debugger_script or debugger_pending_call would make sense. (Or PyPy can add pypy.whatever if your names don’t look like those in the PEP.)

A great PEP for devtool developers :slight_smile:

pdb needs some polishing for this feature of course, and we probably will have a few difficult issues:

  • Completion won’t work
  • How can we sync between the two processes? Reading as much as possible from the FIFO might not be reliable - what if I set a breakpoint that isn't hit until 5 seconds later? Can we solve this by running both processes in sync mode (single-threaded)?

However, that’s future discussion and probably not directly related to this PEP.

I do have a question about the PEP. Since the new Python interface is part of the PEP, how do we know which thread picks up the injected code? For now, pdb only stops in the thread where it was started, so in a multi-threaded program we might not be able to debug the thread we are interested in. Should this be resolved in this PEP or in a future improvement? It's not only about pdb; other tools have a similar issue. From the PEP it seems like it's possible at the C level, but not from Python (it's probably also difficult to specify).


Indeed. The scope of the PEP is to set up the foundation first, and then we can sync on how to adapt pdb to offer a great experience in an issue, as that won't require a full-fledged PEP.

I believe the new Python interface is part of the PEP, how do we know which thread picks up the injected code?

Ah, I think this was an oversight in the document. We have access to all the thread states, and we can select the one in which we want to install the code. You could pass down the TID or some other form of thread ID. I think we want to offer the possibility of specifying a particular thread, but the default should be "whatever thread runs first" to guarantee faster attaching.

We will discuss this briefly and add it in the next PEP update. :+1:

Thank you for the explanation, that sounds very reasonable. It might not be easy for users to know the thread IDs of a process (unlike the PID, which is easily accessible). It would probably be helpful to at least have an option to trigger on the main thread. Anyway, I have no doubt that you'll come up with a good solution. Looking forward to it!
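As an aside: discovering thread IDs from *inside* the target process is easy; the hard part for a user is discovering them from outside. A rough illustration of the in-process side (the function name here is mine, not from the PEP):

```python
import threading


def list_native_thread_ids():
    """Map each Python thread's name to its OS-level thread ID (TID).

    On Linux these are the same IDs that appear under /proc/<pid>/task,
    which is what an external tool would need in order to target a
    specific thread.
    """
    return {t.name: t.native_id for t in threading.enumerate()}


if __name__ == "__main__":
    # Start a parked worker so there is more than one thread to list.
    worker = threading.Thread(
        target=threading.Event().wait, name="worker", daemon=True
    )
    worker.start()
    print(list_native_thread_ids())  # e.g. {'MainThread': 1234, 'worker': 1235}
```

An external debugger UI could surface such a list (or simply offer "main thread") so users don't have to dig TIDs out of `/proc` by hand.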


Hi, thanks for the great PEP. I think this will make Python debuggers more useful.

But I have some personal thoughts.

  1. Write control information:
  • Write a string of Python code to be executed into the debugger_script field in _PyRemoteDebuggerSupport.
  • Set debugger_pending_call flag in _PyRemoteDebuggerSupport
  • Set _PY_EVAL_PLEASE_STOP_BIT in the eval_breaker field

If I’m correct, we will need some special syscalls (like ptrace on Linux) when we inject the code.

I think this will limit some usage, especially in container environments where the container is not privileged (it needs extra system configuration).

I’m not sure whether we could use a more common mechanism, like a socket API, to reach the injection target.

Just one other thought:

I’m not sure whether we need to discuss if the feature should be enabled by default or not.

I think this PEP will make it easier to do something bad.

In more detail:

The new PEP introduces a distinctive process characteristic through the PyRuntime structure location mechanism. Combined with existing serialization vulnerabilities (like unsafe loading in PyTorch), this could potentially enable concerning attack scenarios:

An attacker could distribute a malicious model that, when loaded, would:

  1. Enumerate processes on the target system
  2. Identify Python processes using the new PyRuntime signature
  3. Leverage remote code execution capabilities to extract sensitive data

This significantly lowers the barrier for executing malicious operations without process interruption. The deterministic nature of the PyRuntime location mechanism, while beneficial for debugging and monitoring, could inadvertently simplify process targeting and manipulation.

The implications for security infrastructure and defensive measures should be carefully considered before implementing this change.

In my view, this PEP’s feature should be disabled by default to avoid exposing processes this way.


@Zheaoli I think you have misunderstood several aspects of the proposal.

If I’m correct, we will need some special syscall like ptrace in Linux when we inject the code.

No, the proposal is NOT injecting code into the remote process by executing assembly. This is covered in the document. The whole point of this proposal is to avoid having to inject assembly like normal debuggers do.

The new PEP introduces a distinctive process characteristic through the PyRuntime structure location mechanism.

The PEP is not adding that; it is already there.

  1. Leverage remote code execution capabilities to extract sensitive data

This step requires privileges that, if the attacker already has them, mean you have already lost, because they can extract whatever they want. The PEP is not changing this in any way or form and is not altering the security profile. This is covered in the text.

The deterministic nature of the PyRuntime location mechanism, while beneficial for debugging and monitoring, could inadvertently simplify process targeting and manipulation.

Again, PyRuntime is not new; it is not something we are adding in the PEP.

In my thought, I think this PEP should be disabled by default to void the expose process feature

It will not be disabled by default, because the idea is that users can benefit from it when they need to debug or profile their code. I think that, unfortunately, you have misunderstood the proposal in several ways, including the security profile and what the PEP actually proposes.


@pablogsal Thanks for the reply. I think we both have some misunderstandings here.

I have re-read the PEP. I will explain what I’m concerned about (all of this assumes a Linux environment).

On Linux, we use the process_vm_readv() and process_vm_writev() syscalls to write the debugger_script field. This needs the CAP_SYS_PTRACE capability or root privileges.

Ideally, we would run each process with different privileges based on the functionality we want. But in the most common circumstances, people run a lot of processes as root. This is bad, but it is widely done.

This PEP will lower the attacker’s cost. They don’t need to care about the memory details anymore; they just need to attach to other processes (so the losses may expand).

So I’m not sure we should enable it by default. I would prefer to disable it by default, or to support a compile-time flag to disable it.

We will likely add a compile-time (and maybe runtime) flag or environment variable to disable it, and this is subject to the auditing interface as well.

I think this is an excellent proposal and should get rid of the ongoing friction between external tools, which want internals to be exposed, and our optimization efforts, which want internals to be hidden.

I do have a few comments/concerns though.

This PEP could be subtitled “Making remote code execution easier for fun and profit”. Making it easier to attach tools will make attacks easier. It may be outside the intended scope of the PEP to consider security, but I think it needs to be discussed in the PEP.

Why load the code from a file, instead of reading it directly from the buffer?
If you want to execute from a file something like exec(open("pathname").read()) should work.
For small commands in an interactive debugger, having to write to and then read from a file may make things less responsive.

The reference implementation does not AFAICT set a flag in the eval breaker, but relies on it being triggered by some other event. I assume this is an oversight.

Each thread uses between 2 MB (I think) on Windows and 8 MB (on Linux) for the C stack. The Python stack takes another multiple of 64 KB. An extra 4 KB won’t matter.

In fact, if the buffer is a single OS page it won’t use any physical memory until written to.

Another, slower approach would be to allocate the memory on demand.
This would require the tool to make two requests initially: the first to request the buffer, the second to run the code.

I think you might be too focused on the old exploitation mindset where the assumption was that the only things that matter are the ability to exploit (not the ease or reliability) and the immediate benefit (i.e. if it doesn’t directly grant escalation of privilege, it doesn’t count). This isn’t how we approach security these days. Both @Zheaoli and @markshannon have valid points, and probably I could have pushed the point harder when I brought up the compile-time option.

The key thing this interface exposes is allowing an attacker to hide very easily. Assuming they have found a way to execute some amount of code on your system, unrelated to this interface[1], their goal is to persist by launching another process that won’t be detected by any defences. That process is already commonly Python, which is why some of us are investing in ways to ensure that an installed Python can only run pre-approved code, and not whatever an attacker has smuggled in.

With an always-on-by-default code injection interface, an attacker can now use a smaller amount of code and more reliably persist inside of any running Python process.[2] While it doesn’t obtain them additional privileges, it does now put them in a position where they can run all sorts of attacks without having to reuse the original exploit. And often scraping the environment and exfiltrating secrets is enough - Python makes this embarrassingly easy.

All of this is to say that when people want the ability to disable this, they have very legitimate concerns. They’re not calling the interface itself a vulnerability, but it does make things so much easier for an attacker that disabling it would be preferred[3]. Having it disabled by default and enabled by an -X option or environment variable brings it to a safe enough level - if an attacker can launch Python with their own options, they can already abuse us. But we don’t have to freely offer them the ability to inject into a running process.


  1. e.g. maybe they’ve uploaded a malicious PNG to a system that uses a vulnerable parser. ↩︎

  2. And smaller is relevant - if the PNG is limited to 5MB, you may not be able to get that much code into your hypothetical buffer overrun. ↩︎

  3. It’s an instance of “you don’t have to outrun the bear, only the other people running from the bear”. ↩︎


Sure, you can say it won’t matter, but so will the next person, and the next after that. Eventually, it does matter. Some people count bytes - just this week I had someone asking why their memory usage had “jumped” from 110MB to 120MB after switching a library.

We can also bring the C stack size right down by fixing the recursion limit :wink: It’s only as big as it is because of those bugs. And I would love to see the ability to configure stack size when creating a new thread/interpreter (but I’m not pushing that yet because of all the other work going on around initialization).

(Edit: Specifically, stack size on Windows is currently 3MB because of Stack overflow collecting PGO data on Windows · Issue #113655 · python/cpython · GitHub, and as we expect people to create more threads if free-threading takes off then we’ll want to make it smaller)

The key thing this interface exposes is allowing an attacker to hide very easily. Assuming they have found a way to execute some amount of code on your system, unrelated to this interface, their goal is to persist by launching another process that won’t be detected by any defences. That process is already commonly Python, which is why some of us are investing in ways to ensure that an installed Python can only run pre-approved code, and not whatever an attacker has smuggled in.

Yes, but also no: the code execution will be gated by an audit interface. I could argue the other way: if an attacker manages to get code into the system this way, it is the worst case for them, because it will be super noisy. The audit system will not only surface it, it can react to it and stop its execution. Any other exploit will be silent, giving the interpreter no chance to react; this one does.
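To make the “noisy” point concrete, here is a sketch of what a defensive hook could look like. The event name `cpython.remote_debugger_script` is my assumption about what the PEP will specify, so the snippet simulates the interpreter’s side with a manual `sys.audit()` call:

```python
import sys

blocked = []


def guard(event, args):
    # Hypothetical event name: if the interpreter raises this before
    # running injected code, a hook can both log and veto the execution.
    if event == "cpython.remote_debugger_script":
        blocked.append(args[0])
        raise RuntimeError(f"remote debugger script rejected: {args[0]}")


sys.addaudithook(guard)

try:
    # Stand-in for the interpreter raising the event on remote attach.
    sys.audit("cpython.remote_debugger_script", "/tmp/injected.py")
except RuntimeError as exc:
    print(exc)

print(blocked)  # the rejected script path was recorded
```

Raising from an audit hook aborts the audited operation, which is the “react to it, stopping its execution” behaviour described above.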

All of this is to say that when people want the ability to disable this, they have very legitimate concerns. They’re not calling the interface itself a vulnerability, but it does make things so much easier for an attacker that disabling it would be preferred.

I understand your point and I agree that being able to deactivate it is a must. I am happy to add many ways to deactivate it, at different times and in different ways, including a compile-time option, environment variables, and more.

But I respectfully disagree and I must insist that this must be activated by default so users can get the benefit. The key points are:

  • If you can attach using this, you can also attach using gdb or something else and force the execution of whatever you want. This is not making anything more insecure, and it requires exactly the same privileges as any other debugger.

  • For threats inside the process, I can argue that if you can write to arbitrary places in memory, there are already many other ways to make the process do whatever you want: from changing the function evaluator pointer, to redirecting the allocator, to the signal handler array, … What’s worse: any other way of forcing the process to do what you want will be invisible, but this procedure will be gated by the audit system, so it is very noisy.

  • Not having this on by default will greatly reduce the usefulness of most tools and will greatly hurt the development experience, as things will not work by default. Users already expect their VS Code or PyCharm debugger to attach without special modes.

  • Adding activation flags is not always easy, as users often do not control the way the process is executed.


The PEP has a whole section dedicated to security :slightly_smiling_face:


I think you are confused: the PEP proposes loading it from the buffer, and the idea of loading it from a file is in the rejected-ideas section.

Yeah apologies, the reference implementation is outdated, I will update it soon :+1:

Meanwhile, what the PEP says is the authoritative version.

Apparently the reference implementation is not to be trusted :wink:

Comparing 60ff67d010078eca15a74b1429caf779ac4f9c74…remote_pdb · pablogsal/cpython