I am concerned about LLM code in Python

I don’t even know what happened, but the first comment in the issue is from a bot, while the other comments from the same GitHub account seemed to be written by a human. If you look at the other repo, that same GitHub account is mostly just a bot that interacts with other bots in a strange AI fork of sympy where everyone is a bot. (These people are literally trying to benchmark replacing me and the other sympy contributors with AI, in a public repo where their bots tag me in the process.)

That kind of confusion should not be possible though. I should be able to see the difference between the bot and the person controlling the bot. The people in that company have failed to understand some basic aspects of human communication without intending to be malicious. This is the kind of situation where rules are needed and could be enforced but right now there are no rules.

Part of “the problem”, according to them. That the bot used the human’s GitHub credentials was also unintended (and, according to me, also careless).

That matches what the (presumably, and most likely) human said they were trying to do: investigate how an all-bot “team”, playing at different roles, would do on a fork of an actual repository.

After they discovered the cause, the human did, as they said they’d do, open an issue against GitHub to try to make it less likely to happen again:

If you look at that, it gives links to 5 other unintended issues opened by the bots against upstream repositories in the same timeframe (none of those against sympy, though).

I can’t infer intent. Investigating how bots interact with each other is a legitimate thing for researchers to attempt. Or, if you think it was targeted at sympy and/or you specifically, take comfort from the fact that they’ve so far failed to replace any of the sympy team :wink:

As before, I believe your original complaint (a garbage issue report opened against sympy) fits comfortably within the “no spamming” rules. Or, more specifically, the rule against

  • inauthentic interactions, such as fake accounts and automated inauthentic activity;

I don’t see anything there now about AI use. Best I can tell from searching for relevant discussions, they are not opposed to AI agents using GitHub’s services, but do act on complaints about violations of their acceptable use policies, ranging from a pattern of low-quality PRs to “excessive” bandwidth use.

There is also this space if anyone wants to research digital text forensics and stylometry:

I am still new to reading about it, so I am not yet sure whether there is anything practical to set up.
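For anyone curious what stylometry looks like in practice, here is a toy sketch of one classic signal: comparing character n-gram frequency profiles with cosine similarity. All of the names below are my own inventions for illustration; this is not taken from any particular forensics library, and real attribution work uses far richer features.

```python
from collections import Counter
import math

def ngram_profile(text, n=3):
    """Relative frequencies of character n-grams in a text (whitespace-normalized)."""
    text = " ".join(text.lower().split())
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values())
    return {g: c / total for g, c in grams.items()}

def cosine_similarity(p, q):
    """Cosine similarity between two sparse frequency profiles (1.0 = identical)."""
    dot = sum(p.get(k, 0.0) * q.get(k, 0.0) for k in set(p) | set(q))
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0
```

Two texts by the same author (or the same bot) tend to score closer to each other than to unrelated material, which is the basic intuition behind authorship-attribution tooling.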

Hence the suggestion to make a clear rule to indicate you want those humans to stop. Specifically those who think it may be allowed or even encouraged.

(Sorry if I’m misunderstanding. But I feel like this human problem is why I made this thread, unless we’re missing each other’s points here.)

No policy is 100% enforceable; I mentioned this above. But it shifts moral responsibility and will often reduce misbehavior. Otherwise why even have, e.g., any code of conduct? Other FOSS projects, such as QEMU, felt it was worth trying.

Sorry, perhaps I might bow out here since I’m repeating myself. I hope some of this was helpful.

I’m not sure whether this means banning the use of LLM agents as co-authors or banning LLM usage altogether in code submissions. If it only bans co-authorship, then it creates a Cobra Effect, as people will simply use LLMs offline and submit the code as their own. Nothing is gained, and it only makes LLM usage harder to detect.

What should be added to this policy?

I think it’s a great start! I’d add text explaining the special dangers of AI assistants violating copyrights and licensing terms via “adapting” existing code bases, whether via literal text duplication or wholesale rewriting. We shouldn’t assume all contributors are already aware of that.

Except I don’t want them to stop using AI to help them write code. I want responsible use. AI is a new tool, but brings new potential benefits as well as new potential dangers.

We already have a policy, which you’ve been pointed at several times. I would absolutely add text pointing out the dangers of plagiarism, which I don’t think our policy currently mentions. Probably because the author(s) thought it was “too obvious” to call out explicitly. But contributors come from many backgrounds and levels of experience, and “explicit is better than implicit”.

The policy already points out

 As with using any tool, the resulting contribution is the responsibility of the contributor

Nobody gets a free pass. But we’re focused on outcomes, not on the processes by which outcomes are obtained. I think that’s the right way to go. Of course you’re free to disagree.

I’m grateful you raised the issue(s)! They’re real and important. I happen to be with those who believe “so ban it!” not only wouldn’t help, it would hurt, by driving those who do use bots to work at disguising the origins.

I would instead encourage people who use bots in their submitted work to be very open and up-front about it. That won’t happen if it’s banned, or even stigmatized. A comment saying “The following function was mostly written by Claude AI” would be a high-value clue for more experienced reviewers to focus on potential license violations.

Without such up-front disclosure, who would notice? Especially as AI improves, telltale signs of “AI slop” will diminish. My own favorite bot has learned my style so well that I have a hard time looking at my own private code projects and remembering which parts were prototypes written by the bot and which by me.

I haven’t yet contributed such code to any public repositories (well, apart from my own), but doing so is inevitable. When the time comes, I’ll give full credit to the bot(s) that materially helped with various parts. If a project rejects the code based on just that, well, that’s their decision to live with. I’ll move on to other projects with more nuanced policies.

I propose to add something like this to our AI policy. While I wrote the original version of this, I asked Copilot to review it, and got (to my eyes) very valuable feedback to the effect that I also have “too much experience” to make the really scary issues clear to newer programmers. So I gratefully accepted most of the bot’s suggested rewordings, and rejected some others. Any errors that remain are entirely mine. I am, deliberately, trying to change our policies to actively encourage full disclosure.

Copyright and Responsible Use of AI Tools

AI assistants can be helpful when drafting code or documentation, but they also introduce specific risks that contributors must consider. These tools generate output based on patterns learned from large training datasets, and may produce material that is derived from copyrighted or licensed sources. Such derivation can occur even when the resulting text or code is extensively rewritten or does not resemble the original in form or wording.

Because the PSF can only accept contributions that the contributor has the legal right to license, all submitted material must be free of copyright or licensing conflicts, including material produced with the assistance of AI tools. Contributors are responsible for ensuring that their submissions meet this requirement.

To support effective review, please clearly mark any parts of a contribution where an AI tool materially influenced the content. A brief comment is sufficient.

This allows reviewers to apply additional scrutiny where it is most needed and helps reduce the risk of unintentionally incorporating derivative material.

AI assistance is permitted, but it requires careful and transparent use. Clear disclosure helps maintain the integrity of Python’s codebase and ensures that contributions can be safely accepted under the project’s licensing terms.
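For concreteness, a disclosure marker under such a policy might be as simple as a comment at the point of influence. The helper below is a made-up illustration (not real CPython code), and the comment wording is just one possible style; the proposed text above only asks for “a brief comment”, not a fixed format:

```python
import posixpath

# AI-DISCLOSURE: the body of this function was largely drafted by an AI
# assistant and then reviewed by hand. (Hypothetical disclosure style.)
def normalize_path(path: str) -> str:
    """Collapse redundant separators and up-level references in a POSIX path."""
    return posixpath.normpath(path)
```

A reviewer seeing such a marker knows exactly which lines deserve extra scrutiny for potential license or copyright problems.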

I like to do things such as telling the AI to write module docstrings in the style of a README and to add a --doc argument that prints them (for Python and bash scripts). Later, when creating or modifying code in the same space, I give a new or alternate AI session just the --doc output of the scripts rather than their full contents; I can then ask more than one AI questions from a known base and compare/contrast the new answers.
You have to read and compare the answers from two AIs, which keeps you involved, and when you are delving into new areas I find you can learn better (and catch some hallucinations) by asking about the differences.
It is a longer process, and not always possible if only one AI is approved for general use in the company, but while the approved AI tool has deeper access and generates the code, the other AI can be used as a glorified search engine with generic, less revealing questions.
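As a concrete illustration, a minimal version of that --doc pattern might look like the sketch below. The module name, docstring content, and structure are invented for the example; the point is only that --doc emits the README-style summary without exposing the implementation:

```python
"""demo_tool.py -- hypothetical script illustrating the README-style docstring idea.

Usage
-----
python demo_tool.py [--doc]

Running with --doc prints this docstring, so a fresh AI session can be
given just this interface summary instead of the full source.
"""
import sys

def main(argv=None):
    # Default to the real command-line arguments when none are passed in.
    argv = sys.argv[1:] if argv is None else argv
    if "--doc" in argv:
        print(__doc__)   # README-style summary only, not the implementation
        return 0
    # ... the script's real work would go here ...
    return 0

# A typical CLI entry point would then be:
#     if __name__ == "__main__":
#         raise SystemExit(main())
```

Running `python demo_tool.py --doc` for each script in a project gives a compact "map" that fits in a fresh AI session's context.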

You are too brave. Sometimes when I use AI to generate documents, I get docs that describe the function usage completely incorrectly and would mislead users.

Hardly brave :slightly_smiling_face: I’ve used it multiple times and it seems to work. Even when using one AI session, it can slow down and carry too much baggage as it is used to explore alternatives. With the READMEs I can start a new AI session from just the READMEs of the modules and scripts and work from that reduced base, but with some context.

What’s next for this proposal to be accepted?

Nothing, because you can’t trust PR authors to disclose use of AI tools. This proposal is wishful thinking.

At this point you have to assume every PR is made with AI tools. Anytime somebody discloses usage by a “co-authored-by” field or otherwise, they are extending courtesy towards the PR reviewers. But it’s pretty easy to not do that and pretty naive for PR reviewers to assume a lack of disclosure equals lack of tool usage.

The only way to ensure lack of AI usage at this point is to close public pull requests and vet the core developer team against having AI tools enabled during their work. This is unreasonable and we won’t be doing that.

To be clear, I’m just one guy on the team, and emphatically not on the Steering Council. They might see it in a different way. But as somebody who does do review stuff all the time, I need to think about any PR reviewed these days as possibly adversarial, and this extends way back to the story of the xz backdoor, so it’s not specific to LLM usage.

Sorry, maybe I was unclear.

I didn’t mean the proposal to ban AI, which seems against the tide here, but @tim.one’s proposal to encourage more AI disclosure and warning to other developers.

Those marks will then be useful for checking again in the future whether the allowance should stand or a ban will be needed, for example if there are copyright-infringement cases or simply more bad-quality code correlated with AI usage.

I think Lukasz was responding to this most recent proposal.

I’m not privy to all of the details of how the current policy was formed, but I would assume, until proven otherwise, that it was a communal process balancing multiple voices and concerns from within the core dev group. At the very least, the community members responsible for enforcing policies must have been involved.

Trying to “pitch” a new policy without establishing consensus first that some reform is warranted is therefore very unlikely to be successful.

The mere existence of a policy mentioning Generative AI, without directly addressing its implications, creates ambiguity. Treating it as just another tool either suggests that guidance belongs elsewhere, such as contribution guidelines, or implies that its use is discouraged without explanation.

If the concern is about issues like low-quality tests or code quality, it should be stated explicitly in the contribution guidelines, not in a Generative AI policy. Otherwise, this approach risks sounding as arbitrary as discouraging digital keyboards simply because they allow faster code production. Faster output naturally leads to higher volume, and with it, more errors.

We should also expect an order-of-magnitude increase in contributions, which will likely require using AI-assisted tools to help review them.

Perhaps both/the latter?

To me, reducing plagiarized code by discouraging it seems like an outcome.

Sorry, I don’t want to repeat myself so I’ll just point here.

I think this also raises the question of who is allowed to contribute, if there’s so little trust.

I guess that’s fair. Sorry that this has been going on for so long.

Not proposing trust. Trying to foster a community culture that values disclosure - moral suasion and social norms.

And it’s not in the slightest aimed at “bad actors”. It’s aimed at less experienced contributors of good will who may very well not be aware of the nuanced issues. We have contributors from many backgrounds and experience levels. I related a case before, predating bots, of a contributor who copied code from glibc - and plainly said so in the patch! Not malicious; they simply didn’t know any better. BTW, they went on to become a core dev.

Well, I’m not that naïve :wink: And don’t expect that any other core devs are either.

Possibly, but not that I’m aware of.

How could consensus possibly be achieved without floating ideas in public first? That’s what the OP did with their initial post here, and that’s what I did with my “I propose to add” post, giving something very explicit to discuss.

If it wasn’t clear, I do not propose to change any of the current policy, but to elaborate (e.g., the current text doesn’t, as I recall, even mention the AI-amplified dangers of unintended copyright or license violations), and encourage (not mandate) disclosure of AI tools.

No need to be sorry! It’s not my intent to stop conversation. But I also feel like we should be pretty clear with one another about what kinds of outcomes we expect.

I’m not sure that anyone is convincing anyone at this stage. :person_shrugging:

I don’t want to get into splitting hairs about intent here; only that it didn’t scan that way to me as a naive reader of the text.

Are we making any progress in this thread in establishing consensus that something in the CPython policy needs to change? I don’t see that. I see a relatively small pool of participants.

I get that sometimes putting a concrete proposal out there can break through deadlocks and help us progress. But given that we haven’t established broad agreement on the prereq of “we should change the policy to address X”, I think you can understand why I’d read a post that starts “I propose to add something like this…” as jumping ahead.

No offense intended, but I rarely engage with meta-discussions (discussions about whether something “should be” discussed). In my experience they’re almost never productive, and often lead to flame wars.

Low participation is par for the course regardless of topic - unless they veer off into meta-land :wink:

On the face of it, it’s clear to me that a policy that doesn’t even mention copyright/licensing issues in the context of AI use needs elaboration. The dangers are serious and real, AI use has certainly made violations more likely, and it’s a very hot industry-wide topic now.

And that’s all the “meta” I have in me to say.