I am concerned about LLM code in Python

How does a policy suggesting LLM use is fine in principle not worsen this significantly? (Not intended as a rhetorical question.)

I don’t know of anyone who works entirely on their own. Every experienced developer has seen countless thousands of lines of other people’s code, read countless articles and papers in journals and on the web, and so on.

I’m not a lawyer, but my understanding is that “intent” has nothing to do with establishing guilt in copyright violation cases, although it may weigh a lot in determining penalties if guilt is established.

Plagiarism is plagiarism regardless of intent or source, even if not consciously pursued or recognized as such by the infringer. There are, e.g., novel code patterns I’ve internalized from decades of experience with some areas of subtle numeric code. I wouldn’t be surprised if someone claimed my use of one violated someone else’s copyright. Hasn’t happened yet, and I don’t expect it to, but it is a risk. One of the tradeoffs I weigh.

How I learned them is lost to time, and I view them more from a patent perspective (you can’t patent something obvious to those with ordinary skill in the art). But that’s not how copyright claims are viewed. Unwittingly regurgitating forms of expression one has seen is as old as humanity, but doesn’t typically result in copyright infringement claims.

AI bots have “read” more than any human, and typically don’t give any hint about where they derived their output from. You can ask a bot to do that - sometimes it can answer, but more often I get back “sorry, I can’t - it’s from my aggregated training data”. In the latter case, I dig deeper myself, or more often just reject the suggestions without further ado.

However, I don’t often ask them to perform programming tasks. “Different strokes for different folks” is fine by me; “one size fits all” not so much.

And no, no bot played any role in this reply :wink:

It’s the same thing, both humans and LLMs have a similar working mechanism: they learn and then produce. Some humans might reproduce work letter by letter, while others take their own approach, which could still resemble someone else’s work from before. See, it’s literally the same thing. Should we ban thinking too? (No pun intended.)

You plagiarize code you previously saw letter by letter, by accident?

If your assumption here still is, despite all the sources above, that LLMs somehow work like humans or have a remotely similar unintentional(!) plagiarism risk, then I don’t think I can say anything more of use.

I don’t think any of this matches the studies, but perhaps I’m the one reading it wrong.

Possible but highly unlikely. But that’s not the end of it. Copyright violations don’t require literal duplication.

There are many studies, and they don’t all reach the same conclusions, aren’t all of the same quality, and generally don’t meet the “hard science” criteria of reproducibility by independent researchers. In the Python world, “the chardet case” is highly visible now (any search engine will fill in the gaps - no bot required, although using one would help :wink: ). Nothing about literal duplication, everything about massive rewriting. Hasn’t been to a court yet, but the massive rewriting won’t (IMO) offer any legal protection against copyright violation claims.

But there aren’t (IMO) dead-obvious answers here to be had. Copyright law is open to different interpretations regardless, and didn’t anticipate the novel issues raised by AI authorship.

Just in case these went missing:

https://dl.acm.org/doi/10.1145/3543507.3583199

How does a policy suggesting LLM use is fine in principle not worsen this significantly? (Not intended as a rhetorical question.)

The same way that a policy suggesting any other kind of contributing is fine in principle as long as the contributor understands the legalities they’re agreeing to.

Sure some are likely to say, “It’s too hard to verify for myself that the result of using an LLM to generate this patch meets the legal requirements to contribute to the project.” Others may just as well say, “It’s too hard to verify for myself that the way I wrote this patch meets the legal requirements to contribute to the project,” regardless of whether they did it with an LLM.

Over the years I’ve personally seen plenty of people try to propose a patch and then give up when they discover that they have to sign an agreement or otherwise assert that they are legally allowed to do so (i.e. not violating someone else’s copyright, breaching an employment contract, infringing on a patent claim, et cetera).

I don’t think this answers the point I brought up, but I feel like we’re going in circles. I still think a policy shouldn’t seemingly encourage a use that benign contributors seem likely to get wrong.

They didn’t go missing :wink: I didn’t reply at length because it’s a bottomless pit, starting with the relevance to OSS projects. For example, the first study didn’t appear to disclose what prompts were used, but the body of the paper gave scattered clues like

as “Visiting York in autumn”, “Walnut allergy”, or “Lemon meringue pie”.

How many original things could an AI say about “walnut allergy”? How many could a human say? (other aspects of “scientific method” include “control group”, “randomized” and “double-blind”).

We’re specifically & only concerned about AI use in the Python project. It doesn’t matter to us if a bot regurgitates a recipe for lemon meringue pie, or has unoriginal ideas for what to do when visiting York in autumn.

I early on found “a study” claiming that copyright violations in the software context involving literal duplication by AI were outnumbered about 10-to-1 by other forms of AI licensing violations. But I wasn’t impressed by the study design, so didn’t share it. And despite that a bot pointed it out to me :wink:

You might find the Copilot hands-on examples useful then, see the clip here (sadly I can’t attach it in the forum): Github comment with clip

That is actually a completely different topic. In math, logic, or when using an API, the solution is usually straightforward.

For example, in a simple logic problem, the answer might just be ‘open the door.’ Most people solve it the same way automatically; it’s not something we approach differently.


The current policy is having the opposite effect. I have read it multiple times, and it has even discouraged me from submitting genuine, non-AI code, creating a chilling effect.

A complete ban on AI would likely produce the same effect on genuine contributions.

On the other hand, you earlier said my suggested text imposed an “unfulfillable” requirement on contributors. Hard to read that as encouraging use.

The policy is intended neither to encourage nor discourage use of AI tools, but to encourage disclosure of use. Because they’re going to be used regardless of what we say about it. I want the users to know up-front about the risks, and especially because newer contributors may not already know about them.

I doubt the typical AI user will interpret it like you. E.g. the Linux Foundation and Apache Foundation have similar policies, and well-known contributors seem to think Claude Code fits those fine.

I think the problem is the typical AI user isn’t aware of e.g. the Copilot live demo example. You yourself argued the studies seem inconclusive. (This means the “unfulfillable” part may not be obvious to the average contributor.)

Is this something you want each contributor to read studies about themselves?

Edit: What I mean is, if the Python team thinks all current LLMs are likely unsafe to use, the policy should probably spell that out.

I commented on that before. I expect it would be news to most AI users, but it gave only 3 examples of no-doubt-about-it literal text duplication, and devoted more time to the no-literal-duplication chardet case.

As a whole, quite possibly so. But I don’t think they’re of high enough quality or uniformity for, e.g., the Cochrane group to undertake the considerable effort to do a proper meta-analysis.

I don’t care whether they read studies. I want them to be aware of the real risks, and real benefits, and apply their own best judgment to the tradeoffs they have to make.

You’re certainly doing the community a service by pointing out the risks!

I don’t think the current proposal reflects the risks well.

“This allows reviewers to apply additional scrutiny where it is most needed and helps reduce the risk of unintentionally incorporating derivative material.” This sounds to me like the risks can be mitigated if somebody just looks at the code hard to spot LLM plagiarism.

I simply doubt that’ll help for a suspected 2%-10% rate, with demonstrated longer excerpts.

Should reviewers throw every section into Turnitin or the like? It just doesn’t sound practical. And these services aren’t reliable either.

(You can argue again that humans can also plagiarize on purpose, but this doesn’t help well-intentioned contributors. I don’t think the policy clearly says e.g. “if you use LLMs, even if reviewers look at the code hard, you will probably plagiarize without anybody catching it”, yet it seemingly leaves the contributor on the hook for the consequences.)

Do I make any sense at all? I hope so.

To me that sounds like 3 too many, unless you know a study finding an extremely low rate. The studies above seem to suggest otherwise. It seems at best alarming.

I get the impression that the push for this policy is just for the sake of having one on the books, because others have similar policies. Does this actually stop plagiarism, or does it just make us reflexively suspicious of every new contributor? I suspect the latter.

Well-known contributors are different: they have earned trust through hard work. But we shouldn’t build a system that treats everyone else as a ‘bad actor’ by default. Quick question: Are developers still allowed to refer to solutions on Stack Overflow and GitHub, or do we need a separate policy for every external resource we touch?


I’m not trying to turn this into a legal-theory debate, but enforcing rules based on assumptions is worse than the wrongdoing itself. That’s exactly the chilling effect I’m talking about. Even when I’m acting in good faith, unclear enforcement makes it feel too risky to contribute.

If there were a tool to detect plagiarism that both contributors and reviewers could use, enforcement would be objective. Without that, we’re just guessing.
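To make concrete what even a crude version of such a tool might look like: here is a minimal, purely hypothetical sketch using the standard library’s `difflib`. It only compares a candidate snippet against a small hand-supplied corpus by raw character similarity, which is nowhere near real plagiarism detection (no tokenization, no large-scale index, easily fooled by renaming), but it illustrates the kind of objective check being wished for. All names (`similarity`, `flag_suspicious`, the corpus contents, the 0.8 threshold) are invented for illustration.

```python
# Hypothetical sketch of a similarity check a reviewer might run.
# difflib's SequenceMatcher gives a crude 0.0-1.0 ratio; a real tool
# would need token-level analysis against a large corpus of code,
# which this does not attempt.
from difflib import SequenceMatcher


def similarity(candidate: str, known: str) -> float:
    """Return a 0.0-1.0 similarity ratio between two code snippets."""
    return SequenceMatcher(None, candidate, known).ratio()


def flag_suspicious(candidate, corpus, threshold=0.8):
    """Return (source_name, ratio) pairs at or above the threshold,
    most similar first."""
    hits = [(name, similarity(candidate, src)) for name, src in corpus.items()]
    return sorted((h for h in hits if h[1] >= threshold),
                  key=lambda h: h[1], reverse=True)


# Toy corpus of "known" code; a real corpus would be enormous.
corpus = {"known_snippet": "def add(a, b):\n    return a + b\n"}

print(flag_suspicious("def add(a, b):\n    return a + b\n", corpus))
# → [('known_snippet', 1.0)]
```

Even this toy version hints at the practical problems raised above: picking a threshold is a judgment call, and a sufficiently rewritten copy (as in the chardet case) would sail right past it.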


Coding has changed forever. I don’t remember the last time I built something entirely from scratch without libraries, tools, or AI assistance, much like no one types on a mechanical typewriter anymore. I am not going to refrain from productive and voluminous coding simply because tools exist to help me work efficiently and responsibly.

I think the lack of LLM-assisted tools for review is the root of the problem. High-volume contributions are the new baseline; we need to catch up to the technology, not ban it.

It’s a fact of life that they are on the hook for consequences of their actions, intentional or not, bots or not. As said before, contributing to OSS projects has always been scary on that basis. Although most people give no particular thought to the worst that could happen, and never get hassled in real life.

Part of the tradeoffs everyone has to make by their own lights.

I’m not in the miracle business :wink: Bots will be used regardless of what we say, by some people, almost all of whom will have no ill intent. I’m concerned about them too. There’s nothing realistic we can do to absolutely stop those with ill intent short of not accepting PRs, period. Which is also unrealistic, at a different level.

Do I make any sense at all? I hope so.

Sure! I simply don’t believe you have anything approaching “a solution” either. Banning other things in real life is ineffective too. Start with the “war on drugs” and go downhill from there :wink: Short of an omnipotent Police State, the most you can hope for by banning is to drive people into hiding their banned activities, while giving those activities a kind of “outlaw appeal” that attracts some people. Some people who were never going to “cheat” anyway will be repulsed by the authoritarian vibe. Others without their own strong opinions may forego the use of what would have been material aids, just to comply with stated rules.

Should reviewers throw every section into Turnitin or the like? It just doesn’t sound practical. And these services aren’t reliable

Bingo! There is no truly effective solution in sight. I’m quite sure CPython source already contains copyright violations from before bot days. No ill will, or intent, or even consciousness of copying is required to fall into such a trap. But nobody caught it. And the PSF doesn’t have enough money to attract lawsuits, so no IP vultures are out there searching for things to sue over either.

The best predictor of the future is usually the recent past, and as best I’m aware, the number of AI-enabled copyright/license violations in CPython’s source remains 0.

However, we have over 2 thousand open PRs backed up, and it’s another fact of life that it’s hard for a non-core-dev to attract review attention. Core devs are least likely of all to get bamboozled by AI.

So it could be that the de facto elitism of the PR process is really what’s spared us so far.

I don’t know. It’s a messy, uncertain world, and there are no easy answers.

I’m old enough to remember the FUD spread when version control systems were first introduced. “But if we make it so easy to revert changes, people will become much more careless in the changes they make!” was one of the claims.

Which actually had some merit. But not much even back then. Nevertheless, it may be a revelation to the youngsters :wink: how long it took for version control systems to achieve wide adoption. We’re talking decades here.

In Python’s earliest days, the workflow was this: you sent a patch to Guido’s email address, and he applied it or didn’t all by himself. No, it didn’t scale well :wink:

I’ve run enough experiments by now to convince myself that Copilot is already better at detecting AI provenance than I am. Early on, I tried to trick it by pointing it at code and docs I wrote. No success. It said it looked like I wrote it. Foiled again!

Probably more fruitful to ask it instead to try its hand at detecting copyright/license violations of any provenance. But that’s a big ask of a service I don’t pay for, and it’s not an inherent interest of mine anyway.

Bots will be used regardless of what we say, by some people, almost all of whom will have no ill intent.

If there were an LLM ban, why would the use of LLMs anyway not imply ill intent?