Agreed. I don’t think there can be any going back to a time when no one uses AI tools, and having people lie about using them is also a problem.
Anthropic did a study in January comparing people on a task that involved learning to use a new library (Trio) in a randomised controlled trial: one group used an LLM chatbot and the control group used no AI tools. The headline conclusions are fairly predictable, but the part I found interesting is what the paper says about the pilot studies they ran before the main study.
In the first pilot study they recruited 39 junior software engineers and divided them into the AI group and the control group, with the control group asked to spend up to 30 minutes writing code without AI. The authors concluded that a third of the control group had used AI anyway, even though they were told not to.
In the second pilot study they recruited 107 people and tried to make the instructions clearer, along the lines of “literally we are just paying you to write code for 30 minutes without using AI, so please do NOT use AI”. Apparently a quarter of the control group still just said “yeah, but I’m gonna use AI for that”.
There was no material incentive in these studies to cheat with AI, and not using it was literally the one thing the control group needed to avoid, because doing so would poison the data and undermine the entire purpose of the study.
This is basically what I would expect to happen if AI tools were banned: people would just use them anyway and lie about it, and then there would be endless arguments and accusations.
I do think any large open source project is going to need some kind of explicit policy statement about generative AI use, though.
It is not the same. The LLM is a tool that can do various things. The fact that it is trained on large amounts of code does not mean that using it is always copying that code. The linked commit is not something that could be copied from anywhere, except insofar as the code is largely just more instances of the repetitive patterns already present in the surrounding code. Whether or not an LLM was used to produce that commit, it does not constitute copying anything non-generic. I expect the author told the LLM quite precisely what to do, and the LLM then automated repeating the patterns needed to complete the work.
It is true that someone could use an LLM-based tool and unknowingly end up with a copy of something else. A policy should aim to prevent that, and the other problems that can come with LLM-generated code, but an outright ban is simply unworkable, and discouraging contributions like the particular commit highlighted would seem like a bad outcome to me.