Community policy on AI-generated answers (e.g. ChatGPT)

Even if it might or might not suppress how often ChatGPT responses are directly incorporated into posts on this forum, wouldn’t this proposed feature effectively promote, and even appear to advocate, the naive practice of uncritically soliciting and using advice from large language models (LLMs)?

1 Like

Unfortunately, users’ misplaced, uncritical trust in LLMs will happen regardless of what forums officially choose, due to automation and anthropomorphism biases. I have worked on military drones in the past, and these biases are already well documented there: even highly trained military crews have been reported to risk their lives to carry damaged mine-clearing robots back from a dangerous field.

I may have a bleak outlook on the topic, but as an AI researcher, I see no way non-AI researchers can have a good enough grasp of such a highly intricate and polymorphic topic as to not unconsciously get their perception of reality biased by AI models that are now ubiquitously available (and even before ChatGPT, there were algorithmic bubbles everywhere online due to recommendation algorithms). There are already tragic examples, such as a Belgian health researcher, hence someone with a fair bit of education in a developed country, ending their life because of a GPT-J-based chatbot.

What I am suggesting is to commoditize these tools so that humans can learn to tame them, for example with a clear explanation of what they do, and with a clear UI distinction between AI-generated answers and real human answers. I am not suggesting, as you seem to imply, that AI-generated answers would just be posted on the forum, but rather that they be offered in a very distinct UI field: e.g., just as GitHub recommends likely duplicates when you create an issue, the forum could suggest a succinct AI-generated answer. In fact, this would be a very similar UI experience to summarization algorithms.
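For illustration, here is a minimal sketch of the kind of clearly labeled suggestion field I mean. Every name in it (`SuggestedAnswer`, `draft_answer`, `render_suggestion`) is hypothetical, not any real forum API; it only shows the design idea, under the assumption that the draft is shown to the asker while composing and never auto-posted:

```python
from dataclasses import dataclass


@dataclass
class SuggestedAnswer:
    body: str
    generated_by_ai: bool  # drives the visual distinction in the UI


def draft_answer(question: str) -> str:
    """Stub standing in for whatever LLM backend would draft a suggestion."""
    return f"(draft answer to: {question!r})"


def render_suggestion(question: str) -> str:
    """Show an AI draft in a clearly labeled box, separate from human replies."""
    suggestion = SuggestedAnswer(body=draft_answer(question), generated_by_ai=True)
    label = "AI-GENERATED SUGGESTION -- unverified; check it before relying on it"
    return f"[{label}]\n{suggestion.body}"


print(render_suggestion("How do I reverse a list in Python?"))
```

The point is simply that the AI draft lives in its own labeled channel, so a reader can never mistake it for a human reply.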

Yes, the lack of a concept of correctness is a major issue; there is no formal guarantee. But other algorithms, such as summarization algorithms, are widely used (even here), and they also lack such a concept.

And while I certainly agree that the lack of formal truthfulness guarantees is a major issue, especially because the bot has no way to search for up-to-date information (though I’m sure additional heuristics can be devised to improve practical accuracy), let’s not forget that there is no way to guarantee that any statement is true. Bots cannot solve a fundamentally impossible problem that humans are not exempt from either. And this is not the first criticism of a new technology’s potential to propagate misinformation: I remember the 1990s–2000s, when the Internet was considered a cesspool of nothing but amateurish, fake, conspiracist information (which was partially true), and the printing press was likewise criticized because it could be used to print pamphlets in huge volumes.

The TL;DR of my suggestion: while I certainly agree with a blanket ban in the current situation, given that we lack tools and hindsight, progress cannot and should not be stopped. LLMs are here to stay, and future technologies will be even better at mimicking human discourse, so I think the only sustainable long-term solution is to educate humans about how to use these tools and about their limitations, just as happened historically with other new communication technologies. Devising standards for how to present such AI-generated content will certainly help in this endeavor.

1 Like

On the contrary, the following implies that your proposed feature might or might not succeed in decreasing the frequency of AI-generated answers on the forum:

Please clarify whether the function alluded to in the following would be offered in order to make it convenient for a user to seek an AI-generated answer in lieu of soliciting a reply directly from a human on this forum:

Reviving this a bit: it seems that, as predicted, LLM-based (specifically ChatGPT-based) answers are starting to proliferate. Reviewing recent #users questions and answers over the past few days, I noticed one new user, @Fusion9334 , who joined 3 days ago and has answered a number of questions in a pattern that appeared to me very likely to be at least heavily assisted by ChatGPT or a similar LLM. This conjecture was apparently confirmed by checking their profile, where the one topic they have posted is a question specifically about using ChatGPT’s Python API. As that was their first post, it seems that is in fact what brought them to this forum.

Now, to be clear, this user hasn’t done anything against the rules. However, I think their history might be a very useful set of datapoints for learning and discussing more practically how LLMs might be used (for good or ill) in answers, and how we might address any negative impacts. Furthermore, their input here would be appreciated as well.

What follows are my initial personal impressions. It seems they’ve taken at least some care to not just dump the LLM output, and to tailor it somewhat to the situation faced by each poster. Additionally, it seemed like the LLM’s wide range of background knowledge allowed it to answer in much more detail (if not entirely correctly) on specialized subject-matter questions about specific tools and services that the typical users helping others here were unlikely to know about.

On the other hand, on many more basic Python questions (more in scope for this forum), I did notice many instances where the other posters were trying to engage the user, typically a beginning learner and often working on an assignment, in a pedagogically motivated discourse to actually help them learn. However, before I even considered that an LLM might be involved, I noticed the replies by this user were running somewhat contrary to that: directly giving them a bunch of code (that may or may not address the real problem) with relatively minimal explanation, which at least implicitly encouraged them to just copy/paste rather than actually learn something, as the other people were trying to teach them. In fact, I was considering mentioning this privately to the user to consider in their approach to answering future questions, before I even suspected an LLM.

After examining this real-world history, I’d like to hear others’ further thoughts, discussion points and proposals. Thanks!

3 Likes

Agreed. I wasn’t thinking LLM when I saw this thread but it definitely isn’t the sort of post that we want to be encouraging. We do NOT want people taught to copy and paste code, especially when that code has come from a language model with no concept of correctness.

2 Likes

As another update, we had to perma-ban the original user whose LLM-generated response initially prompted this thread, as their further posts were just copy-pastes of other people’s old (and often extremely outdated) questions from Stack Overflow with an added link to a spammy site they were evidently promoting, with moderate (possibly LLM-induced) paraphrasing to attempt to obfuscate it, and no attribution. Of note, this demonstrates another form of re-using external content without attribution, which the above-proposed policy would address.

Additionally, we’ve had at least one further substantial instance of a new user employing an AI, this time in the process of writing a question asking for code review, as well as in subsequent followups. Specifically, they used codepal.ai to edit the code and asked ChatGPT questions about the code and responses, as well as using it to guide code generation in later replies. Of note, they’d previously posted another code review thread with their own code, which developed into a long and fruitful back-and-forth discussion/Q&A, but this one swiftly digressed into a debate about the use of LLMs and similar tools in generating, editing, and reviewing code:

1 Like

My fear is that we may be seeing on this forum only the tip of a large iceberg of students who are beginning to rely on AI to generate code. If this is allowed to continue, we might wind up with a huge population of programmers who are unable to program.

In response, should we just send away learners who arrive at our doorstep with code generated by AI? Perhaps that would keep our own house clean, while allowing the problem to worsen out there in the larger world.

This may be a time in history when we should try to form an alliance with OpenAI and other such outfits in a quest to urge students to learn to program by doing the hard work of planning, writing, testing, and refining code. In the long run, this would be best for all concerned.

Surely this is a self-correcting situation. Programmers, by definition, can program.
If the AI cannot, then it is not going to work for people looking for a shortcut.
No more than cut-and-paste from Stack Overflow makes you a programmer.

1 Like

This quote has been floating around the Internet for a while, attributed to J.A.N. Lee of Virginia Polytechnic Institute:

One of von Neumann’s students at Princeton recalled that graduate students were being used to hand assemble programs into binary for their early machine. This student took time out to build an assembler, but when von Neumann found out about it he was very angry, saying that it was a waste of a valuable scientific computing instrument to use it to do clerical work.

Assuming this was John von Neumann’s position, he was in a way correct: how many of us can hand-write binary programs to feed directly to computers anymore?

Companies and programmers that can spend their resources more efficiently to solve their problems ultimately win out. LLM-assisted programming might be a dead end; or, in 60 years, not using whatever LLMs evolve into could look like hand-writing binary.

Making statements right now about what we have to force students to do really ignores the arc of how coding has evolved.

Personally, I believe the current state of LLM theory means they won’t be great at direct code generation, but they may significantly help with boilerplate code, unit tests, PR reviews, and other areas that already involve existing code needing to be extended in some way. But my views will no doubt be washed away by this tide of history.

Yet our advice to learners today must necessarily be guided by the present state of the art and science of computing. An assembler that functioned properly in von Neumann’s lab compares favorably to a present-day LLM that is not so reliable at writing code. The question at hand today is how to respond to those who approach us with solutions authored by AI. Our response today may differ from what it will be in the future.

1 Like

My conclusion is to avoid using AI if I am planning to ask for Code Review :slightly_smiling_face:

To summarize my thoughts: the main reason I use AI a lot is probably that I’m afraid of giving someone a bad impression, which has happened many times. With AI, I needn’t worry about that.

And I’m not afraid of AI; as long as I keep an open mind, I don’t see it as threatening.

1 Like

Seems to be a wise conclusion :slight_smile: Code review of AI-generated (or at least substantially AI-assisted) code would tend to give feedback about where the AI might have gone wrong, rather than suggestions and pointers on how you can improve your own code, which seems much more useful if you’re looking to learn and improve (as it seems you are).

That’s actually quite an interesting point, and one I at least hadn’t previously considered. As a beginner, it can be genuinely intimidating and scary at first to ask questions and present your code for review and critique by other human experts. And I’m certainly someone who can have a hard time sharing something I haven’t yet mastered, and have been ever since I was little.

We were all newbies at some point, and we all made (and continue to make) mistakes in our code sometimes. Other folks here won’t (or at least shouldn’t) look down on you or think badly of you just for making mistakes, the same ones we all made at some point. And while I know it can be easy for me to say this and hard for you to do, don’t put yourself down over making one, either; instead, maybe treat each one as a learning opportunity to further improve your skills :slight_smile:

What earns my deep respect and admiration (and likely that of most others here as well) is someone who, regardless of current skill level or mistakes, always tries to be open to listening, learning, improving, and asking questions. And no one here should judge you negatively as a person for making a beginner’s programming mistake. We’re here to help you along in your journey as we likewise travel on our own :blush:

And if there’s something we could do, individually or as a community, to help you feel more welcome (or if someone or something is making you feel otherwise that we should address), we’d love to hear from you. Please do reach out, either publicly, or privately to the mod team or to me directly. Thanks!

6 Likes

You have no idea how much your message means to me, so thank you!

If you’re a personal friend of @Rosuav, please tell them that I’m sorry if I’ve offended or tired them by asking so many questions. I’m learning to be polite and to ask first whether it’s okay if I have more questions.

I have more respect for all of you, so thank you again.

1 Like

It’s okay, I’m not offended. I just stop responding in threads that aren’t going to be productive.

1 Like

It’s not just ChatGPT that is in the running.
Open-source AI is already matching it, according to a leaked memo from Google: Google "We Have No Moat, And Neither Does OpenAI"

FWIW, of interest to this thread: it seems that the previous ban on LLM (ChatGPT)-generated content on the Stack Exchange network has been mostly reversed by order of SE management (perhaps in a vain attempt to partially reverse the major drop in site traffic attributed to ChatGPT, or due to the company’s recent investments in generative AI for its enterprise product), and moderators are now generally prohibited from removing LLM-generated content on such grounds.

The previous measures banning LLM-generated content seemed to be strongly supported by most of the platform’s active members, and the reversal has prompted a huge backlash, especially from the site’s volunteer moderators, against the new prohibition on removing ChatGPT-generated posts. It has gone so far that there is now a mass network-wide strike of SE mods, the first I am aware of in the platform’s history, and many of the individual communities have outright refused to re-allow ChatGPT content, to the approval of most people there.

I don’t have the time to follow those links, and what you wrote is so full of double negatives that I don’t understand what’s happening. Is there still a ban or not?

@CAM-Gerlach, would it be accurate to state that the gist of your most recent post is that on some platforms, moderators are being ordered, against their will, to allow posts that include LLM-generated content to remain intact?

It has been particularly concerning to see a thread initiated by a new user on this forum, entitled “Jython isn’t executing some valid Python 2 code, and I’ve interpreted that to mean that the Python trademark is invalid”, which cites legal advice from an AI in an effort to challenge Python’s trademark. If we wish to discourage the use of LLM content here, perhaps we should prepare now to notify new users upon their arrival that they should not post such content.

What official stated policy is foreseen on Python Discourse regarding the use of content generated by LLMs?

No. Stack Exchange management lifted the ban, but we don’t know the details, since the instructions they gave to moderators are not public.
Volunteer moderators, feeling mistreated, are now on strike.