Artificial intelligence is not just flooding social media with garbage; it’s also apparently afflicting the open-source programming community. And in the same way that fact-checking tools like X’s Community Notes struggle to refute a deluge of false information, contributors to open-source projects are lamenting the time wasted evaluating and debunking bug reports created using AI code-generation tools.
The Register reported today on concerns raised recently by Seth Larson in a blog post. Larson, a security developer-in-residence at the Python Software Foundation, says he has noticed an uptick in “extremely low-quality, spammy, and LLM-hallucinated security reports to open source projects.”
“These reports appear at first glance to be potentially legitimate and thus require time to refute,” Larson added. That could be a big problem for open-source projects (e.g. Python, WordPress, Android) that power much of the internet, because they’re often maintained by small groups of unpaid contributors. Legitimate bugs in ubiquitous code libraries are dangerous because they have such a wide potential impact if exploited. Larson said he’s seeing only a relatively small number of AI-generated junk reports so far, but the number is increasing.
Another developer, Daniel Stenberg, the lead developer of curl, called out a bug submitter for wasting his time with a report he believed was generated using AI:
You submitted what seems to be an obvious AI slop ‘report’ where you say there is a security problem, probably because an AI tricked you into believing this. You then waste our time by not telling us that an AI did this for you and you then continue the discussion with even more crap responses – seemingly also generated by AI.
Code generation is an increasingly popular use case for large language models, though many developers are still torn on how useful the tools truly are. Programs like GitHub Copilot or ChatGPT’s own code generator can be quite effective at producing scaffolding: the basic skeleton code that gets a project started. They can also be useful for finding functions in a programming library a developer might not be intimately familiar with.
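For a sense of what “scaffolding” means in practice, here is a minimal, hypothetical sketch of the kind of boilerplate these assistants can reliably produce; every name in it is illustrative rather than taken from any real project:

```python
# Hypothetical assistant-generated scaffolding: a bare command-line skeleton.
import argparse


def main() -> None:
    parser = argparse.ArgumentParser(description="Process an input file.")
    parser.add_argument("path", help="file to process")
    parser.add_argument("--verbose", action="store_true", help="print progress")
    args = parser.parse_args()

    if args.verbose:
        print(f"Processing {args.path}...")

    # Project-specific logic would replace this placeholder line count.
    with open(args.path) as handle:
        print(sum(1 for _ in handle), "lines read")


if __name__ == "__main__":
    main()
```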
But as with any language model, they will hallucinate and produce incorrect code. Code generators are probability tools that guess what you want to write next based on the code you have given them and the code they have seen before. Developers still need a solid grasp of the programming language they’re working with and a clear idea of what they’re trying to build, in the same way that essays written by ChatGPT need to be reviewed and edited manually.
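To make the hallucination problem concrete, here is a hedged, hypothetical illustration: the commented-out call below looks plausible but does not exist in Python’s standard library, which is exactly the kind of confident-but-wrong output a reviewer has to catch:

```python
# Hypothetical hallucinated snippet: reads plausibly, fails at runtime.
import hashlib

# hashlib has no "secure_compare" function; a model can invent one because
# the name *looks* like it belongs there. Running the line below would
# raise AttributeError.
# hashlib.secure_compare(token_a, token_b)

# The real, vetted equivalent lives in the hmac module:
import hmac

token_a = b"expected-token"
token_b = b"submitted-token"
print(hmac.compare_digest(token_a, token_b))  # False: tokens differ
```

The fix is one line away, but only a reviewer who already knows the library will spot it.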
Platforms like HackerOne offer bounties for successful bug reports, which may encourage some people to ask ChatGPT to scan a codebase for flaws and then submit whatever erroneous findings the model returns.
Spam has always been around on the internet, but AI is making it a lot easier to generate. It seems possible we’ll end up in a situation that demands something like the CAPTCHAs used on login screens to combat it. An unfortunate situation, and a big waste of time for everyone.