
It's never been harder to tell what's real and what's artificial online. (Image by VectorMine on Shutterstock)

SEATTLE — Remember when the biggest threat online was a computer virus? Those were simpler times. Today, we face a far more insidious digital danger: AI-powered social media bots. A study by researchers from the University of Washington and Xi’an Jiaotong University reveals both the immense potential and concerning risks of using large language models (LLMs) like ChatGPT in the detection and creation of these deceptive fake profiles.

Social media bots — automated accounts that can mimic human behavior — have long been a thorn in the side of platform operators and users alike. These artificial accounts can spread misinformation, interfere with elections, and even promote extremist ideologies. Until now, the fight against bots has been a constant game of cat and mouse, with researchers developing increasingly sophisticated detection methods, only for bot creators to find new ways to evade them.

Enter the era of large language models. These AI marvels, capable of understanding and generating human-like text, have shown promise in various fields. But could they be the secret weapon in the war against social media bots? Or might they instead become a powerful tool for creating even more convincing fake accounts?

The research team, led by Shangbin Feng, set out to answer these questions by putting LLMs to the test in both bot detection and bot creation scenarios. Their findings paint a picture of both hope and caution for the future of social media integrity.

“There’s always been an arms race between bot operators and the researchers trying to stop them,” says Feng, a doctoral student in the University of Washington’s Paul G. Allen School of Computer Science & Engineering, in a university release. “Each advance in bot detection is often met with an advance in bot sophistication, so we explored the opportunities and the risks that large language models present in this arms race.”

On the detection front, the news is encouraging. The researchers developed a novel approach using LLMs to analyze various aspects of user accounts, including metadata (like follower counts and account age), the text of posts, and the network of connections between users. By combining these different streams of information, their LLM-based system was able to outperform existing bot detection methods by an impressive margin, scoring up to 9.1% better on standard datasets.
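
To picture the general recipe (a minimal sketch, not the team’s released code), an LLM can be handed an account’s metadata and recent posts in one prompt and asked for a verdict. The call_llm helper and the prompt wording below are assumptions made for illustration; the study itself tested Mistral-7B, LLaMA2-70B, and ChatGPT.

def call_llm(prompt: str) -> str:
    # Placeholder for whichever model backend is used; swap in a real API or
    # local-model call here.
    raise NotImplementedError

def classify_account(metadata: dict, posts: list[str]) -> str:
    # Pack account metadata and recent posts into a single prompt and ask the
    # model for a one-word verdict.
    prompt = (
        "You are analyzing a social media account for signs of automation.\n"
        f"Followers: {metadata['followers']}, following: {metadata['following']}, "
        f"account age in days: {metadata['age_days']}\n"
        "Recent posts:\n" + "\n".join(f"- {p}" for p in posts[:10]) + "\n"
        "Answer with one word: bot or human."
    )
    reply = call_llm(prompt).strip().lower()
    return "bot" if "bot" in reply else "human"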

Large language models like ChatGPT can play a major role in the detection and creation of deceptive fake profiles, researchers warn. (Photo by Tada Images on Shutterstock)

What’s particularly exciting about this approach is its efficiency. While traditional bot detection models require extensive training on large datasets of labeled accounts, the LLM-based method achieved its superior results after being fine-tuned on just 1,000 examples. This could be a game-changer in a field where high-quality, annotated data is often scarce and expensive to obtain.

However, the study’s findings weren’t all rosy. The researchers also explored how LLMs might be used by those on the other side of the battle — the bot creators themselves. By leveraging the language generation capabilities of these AI models, they were able to develop strategies for manipulating bot accounts to evade detection.

These LLM-guided evasion tactics proved alarmingly effective. When applied to known bot accounts, they reduced the detection rate of existing bot-hunting algorithms by up to 29.6%. The manipulations ranged from subtle rewrites of bot-generated text, making it sound more human, to strategic changes in which accounts a bot follows or unfollows.
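
The paper’s manipulation code is not public, but the text-rewriting tactic can be pictured with a short, hypothetical sketch: ask an LLM for a more human-sounding version of a post and keep it only if a detector becomes less suspicious. Both call_llm and detector_bot_score below are assumed stand-ins, not the authors’ tools.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # stand-in for any LLM backend

def detector_bot_score(text: str) -> float:
    raise NotImplementedError  # stand-in for an existing detector (0 = human-like, 1 = bot-like)

def rewrite_to_evade(post: str) -> str:
    # Request a rewrite that keeps the topic but sounds more like a casual user.
    rewrite = call_llm(
        "Rewrite this post so it reads like a casual human user wrote it, "
        "keeping the same topic and roughly the same length:\n" + post
    )
    # Accept the rewrite only if it actually lowers the detector's suspicion.
    return rewrite if detector_bot_score(rewrite) < detector_bot_score(post) else post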

Perhaps most concerning is the potential for LLMs to create bots that are not just evasive but truly convincing. The study demonstrated that LLMs could generate user profiles and posts that capture nuanced human behaviors, making them far more difficult to distinguish from genuine accounts.

This dual-use potential of LLMs in the realm of social media integrity presents a challenge for platform operators, researchers, and policymakers alike. On one hand, these powerful AI tools could revolutionize our ability to identify and remove malicious bot accounts at scale. On the other, they risk becoming a sophisticated weapon in the arsenal of those seeking to manipulate online discourse.

“Analyzing whether a user is a bot or not is much more complex than some of the tasks we’ve seen these general LLMs excel at, like recalling a fact or doing a grade-school math problem,” says Feng.

The implications extend far beyond just cleaning up our social media feeds. In an age where online information can sway elections, shape public opinion on critical issues, and even influence global events, the stakes of this technological arms race are immense.

As we stand at this crossroads, the researchers emphasize the need for continued innovation in bot detection methods, particularly those that can keep pace with LLM-enhanced evasion tactics. They also call for increased transparency from social media platforms and a collaborative approach between researchers, tech companies, and policymakers to address these emerging challenges.

“This work is only a scientific prototype,” notes senior author Yulia Tsvetkov, an associate professor in the Allen School. “We aren’t releasing these systems as tools anyone can download, because in addition to developing technology to defend against malicious bots, we are experimenting with threat modeling of how to create an evasive bot, which continues the cat-and-mouse game of building stronger bots that need stronger detectors.”

The bot battle is far from over. But with large language models entering the fray, we may be witnessing a pivotal moment in the fight for social media integrity. As these AI technologies continue to advance, their role in shaping our online world — for better or worse — is only likely to grow.

Paper Summary

Methodology

The researchers developed a framework they call “mixture-of-heterogeneous-experts” to analyze different aspects of social media accounts using large language models (LLMs). They tested three LLMs: Mistral-7B, LLaMA2-70B, and ChatGPT. For each account, they analyzed metadata (like follower counts), the text of posts, and the network of connections.

They then combined these analyses to make a final prediction about whether an account was a bot or not. To test the effectiveness of their approach, they used two widely used datasets of Twitter accounts: Twibot-20 and Twibot-22. On the bot creation side, they explored various ways LLMs could be used to modify bot accounts, including rewriting posts and strategically changing which accounts a bot follows.
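
As a rough illustration of that mixture-of-heterogeneous-experts idea (toy heuristics stand in for the per-modality experts here, and a simple majority vote stands in for the paper’s aggregation step):

from collections import Counter

def metadata_expert(account: dict) -> str:
    # Toy stand-in: a young account following far more users than follow it back.
    ratio = account["following"] / max(account["followers"], 1)
    return "bot" if ratio > 10 and account["age_days"] < 90 else "human"

def text_expert(account: dict) -> str:
    # Toy stand-in: heavy repetition across posts. In the study, this judgment
    # comes from an LLM reading the post text.
    posts = account["posts"]
    return "bot" if posts and len(set(posts)) < len(posts) / 2 else "human"

def network_expert(account: dict) -> str:
    # Toy stand-in: mostly following accounts that are already flagged as bots.
    follows = account["follows"]
    flagged = sum(1 for u in follows if u in account.get("known_bots", set()))
    return "bot" if follows and flagged > len(follows) / 2 else "human"

def detect_bot(account: dict) -> str:
    # Combine the three verdicts; the majority label wins.
    verdicts = [metadata_expert(account), text_expert(account), network_expert(account)]
    return Counter(verdicts).most_common(1)[0][0]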

Key Results

The LLM-based bot detection method outperformed existing approaches, achieving improvements of up to 9.1% in F1-score (a measure that balances precision and recall) on the two benchmark datasets. Notably, this was achieved after fine-tuning the LLM on just 1,000 examples, compared to the thousands or even millions of examples often needed for traditional machine learning approaches.
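
A quick worked example of that metric, using illustrative numbers rather than figures from the paper:

precision, recall = 0.80, 0.70                      # share of flags that are real bots; share of bots caught
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
print(round(f1, 3))                                 # 0.747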

On the bot creation side, LLM-guided manipulation strategies were able to reduce the effectiveness of existing bot detectors by up to 29.6%. The study also found that larger, more advanced LLMs (like ChatGPT) generally performed better at both detection and evasion tasks than smaller models.

Study Limitations

The study primarily focused on X (Twitter) data, which may not fully represent the diversity of social media platforms. The researchers also note that they were unable to test their methods on very recent bot accounts due to limitations in data availability.

Additionally, while the study demonstrated the potential of LLMs in this domain, it did not exhaustively explore all possible ways these models could be used for bot detection or creation.

Discussion & Takeaways

The study highlights the double-edged nature of LLMs in the context of social media integrity. While these models show great promise for improving bot detection, they also present new challenges by enabling the creation of more sophisticated bots.

The researchers emphasize the need for ongoing innovation in detection methods and stress the importance of considering the ethical implications of these technologies. They also note that LLM-based approaches could be particularly valuable in scenarios where large amounts of labeled training data are not available.

Funding & Disclosures

The study was supported by the National Science Foundation under two grants (IIS2142739 and IIS2203097) and by DARPA. Additional support came from an Alfred P. Sloan Foundation Fellowship. The researchers declared no conflicts of interest related to this work.

