Researchers Explore Replacing Surveys Using Social Simulation With AI Agents

Published on April 28, 2025
In AI News

What if LLMs could simulate public opinion and reactions without requiring a survey?

Photo by Darlene Alderson / Pexels.com

Researchers from Fudan University have developed SocioVerse, an LLM-agent-driven world model for social simulation.

The framework includes four components and a user pool of 10 million real individuals. The Social Environment component feeds updated external information into the simulation. The User Engine and Scenario Engine components provide realistic user context and align the simulation with the real world, respectively. Moreover, the Behavior Engine makes the agents reproduce human behaviours.
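To make the division of labour between the four components concrete, here is a minimal Python sketch of how they might fit together. It is an illustration only, not SocioVerse's actual code: every function name, the `Agent` class, and the `respond` callback (a stand-in for an LLM call) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    profile: dict                      # user context drawn from the user pool
    memory: list = field(default_factory=list)


def social_environment(events):
    """Hypothetical: collect the latest external information."""
    return list(events)


def user_engine(user_pool, n):
    """Hypothetical: build n agents from real user profiles."""
    return [Agent(profile=p) for p in user_pool[:n]]


def scenario_engine(question, context):
    """Hypothetical: frame the question against real-world context."""
    return f"Context: {'; '.join(context)}\nQuestion: {question}"


def behavior_engine(agent, prompt, respond):
    """Hypothetical: have the agent answer as its profile would."""
    answer = respond(agent.profile, prompt)
    agent.memory.append(answer)
    return answer


def run_simulation(user_pool, events, question, respond, n=3):
    context = social_environment(events)
    agents = user_engine(user_pool, n)
    prompt = scenario_engine(question, context)
    return [behavior_engine(a, prompt, respond) for a in agents]


# Toy stand-in for an LLM: the answer depends only on the profile.
def toy_respond(profile, prompt):
    return "yes" if profile["age"] < 40 else "no"


pool = [{"age": 25}, {"age": 50}, {"age": 31}]
answers = run_simulation(pool, ["breaking news item"], "Will you try it?", toy_respond)
```

In a real system, `respond` would wrap a call to one of the language models, and the user pool would hold far richer profiles than a single age field.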

This framework aims to replace traditional methods, such as surveys, interviews, and observations, which present several challenges, including high costs, limited sample sizes, and ethical concerns.

With autonomous AI agents simulating human behaviour, researchers aim to observe how micro-level decisions shape broader patterns and to forecast potential social dynamics.

To test the framework, researchers conducted large-scale simulation experiments across domains, including politics, news, and economics. Models like Llama-3-70b-Instruct, Qwen2.5-72b-Instruct, DeepSeek-V3, GPT-4o mini, GPT-4o, and DeepSeek-R1-671b were used for the tests.

First, the LLM agents were used to predict state-level results in the US presidential election. GPT-4o mini and Qwen2.5-72b showed competitive performance on the evaluation metric, correctly predicting over 90% of state voting results. DeepSeek-R1-671b was observed to be overthinking, which resulted in less accurate predictions.
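The share of states predicted correctly is easy to compute once each state has a simulated winner. The Python sketch below assumes the winner is decided by a simple majority of simulated agent votes; that aggregation rule, the function names, and the toy data are all assumptions for illustration, not details taken from the paper.

```python
from collections import Counter


def state_winner(votes):
    """Return the most common candidate among simulated votes.

    Assumption: simple majority; ties go to the candidate seen first.
    """
    return Counter(votes).most_common(1)[0][0]


def state_accuracy(simulated, actual):
    """Fraction of states whose simulated winner matches the real one."""
    correct = sum(state_winner(simulated[s]) == actual[s] for s in actual)
    return correct / len(actual)


# Toy data: three made-up states and two made-up candidates.
simulated = {
    "A": ["X", "X", "Y"],
    "B": ["Y", "Y", "X"],
    "C": ["X", "Y", "Y"],
}
actual = {"A": "X", "B": "Y", "C": "X"}

accuracy = state_accuracy(simulated, actual)  # 2 of 3 states match
```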

Second, a simulation assessed the public’s reaction to breaking news, using the release of ChatGPT as the example. GPT-4o and Qwen2.5-72b were observed to be more aligned with real-world perspectives than the other models.

Lastly, in a simulation based on a national economic survey of China, Llama-3-70b outperformed the other models, accurately reproducing the spending habits of individuals.

“Our findings indicate that state-of-the-art LLMs demonstrate a notable ability to simulate human responses in complex social contexts, although some gaps still remain between the simulated response and observed real-world outcomes,” the research paper stated.

The researchers also aim to explore a broader range of scenarios to expand the simulation capabilities of LLMs.