AgentSociety Framework: Simulating 30,000 AI Residents in Realistic Beijing Environment

高效码农

5 months ago

Recreating a Day in Beijing with 30,000 Digital Residents: How the AgentSociety Framework Gives Large Language Models a Real City to Live In

❝

Keywords: large-scale LLM agents, social simulation, parallel computing, realistic urban environment, Beijing mobility, AgentSociety framework

❞

Introduction: Why Give AI a Commute?

Imagine tomorrow morning Beijing’s rush hour is no longer made of flesh-and-blood commuters but of 30,000 「AI agents」—each deciding when to leave home, which metro line to take, and whether to grab coffee on the way. Could this digital city move in lockstep with the real one?
Researchers from Tsinghua University and The Hong Kong University of Science and Technology (Guangzhou) say 「yes」—and they built 「AgentSociety」 to prove it. Below you will find a plain-language tour of what the system does, how it works, and how you can run a miniature Beijing on your own hardware.

1. The Core Idea in One Table

Traditional Approach	Pain Point	AgentSociety Fix
Hand-coded “people”	Behaviors feel robotic, fail at scale	Let LLMs learn behaviors from city feedback
Game-like maps	No real roads, shops, or rush-hour delays	Plug in OpenStreetMap + SafeGraph POIs
Simulations crash at 100 agents	Port exhaustion, memory blow-ups	24 × A800 GPUs, linear scaling to 30 k agents

2. Three Lego Bricks That Build an Entire City

AgentSociety is best pictured as three 「interlocking spaces」 plus one 「parallel engine」.

「Urban Space」 – where agents move
「Social Space」 – where agents talk
「Economic Space」 – where agents earn and spend

A 「parallel interaction engine」 glues them together.

2.1 Urban Space: Teaching AI to Walk the City

「Map Data」
- Roads & zones: OpenStreetMap
- Shops & offices: SafeGraph
「Travel Modes」
- Car, walk, bus, taxi—updated every simulated second
「Route Planner」
- Written in Go; agents get real-time travel times exactly like Google Maps

「Mini-example」
Digital resident Alice wants to reach Wangjing SOHO by 8:30 a.m.
The simulator replies:

❝

Walk 4 min → Subway Line 15 → Exit C → Walk 6 min = 42 min total.

❞

2.2 Social Space: Text Messages & Friend Lists

Feature	What It Means for an Agent
「Offline socializing」	If two agents share the same café, they can start a chat
「Online messaging」	Any two agents can DM; messages are filtered by an in-built “moderator”
「Dynamic ties」	Friendship strength goes up or down based on every interaction

2.3 Economic Space: Salaries, Taxes, GDP—All in Code

Role in Simulation	What It Can Do
Person	Apply for jobs, receive wages, pay taxes, buy goods
Firm	Post vacancies, pay salaries, lay off workers
Bank	Pay interest on deposits
Government	Adjust tax rates, collect macro data
National Bureau of Statistics (NBS)	Publish GDP, average working hours

All money flows run through a 「ledger-style Go simulator」 that handles interest, taxes, and macro indicators automatically.

3. Parallel Engine: How 30,000 Agents Run Faster Than Real Time

3.1 The Bottleneck

「Single-process designs」 run out of TCP ports and RAM after ~1 k agents.
「Random LLM latency」 makes each simulation step finish at unpredictable times.

3.2 Three-Step Fix

Step	Technique	Benefit
「Group-based execution」	Split agents into groups; each group is one Ray actor sharing one set of service clients	Ports drop from 10 k → 100
「Time alignment」	Force every LLM call round to equal 300 simulated seconds	Experiments are reproducible
「Message bus」	Redis Pub/Sub lets any agent talk to any other—or to a human interviewer	Future-proof for new interaction modes

4. Speed Test: Real Numbers on a Huawei Cloud Machine

「Test rig」

1 × c7.16xlarge.4 Huawei Cloud instance
24 × NVIDIA A800 GPUs via vLLM 0.8.1
Model: Qwen2.5-7B-Instruct

Agents	Groups	Seconds per Round	Faster than Reality?
1 000	4	13.2	23×
3 000	8	28.9	10×
10 000	8	81.5	4.4×
30 000	8	251.9	「1.2×」

❝

Bottom line: If you can pay for the GPUs, performance scales almost linearly.

❞

5. Does the Real Map Really Matter? A/B Test Results

We ran the same agents in two modes:

「W-Env」: with true roads, POIs, and travel times
「WO-Env」: agents guess distances and shop names from LLM memory

Metric	Meaning	W-Env	WO-Env
Radius	How far an agent roams (smaller = more realistic)	「0.023」	0.427
Dayloc	Unique places visited per day	「0.038」	0.129
itdError	Intention-to-behavior error (lower = better)	「0.094」	0.241

「Take-away」: strip away the real city data and agents start behaving like lost tourists.

6. Quick-Start Guide: Spin Up a Mini Beijing in 10 Commands

❝

All commands are copied from the official repository; no extra software is required.

❞

Clone the code

git clone https://github.com/tsinghua-fib-lab/AgentSociety
cd AgentSociety

Download the Beijing map

python tools/download_map.py --city beijing --bbox 116.2,39.7,117.0,40.3

Generate 1 000 residents

python scripts/generate_agents.py --num 1000 --config profiles/urban_commuter.json

Launch the Go-based environment server
```
go run cmd/env/main.go --port 50051
```
Start Ray head node
```
ray start --head
```

Run the simulation

python scripts/run_simulation.py \
  --agents 1000 \
  --groups 4 \
  --llm-endpoint http://localhost:8000/v1

Watch the live dashboard

python -m gui.dashboard \
  --db postgresql://user:pwd@localhost/agent_society

7. Frequently Asked Questions

「Q1: Can I go beyond 30 000 agents?」
A: Yes. The authors report linear scaling; you simply add more GPUs.

「Q2: Does it work for cities other than Beijing?」
A: Yes. Replace the bounding box in step 2 with any city covered by OpenStreetMap.

「Q3: How do I add hospitals, schools, or stadiums?」
A: Supply additional POI files in the same format as SafeGraph; the economic space already exposes interfaces for new roles.

「Q4: I’m not a Go developer. Can I stay in Python?」
A: Absolutely. The environment exposes gRPC; use env_client.py from any Python script.

「Q5: What does it cost to run 30 k agents for a day?」
A: Roughly 8 GPU-hours on the tested setup. Cloud spot pricing makes this affordable for many labs.

「Q6: Is personal data leaked?」
A: No. All map and POI data are public; agent profiles are synthetic.

8. Known Limitations & Next Steps

Current Gap	Possible Fix
Company decisions are still scripted	Plug in larger LLMs for supply–demand games
No market-price mechanism	Add order-book simulation
GPU cost high	Try quantization, speculative decoding, or smaller models

9. Why This Matters to Practitioners

AgentSociety bundles three achievements:

「Real map」: OpenStreetMap + SafeGraph = turn-by-turn realism.
「Real economy」: wages, taxes, GDP—micro and macro link.
「Real speed」: 30 k agents, faster than the wall clock.

Urban planners, economists, or social scientists can now 「write prompts, not code」, to test policies or explore “what-if” scenarios—something surveys and traditional ABM tools never achieved at this fidelity.

10. Citation & Further Reading

If you use this work, please cite:

@inproceedings{zhang2025agentsociety,
  title={A Parallelized Framework for Simulating Large-Scale LLM Agents with Realistic Environments and Interactions},
  author={Zhang, Jun and Yan, Yuwei and Yan, Junbo and Zheng, Zhiheng and Piao, Jinghua and Jin, Depeng and Li, Yong},
  booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 6: Industry Track)},
  pages={1339--1349},
  year={2025}
}