Thinking with Map: How AI Achieves Human-Like Image Geolocation

7 hours ago 高效码农

Thinking with Map: How AI Learned to “Think” Like Humans Using Maps for Precise Image Geolocalization

### Quick Summary

Thinking with Map is an agentic framework that enables large vision-language models (LVLMs) to perform image geolocalization by actively querying maps, much as humans do. Built on Qwen3-VL-30B-A3B, it combines reinforcement learning with parallel test-time scaling to improve accuracy. On the new MAPBench (a China-focused, up-to-date street-view benchmark), it achieves 44.98% Acc@500m on easy cases and 14.86% on hard cases, outperforming Gemini-3-Pro with Google Search/Map (20.86% and 4.02% on the same splits, respectively) and other …
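To make the Acc@500m figures concrete, here is a minimal sketch of how such a metric is commonly computed. It assumes Acc@500m means the fraction of predictions whose great-circle distance to the ground-truth location is at most 500 m; the text above does not spell out the exact definition, so treat this as an illustration rather than the benchmark's own evaluation code. The function names `haversine_m` and `acc_at` and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def acc_at(predictions, ground_truth, threshold_m=500.0):
    """Fraction of predictions within threshold_m of the matching ground-truth point.

    Assumed definition of Acc@threshold; both inputs are lists of (lat, lon) tuples.
    """
    if len(predictions) != len(ground_truth):
        raise ValueError("predictions and ground_truth must have the same length")
    if not predictions:
        return 0.0
    hits = sum(
        haversine_m(p[0], p[1], g[0], g[1]) <= threshold_m
        for p, g in zip(predictions, ground_truth)
    )
    return hits / len(predictions)


# Made-up coordinates purely for illustration (not benchmark data).
truth = [(31.2304, 121.4737), (39.9042, 116.4074), (22.5431, 114.0579)]
preds = [(31.2304, 121.4737), (39.9052, 116.4074), (22.5531, 114.0579)]
score = acc_at(preds, truth, threshold_m=500.0)
```

In this toy example the first prediction is exact, the second is about 111 m off (0.001° of latitude), and the third is about 1.1 km off, so two of three predictions fall inside the 500 m radius.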