ASearcher: How Asynchronous Reinforcement Learning Breaks 10-Click Barrier in Open-Source Search Agents

9 hours ago 高效码农

Going Beyond Ten Clicks: How ASearcher Uses Asynchronous Reinforcement Learning to Push Open-Source Search Agents Past 40 Turns Imagine you are asked to find the exact number of gold, silver, and bronze medals China won in the 2012 London Olympics as of 31 December 2024. A quick search returns two conflicting totals: “38-27-22” and “39-31-22”. A human researcher would open multiple official reports, cross-check doping appeals, and finally discover that one gold medal was later withdrawn. That process can take dozens of web pages and many reasoning steps—far more than the ten-turn limit that most open-source language agents accept today. …