Building an Expert-Level Medical Deep-Research Agent with Only 32 Billion Parameters

“A practical, end-to-end guide for developers, data scientists, and clinicians who want reproducible, high-quality medical reasoning.”

1. Why do general “deep-research” tools stumble in medicine?

When ChatGPT, Gemini, or Claude first demonstrated multi-step web search, the demos looked magical. Yet the moment we moved from “Who won the 2023 Nobel Prize in Chemistry?” to “What phase-II drugs target LMNA mutations in dilated cardiomyopathy?”, accuracy plunged.

| System | MedBrowseComp accuracy (50 questions) |
| --- | --- |
| o3-search | 19% |
| Gemini-2.5-Pro deep-research | 25% |
| MedResearcher-R1-32B | 27.5% (new state-of-the-art) |

Two root causes surfaced: Sparse …