MMGR Benchmark Test: Why Your AI Video Generator Fails Sudoku and Walks Through Walls

5 days ago 高效码农

What MMGR Really Tests: A Plain-English Walk-Through of the Multi-Modal Generative Reasoning Benchmark > If you just want the takeaway, scroll to the “Sixty-Second Summary” at the end. > If you want to know why your shiny text-to-video model still walks through walls or fills Sudoku grids with nine 9s in the same row, read on. 1. Why another benchmark? Existing video scores such as FVD (Fréchet Video Distance) or IS (Inception Score) only ask one question: “Does the clip look realistic to a frozen image classifier?” They ignore three bigger questions: Is the motion physically possible? Does the scene …