
New Benchmark “Physics-IQ” Challenges AI Video Models’ Understanding of the Physical World

A new benchmark, Physics-IQ, developed jointly by INSAIT and Google DeepMind and led by Saman Motamed, a PhD student at INSAIT, has attracted wide attention across the global AI research community following its presentation at ICCV 2025.
The study represents a major step toward assessing and advancing the physical reasoning capabilities of current generative video models. Physics-IQ comprises 396 real-world videos spanning a wide range of physical phenomena, from fluid dynamics to solid mechanics, and is designed to test whether AI systems can predict motion and physical interactions rather than merely reproduce visual appearance.
The findings reveal a significant gap between perception and understanding: even state-of-the-art models such as Sora, Runway, and VideoPoet can generate visually compelling videos yet fail to accurately capture the underlying physical dynamics.
The research has generated strong interest within the academic community, underscoring the need to integrate interactive and experiential learning into next-generation video AI systems.
The open-source dataset, evaluation code, and results are publicly available.
- Project page: https://physics-iq.github.io/
- Code repository: https://github.com/google-deepmind/physics-IQ-benchmark