Dr. Nikola Popovic

Research scientist

Research interests:
My research focuses on building next-generation spatial intelligence systems that can understand and act in complex environments through multi-modal reasoning. My goal is to enable transformative advances in human-made environments through augmented assistance, robotics, and intelligent automation. To achieve this, I investigate how large-scale 3D scenes can be represented and semantically understood, and how language can be used to interact with these spaces. I am also exploring how to integrate digital twins of human-made environments to extend spatial understanding beyond static perception. Furthermore, I seek to understand how AI systems can dynamically react to changes in the state of the world by leveraging video and other temporal data streams. Finally, I am interested in developing approaches for multi-modal fusion — combining 3D maps, video feeds, digital twins, and operational documents — to achieve continuous, online spatio-temporal reasoning.

Background:
Before joining INSAIT, I completed my PhD at the Computer Vision Lab at ETH Zurich. During my PhD studies, I worked on: improving implicit neural 3D representations of indoor scenes; model-aware 3D eye gaze tracking through weak supervision; spatially multi-conditional image generation; and compact and efficient multi-task learning. Near the end of my PhD studies, I completed a research scientist internship at Meta Reality Labs in Zurich, where I worked on implicit neural representations of dynamic 3D scenes. Before my PhD studies, I was a full-time teaching assistant at the University of Belgrade, School of Electrical Engineering, where I also completed my MSc and BSc studies, specializing in signal processing and control theory. I am also a long-time member of the organizing committee of the PSIML summer school on AI in Serbia.

2026

Yue Li, Qi Ma, Runyi Yang, Mengjiao Ma, Bin Ren, Nikola Popovic, Nicu Sebe, Theo Gevers, Luc Van Gool, Danda Pani Paudel, Martin R. Oswald
Chorus: Multi-Teacher Pretraining for Holistic 3D Gaussian Scene Encoding
In: The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)

2025

M. Eren Akbiyik, Nedko Savov, Danda Pani Paudel, Nikola Popovic, Christian Vater, Otmar Hilliges, Luc Van Gool, Xi Wang
Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction
In: The International Conference on Learning Representations (ICLR 2025)

Yue Li, Qi Ma, Runyi Yang, Huapeng Li, Mengjiao Ma, Bin Ren, Nikola Popovic, Nicu Sebe, Ender Konukoglu, Theo Gevers, Luc Van Gool, Martin R. Oswald, Danda Pani Paudel
SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
In: International Conference on Computer Vision (ICCV 2025)

Mengjiao Ma, Qi Ma, Yue Li, Jiahuan Cheng, Runyi Yang, Bin Ren, Nikola Popovic, Mingqiang Wei, Nicu Sebe, Ender Konukoglu, Luc Van Gool, Theo Gevers, Martin R. Oswald, Danda Pani Paudel
GaussianWorld: A Large Dataset and Comprehensive Benchmark for Language Gaussian Splatting
In: NeurIPS 2025 (Datasets and Benchmarks Track)

2023

Nikola Popovic, Dimitrios Christodoulou, Danda Pani Paudel, Xi Wang, Luc Van Gool
Model-aware 3D Eye Gaze from Weak and Few-shot Supervisions
In: The International Symposium on Mixed and Augmented Reality (ISMAR 2023)

Nikola Popovic, Danda Pani Paudel, Thomas Probst, Luc Van Gool
Token-Consistent Dropout For Calibrated Vision Transformers
In: International Conference on Image Processing (ICIP 2023)

Nikola Popovic, Danda Pani Paudel, Luc Van Gool
Surface Normal Clustering for Implicit Representation of Manhattan Scenes
In: International Conference on Computer Vision (ICCV 2023)