Mariya Hendriksen

I am an intern on the Game Intelligence team at Microsoft Research Cambridge. I completed my PhD at the University of Amsterdam, where I worked on multimodal machine learning for information retrieval under the supervision of Maarten de Rijke and Paul Groth.

I hold a Master's degree in Artificial Intelligence from KU Leuven and a Bachelor's degree in Computational Linguistics from Novosibirsk State University. Throughout my academic journey, I've interned at several labs, including the Gemini team at Google, Bloomberg AI, Amazon Alexa, LIIR at KU Leuven, and ETH Zurich.

Alongside my research, I am committed to fostering diverse and inclusive research communities. To that end, I serve as General Chair for WiML at ICML 2025 and mentor through the Inclusive AI initiative.

Email / Scholar / GitHub / Bsky / X

News

Publications

(Apr 2025) Paper 'Benchmark Granularity and Model Robustness for Image-Text Retrieval' accepted at SIGIR 2025.
(Jul 2024) Paper 'Demonstrating and Reducing Shortcuts in Vision-Language Representation Learning' accepted at TMLR.
(Dec 2023) Paper 'Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control' accepted at ECIR 2024.
(Dec 2022) Paper 'Scene-centric vs. Object-centric Image-Text Cross-modal Retrieval: A Reproducibility Study' accepted at ECIR 2023.

Milestones & Activities

(Apr 2025) Serving as the General Chair for the Women in Machine Learning symposium at ICML 2025.
(Sep 2024) Started a research internship at Microsoft Research Cambridge on the Game Intelligence team.
(Apr 2024) Started a research internship at Google Zurich on the Gemini team.
(Jul 2023) Started a research internship at Bloomberg AI in London on the Question Answering team.

Research

	Adapting Vision-Language Models for Evaluating World Models. Mariya Hendriksen, Tabish Rashid, David Bignell, Raluca Georgescu, Abdelhak Lemkhenter, Katja Hofmann, Sam Devlin, Sarah Parisot. Under Submission, 2025. We address the challenge of automated evaluation for world model rollouts by introducing a structured protocol and UNIVERSE, a method for adapting vision-language models through unified fine-tuning to assess temporal and semantic fidelity.
	Demonstrating and Reducing Shortcuts in Vision-Language Representation Learning. Maurits Bleeker, Mariya Hendriksen, Andrew Yates, Maarten de Rijke (co-first author). TMLR, 2024. arXiv / bibtex / GitHub. We propose a framework to examine the shortcut learning problem in vision-language contrastive representation learning with multiple captions per image. We show how this problem can be partially mitigated using a form of text reconstruction and implicit feature modification.
	Multimodal Learned Sparse Retrieval with Probabilistic Expansion Control. Thong Nguyen, Mariya Hendriksen, Andrew Yates, Maarten de Rijke (co-first author). ECIR, 2024. arXiv / GitHub. We propose a framework for multimodal learned sparse retrieval.
	Scene-Centric vs. Object-Centric Image-Text Cross-Modal Retrieval: A Reproducibility Study. Mariya Hendriksen, Svitlana Vakulenko, Ernst Kuiper, Maarten de Rijke. ECIR, 2023. arXiv / GitHub
	Extending CLIP for Category-to-Image Retrieval in E-commerce. Mariya Hendriksen, Maurits Bleeker, Svitlana Vakulenko, Nanne van Noord, Maarten de Rijke. ECIR, 2022. arXiv / GitHub
	Analyzing and Predicting Purchase Intent in E-commerce: Anonymous vs. Identified Customers. Mariya Hendriksen, Ernst Kuiper, Pim Nauts, Sebastian Schelter, Maarten de Rijke. SIGIR eCom, 2020. arXiv

Built upon Jon Barron's template.