
I am a research scientist interested in understanding the fundamental properties of deep learning systems in order to make them more reliable, robust, efficient and broadly beneficial for society. Currently, I am at Google DeepMind working on multimedia provenance and watermarking as part of the SynthID effort.

Prior to that, I completed my PhD with the Autonomous Intelligent Machines and Systems CDT at the University of Oxford, where I was supervised by Philip Torr and Adel Bibi. I was also a research intern at Motional, Adobe, DeepMind and Meta. My thesis established the universal in-context approximation capabilities of sequence models.

Before coming to Oxford, I got my MSc at ETH Zürich focusing on robotics, machine learning, statistics, and applied category theory. My thesis was on Compositional Computational Systems. At ETH, I worked closely with Prof. Emilio Frazzoli's group, and my studies were generously funded by the Excellence Scholarship & Opportunity Programme (ESOP).



Updates

October 2025
New paper from my internship at Meta, where we show that current image watermarking methods fall far short of their theoretical capacity limits, so much better methods should be possible. In the process, we trained ChunkySeal: the largest image watermarking model to date!
October 2025
Our paper studying the coexistence of multiple watermarks in the same image got accepted at NeurIPS 2025!
August 2025
Very excited to be joining the SynthID team at Google DeepMind as a Research Scientist!
May 2025
I joined FAIR as a research intern working on watermarking.
April 2025
New paper from my internship at Google DeepMind: we diagnosed why gisting, a popular method for in-context compression, does not work well and found several fixes that make it much better!
February 2025
Our paper on mitigating fine-tuning risks led by Francisco Eiras was accepted at ICLR 2025!
January 2025
As part of my internship at Adobe, we found that multiple watermarks can coexist in the same image and hence watermarking methods can be ensembled! Details in our new paper.
October 2024
I am starting an internship at Google DeepMind as part of the Efficient Intelligence team.
September 2024
Our paper on universal in-context approximation with fully recurrent models was accepted at NeurIPS!
June 2024
A new paper on how fully recurrent models can also be universal in-context approximators! We also developed a programming language for recurrent models as part of it.
May 2024
I am starting an internship at Adobe working on watermarking as part of their Content Authenticity Initiative.
May 2024
Our paper on universal approximation via prompting pretrained transformers got accepted to ICML 2024! We also had our position paper on the Risks and Opportunities of Open-Source Generative AI accepted with an Oral presentation!
February 2024
In a new paper, we show that prompting a pretrained transformer can be sufficient for universal approximation!
January 2024
Our When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations paper got accepted at ICLR 2024 and received the Entropic Award for most surprising negative result at the I Can't Believe It's Not Better workshop at NeurIPS 2023!
October 2023
We have a new paper on the structural limitations of prompting, in-context learning and prefix-tuning!
October 2023
The Alan Turing Institute’s response to the House of Lords Large Language Models Call for Evidence is now public. I contributed extensively to the questions regarding the capabilities, trends, risks and opportunities of LLMs.
September 2023
Our paper on multilingual tokenization fairness got accepted to NeurIPS 2023!
May 2023
We have a new preprint on tokenization across different languages! Alongside, we are also providing a webpage where you can check how your favorite languages fare compared to English.
April 2023
A new paper accepted at ICML 2023! In it, we propose a generalization of Lipschitz continuity that results in tighter certificates and use them to study what happens when ensembling robust classifiers.
October 2022
Happy to announce we have a new preprint out: Robustness of Unsupervised Representation Learning without Labels.
July 2022
Our paper on efficient safety boundary detection for expensive simulators was accepted at IROS 2022. I had the opportunity to collaborate on it with a great team while at Motional.
May 2021
I am happy to share that I am starting an internship at Motional in Singapore! I will be working on detecting the safe operational envelope of the Motional self-driving stack.
October 2020
I presented the final results of my Master's thesis! The title is Compositional Computational Systems and it deals with the problem of problem-solving. You can find the presentation and the thesis itself under Publications.
July 2020
Our paper, Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents, a collaboration between ETH Zürich, Université de Montréal, and Toyota Technological Institute at Chicago, got accepted at IROS 2020.
March 2020
I started my Master thesis under the supervision of Gioele Zardini and Dr. Andrea Censi in Prof. Emilio Frazzoli's group, at the Institute for Dynamic Systems and Control, ETH Zürich. We are working on a Compositional Computational Theory in the context of Applied Category Theory.
February 2020
I concluded my semester project with Daedalean under the supervision of Prof. Thomas Hofmann. We worked on machine learning models that utilize additional inputs at training time, showed how they can be formulated as an optimization problem over mutual information terms, and explored their implications for model certification.
January 2020
Our paper, Learning Camera Miscalibration Detection, on which I worked at the Autonomous Systems Lab, got accepted at ICRA 2020.

Publications


Full list on Google Scholar.


We Can Hide More Bits: The Unused Watermarking Capacity in Theory and in Practice

Aleksandar Petrov, Pierre Fernandez, Tomáš Souček, Hady Elsahar

[Abstract] [arXiv]

Despite rapid progress in deep learning-based image watermarking, the capacity of current robust methods remains limited to the scale of only a few hundred bits. Such plateauing progress raises the question: How far are we from the fundamental limits of image watermarking? To this end, we present an analysis that establishes upper bounds on the message-carrying capacity of images under PSNR and linear robustness constraints. Our results indicate that theoretical capacities are orders of magnitude larger than what current models achieve. Our experiments show that this gap between theoretical and empirical performance persists even in minimal, easily analysable setups, suggesting a fundamental limitation of current approaches. As proof that larger capacities are indeed possible, we train ChunkySeal, a scaled-up version of VideoSeal, which quadruples the capacity to 1024 bits while preserving image quality and robustness. These findings demonstrate that modern methods have not yet saturated watermarking capacity, and that significant opportunities remain in architectural innovation and training strategies.
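
As a rough intuition for why such upper bounds dwarf current payloads, one can treat each pixel as a power-constrained Gaussian channel. The back-of-envelope calculation below is purely illustrative, assumes an additive-noise attack, and is not the bound derived in the paper.

```latex
% Illustrative AWGN sketch only, not the paper's bound.
% A PSNR constraint fixes the per-pixel distortion power D available to the watermark:
\mathrm{PSNR} = 10\log_{10}\frac{\mathrm{MAX}^2}{D}
  \;\Rightarrow\;
  D = \frac{\mathrm{MAX}^2}{10^{\mathrm{PSNR}/10}} \approx 6.5
  \quad (\mathrm{MAX}=255,\ \mathrm{PSNR}=40\,\mathrm{dB}).
% Against additive noise of variance \sigma^2 = 25, the Shannon capacity per pixel is
C = \tfrac{1}{2}\log_2\!\left(1 + \frac{D}{\sigma^2}\right) \approx 0.17\ \text{bits},
% so a 256x256 grayscale image could in principle carry roughly
% 0.17 \times 65536 \approx 11000 bits, far beyond a few hundred.
```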


Long Context In-Context Compression by Getting to the Gist of Gisting

Aleksandar Petrov, Mark Sandler, Andrey Zhmoginov, Nolan Miller, Max Vladymyrov

[Abstract] [arXiv]

Long context processing is critical for the adoption of LLMs, but existing methods often introduce architectural complexity that hinders their practical deployment. Gisting, an in-context compression method with no architectural modification to the decoder transformer, is a promising approach due to its simplicity and compatibility with existing frameworks. While effective for short instructions, we demonstrate that gisting struggles with longer contexts, with significant performance drops even at minimal compression rates. Surprisingly, a simple average pooling baseline consistently outperforms gisting. We analyze the limitations of gisting, including information flow interruptions, capacity limitations, and the inability to restrict its attention to subsets of the context. Motivated by theoretical insights into the performance gap between gisting and average pooling, and supported by extensive experimentation, we propose GistPool, a new in-context compression method. GistPool preserves the simplicity of gisting while significantly boosting its performance on long context compression tasks.
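
For reference, the average pooling baseline mentioned above fits in a few lines. The PyTorch sketch below compresses a cache of per-token states along the time axis; it illustrates the baseline, not the GistPool method itself.

```python
import torch

def avg_pool_compress(states: torch.Tensor, rate: int) -> torch.Tensor:
    """Average-pool per-token states (e.g., keys or values) along time.

    states: (seq_len, d) cached activations
    rate:   compression rate; every `rate` consecutive states become one
    Note: the last chunk is zero-padded, which slightly biases its mean.
    """
    seq_len, d = states.shape
    pad = (-seq_len) % rate
    if pad:
        states = torch.cat([states, states.new_zeros(pad, d)])
    return states.view(-1, rate, d).mean(dim=1)

ctx = torch.randn(1000, 64)                  # stand-in for a cached context
print(avg_pool_compress(ctx, rate=4).shape)  # torch.Size([250, 64])
```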


On the Coexistence and Ensembling of Watermarks

Aleksandar Petrov, Shruti Agarwal, Philip H.S. Torr, Adel Bibi, John Collomosse

Conference on Neural Information Processing Systems (NeurIPS) 2025

[Abstract] [arXiv]

Watermarking, the practice of embedding imperceptible information into media such as images, videos, audio, and text, is essential for intellectual property protection, content provenance and attribution. The growing complexity of digital ecosystems necessitates watermarks for different uses to be embedded in the same media. However, to detect and decode all watermarks, they need to coexist well with one another. We perform the first study of coexistence of deep image watermarking methods and, contrary to intuition, we find that various open-source watermarks can coexist with only minor impacts on image quality and decoding robustness. The coexistence of watermarks also opens the avenue for ensembling watermarking methods. We show how ensembling can increase the overall message capacity and enable new trade-offs between capacity, accuracy, robustness and image quality, without needing to retrain the base models.
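
Conceptually, ensembling reduces to sequential embedding and independent decoding. The sketch below uses hypothetical embed/decode interfaces standing in for any two watermarkers; it illustrates the idea rather than reproducing code from the paper.

```python
# `wm_a` and `wm_b` are placeholders for any two image watermarking models
# exposing embed(image, bits) -> image and decode(image) -> bits
# (a hypothetical interface, not a specific library's API).

def ensemble_embed(image, bits_a, bits_b, wm_a, wm_b):
    """Embed two watermarks sequentially; payload = len(bits_a) + len(bits_b)."""
    return wm_b.embed(wm_a.embed(image, bits_a), bits_b)

def ensemble_decode(image, wm_a, wm_b):
    """If the two methods coexist well, both messages decode independently."""
    return wm_a.decode(image), wm_b.decode(image)
```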


Universal In-Context Approximation By Prompting Fully Recurrent Models

Aleksandar Petrov, Tom A. Lamb, Alasdair Paren, Philip H.S. Torr, Adel Bibi

Conference on Neural Information Processing Systems (NeurIPS) 2024

[Abstract] [arXiv] [Code: LSRL compiler]

Zero-shot and in-context learning enable solving tasks without model fine-tuning, making them essential for developing generative model solutions. Therefore, it is crucial to understand whether a pretrained model can be prompted to approximate any function, i.e., whether it is a universal in-context approximator. While it was recently shown that transformer models do possess this property, these results rely on their attention mechanism. Hence, these findings do not apply to fully recurrent architectures like RNNs, LSTMs, and the increasingly popular SSMs. We demonstrate that RNNs, LSTMs, GRUs, Linear RNNs, and linear gated architectures such as Mamba and Hawk/Griffin can also serve as universal in-context approximators. To streamline our argument, we introduce a programming language called LSRL that compiles to these fully recurrent architectures. LSRL may be of independent interest for further studies of fully recurrent models, such as constructing interpretability benchmarks. We also study the role of multiplicative gating and observe that architectures incorporating such gating (e.g., LSTMs, GRUs, Hawk/Griffin) can implement certain operations more stably, making them more viable candidates for practical in-context universal approximation.
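
One intuition behind the gating observation: a multiplicative gate can overwrite the recurrent state exactly, whereas a purely additive linear recurrence must approximate such conditional updates. The step below is a schematic illustration in this spirit (not LSRL, and simplified relative to any real architecture).

```python
import numpy as np

def gated_step(h, x, A, B, W_g):
    """One schematic step of a multiplicatively gated linear recurrence.

    With gate g = sigmoid(W_g @ x), the update interpolates between keeping
    the (transformed) state and overwriting it with the input. As g saturates
    to 0 or 1, the overwrite becomes exact: the kind of operation that gated
    architectures can implement stably.
    """
    g = 1.0 / (1.0 + np.exp(-(W_g @ x)))
    return g * (A @ h) + (1.0 - g) * (B @ x)
```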


Risks and Opportunities of Open-Source Generative AI

Francisco Eiras, Aleksandar Petrov, Bertie Vidgen, Christian Schroeder, Fabio Pizzati, Katherine Elkins, Supratik Mukhopadhyay, Adel Bibi, Aaron Purewal, Csaba Botos, Fabro Steibel, Fazel Keshtkar, Fazl Barez, Genevieve Smith, Gianluca Guadagni, Jon Chun, Jordi Cabot, Joseph Imperial, Juan Arturo Nolazco, Lori Landay, Matthew Jackson, Philip H.S. Torr, Trevor Darrell, Yong Lee, Jakob Foerster

International Conference on Machine Learning (ICML) 2024

[Abstract] [arXiv]

Applications of Generative AI (Gen AI) are expected to revolutionize a number of different areas, ranging from science & medicine to education. The potential for these seismic changes has triggered a lively debate about the potential risks of the technology, and resulted in calls for tighter regulation, in particular from some of the major tech companies leading AI development. This regulation is likely to put at risk the budding field of open-source generative AI. Using a three-stage framework for Gen AI development (near, mid and long-term), we analyze the risks and opportunities of open-source generative AI models with similar capabilities to the ones currently available (near to mid-term) and with greater capabilities (long-term). We argue that, overall, the benefits of open-source Gen AI outweigh its risks. As such, we encourage the open sourcing of models, training and evaluation data, and provide a set of recommendations and best practices for managing risks associated with open-source generative AI.


Prompting a Pretrained Transformer Can Be a Universal Approximator

Aleksandar Petrov, Philip H.S. Torr, Adel Bibi

International Conference on Machine Learning (ICML) 2024

[Abstract] [arXiv]

Despite the widespread adoption of prompting, prompt tuning and prefix-tuning of transformer models, our theoretical understanding of these fine-tuning methods remains limited. A key question is whether one can arbitrarily modify the behavior of a pretrained model by prompting or prefix-tuning it. Formally, can prompting and prefix-tuning a pretrained model universally approximate sequence-to-sequence functions? This paper answers in the affirmative and demonstrates that much smaller pretrained models than previously thought can be universal approximators when prefixed. In fact, the attention mechanism is uniquely suited for universal approximation, with prefix-tuning of a single attention head being sufficient to approximate any continuous function. Moreover, any sequence-to-sequence function can be approximated by prefixing a transformer with depth linear in the sequence length. Beyond these density-type results, we also offer Jackson-type bounds on the length of the prefix needed to approximate a function to a desired precision.


When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations

Aleksandar Petrov, Philip H.S. Torr, Adel Bibi

International Conference on Learning Representations (ICLR) 2024

[Abstract] [arXiv]

Context-based fine-tuning methods, including prompting, in-context learning, soft prompting (also known as prompt tuning), and prefix-tuning, have gained popularity due to their ability to often match the performance of full fine-tuning with a fraction of the parameters. Despite their empirical successes, there is little theoretical understanding of how these techniques influence the internal computation of the model and their expressiveness limitations. We show that despite the continuous embedding space being more expressive than the discrete token space, soft prompting and prefix-tuning are strictly less expressive than full fine-tuning, even with the same number of learnable parameters. Concretely, context-based fine-tuning cannot change the relative attention pattern over the content and can only bias the outputs of an attention layer in a fixed direction. This suggests that while techniques like prompting, in-context learning, soft prompting, and prefix-tuning can effectively elicit skills present in the pretrained model, they cannot learn novel tasks that require new attention patterns.
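
The core structural observation can be stated in one line. For a single head and a single prefix token (a schematic form; see the paper for the general statement), attention over the prefixed sequence is a convex combination of the content-only attention and a fixed direction:

```latex
% One prefix token with key k_p and value v_p; content keys/values (k_i, v_i).
\mathrm{Attn}\bigl(q;\,[p, X]\bigr)
  = \lambda(q)\, v_p + \bigl(1 - \lambda(q)\bigr)\,\mathrm{Attn}(q;\, X),
\qquad
\lambda(q) = \frac{e^{q^\top k_p}}{e^{q^\top k_p} + \sum_i e^{q^\top k_i}}.
% The prefix can only pull the output toward the fixed vector v_p; the
% relative attention pattern among the content tokens is unchanged.
```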


Language Model Tokenizers Introduce Unfairness Between Languages

Aleksandar Petrov, Emanuele La Malfa, Philip H.S. Torr, Adel Bibi

Conference on Neural Information Processing Systems (NeurIPS) 2023

[Abstract] [arXiv] [Project website]

Recent language models have shown impressive multilingual performance, even when not explicitly trained for it. Despite this, concerns have been raised about the quality of their outputs across different languages. In this paper, we show how disparity in the treatment of different languages arises at the tokenization stage, well before a model is even invoked. The same text translated into different languages can have drastically different tokenization lengths, with differences up to 15 times in some cases. These disparities persist across the 17 tokenizers we evaluate, even if they are intentionally trained for multilingual support. Character-level and byte-level models also exhibit over 4 times the difference in the encoding length for some language pairs. This induces unfair treatment for some language communities in regard to the cost of accessing commercial language services, the processing time and latency, as well as the amount of content that can be provided as context to the models. Therefore, we make the case that we should train future language models using multilingually fair tokenizers.
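
Measuring this disparity requires nothing more than encoding parallel text. In the sketch below, the tokenizer name and the translated sentences are arbitrary examples, not the paper's evaluation setup.

```python
from transformers import AutoTokenizer

# Any tokenizer can be plugged in; this model name is just an example.
tok = AutoTokenizer.from_pretrained("xlm-roberta-base")

# The same sentence in several languages (illustrative translations).
texts = {
    "English":   "How many tokens does this sentence need?",
    "German":    "Wie viele Tokens braucht dieser Satz?",
    "Bulgarian": "Колко токена са нужни на това изречение?",
}

base = len(tok.encode(texts["English"]))
for lang, text in texts.items():
    n = len(tok.encode(text))
    print(f"{lang:10s} {n:3d} tokens ({n / base:.2f}x English)")
```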


Certifying Ensembles: A General Certification Theory with S-Lipschitzness

Aleksandar Petrov*, Francisco Eiras, Amartya Sanyal, Philip H.S. Torr, Adel Bibi*

International Conference on Machine Learning (ICML) 2023

[Abstract] [arXiv]

Improving and guaranteeing the robustness of deep learning models has been a topic of intense research. Ensembling, which combines several classifiers to provide a better model, has been shown to be beneficial for generalisation, uncertainty estimation, calibration, and mitigating the effects of concept drift. However, the impact of ensembling on certified robustness is less well understood. In this work, we generalise Lipschitz continuity by introducing S-Lipschitz classifiers, which we use to analyse the theoretical robustness of ensembles. Our results give precise conditions under which ensembles of robust classifiers are more robust than any constituent classifier, as well as conditions under which they are less robust.
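
For context, the classical certificate that S-Lipschitzness generalizes is the standard margin argument (textbook form, not the paper's generalized statement):

```latex
% If every logit f_c is L-Lipschitz and a = argmax_c f_c(x) with runner-up b, then
\|\delta\| < \frac{f_a(x) - f_b(x)}{2L}
\;\Longrightarrow\;
\arg\max_c f_c(x + \delta) = a,
% since a perturbation of norm \|\delta\| can lower f_a and raise f_b by at
% most L\|\delta\| each, so the margin stays positive.
```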


HiddenGems: Efficient safety boundary detection with active learning

Aleksandar Petrov, Carter Fang, Khang Minh Pham, You Hong Eng, James Guo Ming Fu, Scott Drew Pendleton

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022

[Abstract] [arXiv] [Presentation]

Evaluating safety performance in a resource-efficient way is crucial for the development of autonomous systems. Simulation of parameterized scenarios is a popular testing strategy but parameter sweeps can be prohibitively expensive. To address this, we propose HiddenGems: a sample-efficient method for discovering the boundary between compliant and non-compliant behavior via active learning. Given a parameterized scenario, one or more compliance metrics, and a simulation oracle, HiddenGems maps the compliant and non-compliant domains of the scenario. The methodology enables critical test case identification, comparative analysis of different versions of the system under test, as well as verification of design objectives. We evaluate HiddenGems on a scenario with a jaywalker crossing in front of an autonomous vehicle and obtain compliance boundary estimates for collision, lane keep, and acceleration metrics individually and in combination, with 6 times fewer simulations than a parameter sweep. We also show how HiddenGems can be used to detect and rectify a failure mode for an unprotected turn with 86% fewer simulations.
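
A generic version of such a boundary-seeking loop fits in a few lines. The sketch below uses a Gaussian process classifier with uncertainty sampling as stand-ins; it conveys the active-learning idea but is not the published HiddenGems algorithm.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

def boundary_search(simulate, sample_params, n_init=20, n_iter=100, n_cand=500):
    """Actively map a compliance boundary with few calls to the oracle.

    simulate(p) -> bool is the expensive oracle (True = compliant);
    sample_params(n) draws n points from the scenario parameter space.
    Assumes the initial sample contains both compliant and non-compliant points.
    """
    X = np.asarray(sample_params(n_init))
    y = np.array([simulate(p) for p in X])
    for _ in range(n_iter):
        gp = GaussianProcessClassifier().fit(X, y)
        cand = np.asarray(sample_params(n_cand))
        prob = gp.predict_proba(cand)[:, 1]
        pick = cand[np.argmin(np.abs(prob - 0.5))]  # most uncertain point
        X = np.vstack([X, pick])
        y = np.append(y, simulate(pick))
    return X, y
```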



Compositional Computational Systems

Aleksandar Petrov, supervised by Gioele Zardini, Andrea Censi, Emilio Frazzoli

Master's thesis

[Abstract] [Thesis] [Slides] [Presentation]

We propose a collection of formal definitions for problems and solutions, and study the relationships between the two. Problems and solutions can be represented as morphisms in two categories, and the structure of problem reduction and problem-solving has the properties of a heteromorphic twisted arrow category (a generalization of the twisted arrow category) defined on them. Lagado, a compositional computational system built on a type-theoretic foundation that accounts for the resources required for computation, is provided as an example.
This thesis furthermore provides the universal conditions for defining any compositional computational system. We argue that any problem can be represented as a function from the product of hom-sets of two semicategories to a rig (a kinded function) and that any procedure can also be represented as a similar kinded function. Combining all problems and procedures defined over the same subcategory of SemiCat via a solution judgment map results in a heteromorphic twisted arrow category called Laputa, which automatically provides problem-reducing and problem-solving properties.
The thesis illustrates the practical application of the theory of compositional computational systems by studying the representation of co-design problems from the theory of mathematical co-design as part of several different compositional computational systems. In the process, new results on the conditions for the solvability of co-design problems and their compositional category-theoretical properties are also presented.



Integrated Benchmarking and Design for Reproducible and Accessible Evaluation of Robotic Agents

Jacopo Tani, Andrea F. Daniele, Gianmarco Bernasconi, Amaury Camus, Aleksandar Petrov, Anthony Courchesne, Bhairav Mehta, Rohit Suri, Tomasz Zaluska, Matthew R. Walter, Emilio Frazzoli, Liam Paull, Andrea Censi

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020

[Abstract] [arXiv]

As robotics matures and increases in complexity, it is more necessary than ever for robot autonomy research to be “reproducible”. Compared to other sciences, there are specific challenges to benchmarking autonomy, such as the complexity of the software stacks, the variability of the hardware and the reliance on data-driven techniques, amongst others. In this paper we describe a new concept for reproducible research, in which development and benchmarking are integrated, so that reproducibility is obtained “by design” from the beginning of the research/development processes. We provide the overall conceptual objectives to achieve this goal and then provide a concrete instance that we have built, the DUCKIENet. One of the central components of this setup is the Duckietown Autolab, a remotely accessible standardized setup that is itself also relatively low cost and reproducible. When evaluating agents, careful definition of interfaces allows users to choose among local vs. remote evaluation using simulation, logs, or the remote automated hardware setups. We validate the system by analyzing the repeatability of experiments run using the infrastructure and show that there is low variance across different robot hardware and across different remote labs.


Learning Camera Miscalibration Detection

Andrei Cramariuc*, Aleksandar Petrov*, Rohit Suri, Mayank Mittal, Roland Siegwart, Cesar Cadena

IEEE International Conference on Robotics and Automation (ICRA) 2020

[Abstract] [arXiv] [Code]

Self-diagnosis and self-repair are some of the key challenges in deploying robotic platforms for long-term real-world applications. One of the issues that can occur to a robot is miscalibration of its sensors due to aging, environmental transients, or external disturbances. Precise calibration lies at the core of a variety of applications, due to the need to accurately perceive the world. However, while a lot of work has focused on calibrating the sensors, not much has been done towards identifying when a sensor needs to be recalibrated. This paper focuses on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras. Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric. Additionally, by training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline to identify whether a recalibration of the camera's intrinsic parameters is required or not.
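
One simple way to quantify intrinsic miscalibration (an illustrative measure, not necessarily the metric proposed in the paper) is the mean pixel displacement when projecting the same points with the true and the drifted intrinsics:

```python
import numpy as np

def mean_pixel_shift(K_true, K_est, pts_3d):
    """Mean reprojection displacement between two intrinsic matrices.

    pts_3d: (N, 3) points in the camera frame; pinhole model, no distortion.
    """
    def project(K, P):
        uvw = (K @ P.T).T
        return uvw[:, :2] / uvw[:, 2:3]
    return np.linalg.norm(project(K_true, pts_3d) - project(K_est, pts_3d),
                          axis=1).mean()

K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
K_bad = K.copy()
K_bad[0, 0] *= 1.05  # simulate a 5% focal-length drift
pts = np.random.uniform([-1, -1, 2], [1, 1, 6], size=(1000, 3))
print(f"mean shift: {mean_pixel_shift(K, K_bad, pts):.2f} px")
```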