I am an Assistant Professor in AI at the University of Liverpool. My primary research focuses on building intelligent agent systems capable of human-like learning, reasoning, and decision-making, spanning natural language processing, reinforcement learning, social and responsible AI, and embodied agents.
I received my Ph.D. from the University of Technology Sydney, advised by Prof. Dacheng Tao, and then worked as a postdoc with Prof. Trevor Cohn at the University of Melbourne.
Previously, I was a research scientist / intern at Tencent Robotics X / AI, CSIRO, and Microsoft Research Asia.
I am also a Visiting Professor of Mathematics and Computer Science at Eindhoven University of Technology (TU/e).
Multiple papers accepted to AAAI 2026, covering Vision-Language Reasoning for Geolocalization (Geo-R), reinforcement learning and large language models.
MathOdyssey: Benchmarking mathematical problem-solving skills in large language models using Odyssey Math Data appears in Scientific Data. It releases a novel benchmark and dataset for evaluating the mathematical reasoning and problem-solving abilities of LLMs, and is used by leading AI labs (e.g. Google Gemini).
I am open to taking on new PhD students (see also PhD opportunities). Feel free to email me with your CV, transcripts, and a short research proposal. I have limited supervision capacity but am always happy to consider good ideas.
Research
Projects
Text-based games TL;DR: We consider language understanding and reasoning for agents in text-based games. Keywords: responsible AI, knowledge graphs, attention, RL, hierarchical RL. [project page]
Conversational AI TL;DR: We consider chatbots for dialogue generation and reasoning. Keywords: language generation, persona. [project page]
Question & Answering TL;DR: We consider the reasoning process for question-answering problems. Keywords: retrieval-augmented generation, open domain, knowledge graphs, graph neural networks. [project page]
Reinforcement learning TL;DR: We propose new agents and environments for robotics and game AI. Keywords: sparse/delayed rewards, sample efficiency, multi-goal RL, continual learning. [project page]