Qinghua Liu

I study model training at OpenAI.

I received my Ph.D. from Princeton, advised by Chi Jin. My Ph.D. research focused on the theoretical foundations of reinforcement learning (RL), including partially observable RL (POMDP, PSR, Survey), multi-agent RL (Stochastic Game), and RL with large state spaces (Function Approximation). After that, I spent a wonderful year as a postdoctoral researcher at Microsoft Research NYC, where I worked on language model research.

During summer 2022, I interned at DeepMind, working with Csaba Szepesvári and Gellért Weisz. Previously, I received a B.E. degree in Electrical Engineering and a B.S. degree in Mathematics from Tsinghua University.

[Google Scholar]

Models


I've co-trained GPT-5–series models (e.g. GPT-5, GPT-5.1) for multi-personality / personalization. Highlighted in


Selected Papers

(α-β order) denotes alphabetical authorship ordering