Ruizhe Shi



About me

I am a first-year Ph.D. student in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. I'm fortunate to be advised by Simon Du and Banghua Zhu. Previously, I received my bachelor's degree in Computer Science (Yao Class), with a minor in Literature, from Tsinghua University.


Contact



Selected Papers

Understanding the Performance Gap in Preference Learning: A Dichotomy of RLHF and DPO.
Ruizhe Shi, Minhak Song, Runlong Zhou, Zihan Zhang, Maryam Fazel, Simon S. Du

Decoding-Time Language Model Alignment with Multiple Objectives.
Ruizhe Shi, Yifang Chen, Yushi Hu, Alisa Liu, Hannaneh Hajishirzi, Noah A. Smith, Simon S. Du

Rethinking Transformers in Solving POMDPs.
Chenhao Lu, Ruizhe Shi, Yuyao Liu, Kaizhe Hu, Simon S. Du, Huazhe Xu


Notes and Slides

Understanding the gaps between two-stage and direct preference-based policy learning. [slide]

The crucial role of samplers in online direct preference optimization. [slide] [recording]

Logit mixing and RLHF paper reading. [slide]

Decoding-time language model alignment with multiple objectives. [slide] [recording]