I am a Postdoctoral Researcher in TVG @ Oxford.
I completed my Ph.D. at Show Lab @ NUS.
I work in Vision+Language, Video Understanding, and Intelligent Agents.
🌐 Homepage: qhlin.me
📧 Email: [email protected]
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
[NeurIPS 2025 D&B] Open-source Multi-agent Poster Generation from Papers
[ICLR & NeurIPS 2025] Repository for the Show-o series: One Single Transformer to Unify Multimodal Understanding and Generation.
Automatic Video Generation from Scientific Papers
Out-of-the-box (OOTB) GUI Agent for Windows and macOS