I'm a third-year PhD student in computer science at Cornell, where I'm fortunate to be advised by Raaz Dwivedi and Kilian Q. Weinberger.
I'm currently working on using distribution compression (a.k.a. "thinning") to speed up training and inference of large language models (LLMs). My longer-term goal is to enable LLM agents to efficiently search over large and dynamic datastores.
Tl;dr: Developed an agentic workflow for multi-step question answering that coordinates multiple LLM calls to achieve stronger reasoning than vanilla Chain-of-Thought.
Tl;dr: Developed a new analysis of thinning algorithms that adapts to low-rank structure, enabling faster dot-product attention in Transformers (Thinformer), stochastic gradient descent (KH-SGD), and deep kernel hypothesis testing (DeepCTT).