My research interests encompass generative AI and multimodal foundation models, such as diffusion models, vision-language models, and representation learning. I am interested in exploring their interpretability, controllability, and unification.
SpaceTools empowers VLMs with vision and robotic tools for spatial reasoning via Double Interactive Reinforcement Learning (DIRL), enabled by our Toolshed infrastructure. It achieves state-of-the-art performance on spatial reasoning benchmarks and enables precise real-world robot manipulation.
We introduce a novel framework for auditing and improving the robustness of unlearned diffusion models by proposing an interpretable subspace attack that reveals latent vulnerabilities and inspires a corresponding projection-based defense to surgically remove them.
We enable localized, transferable, linear, and composable image editing on diffusion models by exploring their low-rank and locally linear semantic spaces.
Inspired by physical motion, we unfold a video clip via Taylor expansion and design an alternative algorithm for self-supervised video representation learning. Our proposed method can steer the model to dynamic parts in the video.
We provide theoretical insights into the connection between diffusion models and subspace clustering, which sheds light on the transition of diffusion models from memorization to generalization.
Self-designed Game: Asylum 7
Siyi Chen, Yigao Fang, Dawei Wang, Zhongqian Duan, Ruipu Li
Course Project, University of Michigan, 2022
Advisor: Austin Yarger
Play it here
As a team of five, we designed a horror game, Asylum 7, with Unity.
We generate a 3D synthetic video dataset containing a moving object and a scene. The pose and position of the object are optimized via a differentiable renderer.
We predict 3D partial human poses as SMPL meshes, predict 2D plane masks as well as 3D articulation information, and use a differentiable renderer to optimize the position and pose of the person, taking 3D spatial interactions into account.
We designed and implemented beta versions of tools for enumerating origamis in H(2), using SageMath to implement the convexity test presented by Lelievre and Weiss.
Design A Roller Coaster
Siyi Chen, Yigao Fang, Qi Shen
Gold Medal Winner (Top 2%), The University Physics Competition 2019
paper
We devise a rule to evaluate the safety and difficulty level of roller coasters, propose a novel roller coaster model, and give a thorough analysis based on Euler's method and natural axes.