Selected Publications
I'm interested in robotics, deep learning, 3D vision, and computer systems. Some papers are highlighted.
Human-oriented Representation Learning for Robotic Manipulation
Mingxiao Huo, Mingyu Ding, Chenfeng Xu, Thomas Tian, Xinghao Zhu, Yao Mu, Lingfeng Sun, Masayoshi Tomizuka, Wei Zhan
RSS, 2024 (Oral)
project page / arXiv / paper
Keywords: Manipulation, Representation Learning, Computer Vision
Sparse Diffusion Policy: A Sparse, Reusable, and Flexible Policy for Robot Learning
Mingxiao Huo*, Yifei Zhang*, Yixiao Wang*, Thomas Tian, Xiang Zhang, Yichen Xie, Chenfeng Xu, Pengliang Ji, Wei Zhan, Mingyu Ding, Masayoshi Tomizuka
CoRL, 2024
project page / arXiv / paper / code
Keywords: Manipulation, Mixture of Experts, Policy Learning
Joint Pedestrian Trajectory Prediction through Posterior Sampling
Haotian Lin, Yixiao Wang, Mingxiao Huo, Chensheng Peng, Zhiyuan Liu, Masayoshi Tomizuka
IROS, 2024 (Oral Pitch)
project page / arXiv
Keywords: Diffusion Model, Motion Prediction
AbHE: All Attention-Based Homography Estimation
Mingxiao Huo, Zhihao Zhang, Xinyang Ren, Xianqiang Yang, Chao Ye
IEEE Transactions on Instrumentation and Measurement (TIM), 2023
code / arXiv
Keywords: 3D Vision, Generative Model
An Empirical Study on Tree-based Speculative Decoding
Mingxiao Huo
2024 (in submission)
code (coming soon)
Keywords: Computer Systems, LLM Serving
Robust Multi-Object 4D Generation for In-the-wild Videos
Wen-Hsuan Chu*, Lei Ke*, Jianmeng Liu*, Mingxiao Huo, Pavel Tokmakov, Katerina Fragkiadaki
CVPR, 2025
arXiv (coming soon)
Keywords: 3D Vision, 4D Reconstruction
Interesting Projects
I have done some interesting projects in computer systems, AI, and mechanical design.
Self-driving trash bin car
Keywords: Mechanical Design, Computer Vision, Control Systems
video
TL;DR: A self-driving car built from scratch, with motors that control its speed and direction. It uses a computer vision algorithm to detect trash and reacts through its controller.
A parallel renderer
Keywords: CUDA, Computer Graphics, Parallel Processing
code
TL;DR: A CUDA-based circle renderer that uses parallel processing.
Tactile audio imitation learning
Keywords: Tactile sensor, Audio processing
TL;DR: An imitation learning pipeline with a tactile sensor and audio feedback.
UDP transport layer
Keywords: Computer Networks
code (coming soon)
TL;DR: A reliable transport layer built on UDP that can transfer content files between multiple peers.