Junyi Zhang (@junyi42) / X

Junyi Zhang

210 posts

@junyi42

CS Ph.D. student @Berkeley_AI. B.Eng. in CS @SJTU1896. Previously with @GoogleDeepMind, @MSFTResearch. Vision, generative models, robotics.

Joined July 2022

Pinned
Junyi Zhang
@junyi42
Mar 9
One memory can’t rule them all. We present LoGeR, a new hybrid memory architecture for long-context geometric reconstruction. LoGeR enables stable reconstruction over up to 10k frames / kilometer scale, with
Junyi Zhang
@junyi42
Oct 7, 2024
Excited to share MonST3R! -- a simple way to estimate geometry from unposed videos of dynamic scenes. We achieve competitive results on several downstream tasks (video depth, camera pose) and believe this is a promising step toward feed-forward 4D reconstruction: monst3r-project.github.io
Junyi Zhang
@junyi42
Feb 11, 2025
MonST3R has been accepted to ICLR'25 as a Spotlight! We have also added a fully feed-forward reconstruction mode that runs in real time on video input (samples at: monst3r-paper.github.io/page0.html). More details here: github.com/Junyi42/monst3…
Junyi Zhang
@junyi42
Apr 21, 2025
Introducing St4RTrack!🖖 Simultaneous 4D Reconstruction and Tracking in the world coordinate frame, in a feed-forward manner, just by changing the meaning of two pointmaps! st4rtrack.github.io
Junyi Zhang
@junyi42
Oct 21, 2024
Code for inference, visualization, training, and evaluation is released!
GitHub - Junyi42/monst3r: Official Implementation of paper "MonST3R: A Simple Approach for Estima...
Junyi Zhang
@junyi42
May 21, 2025
Very impressive! At VideoMimic.net, we already learn from 3rd-person human videos + RL for locomotion. Excited to see where this path goes next!
Milan Kovac
@_milankovac_
May 21, 2025
One of our goals is to have Optimus learn straight from internet videos of humans doing tasks. Those are often 3rd person views captured by random cameras etc. We recently had a significant breakthrough along that journey, and can now transfer a big chunk of the learning
Junyi Zhang
@junyi42
May 7, 2025
Humanoids need to perceive the environment in the real world. Using 4D reconstruction techniques, we turn casual human videos into training data for an environment-aware humanoid policy. Super excited to share: VideoMimic.net
Arthur Allshire
@arthurallshire
May 7, 2025
our new system trains humanoid robots using data from cell phone videos, enabling skills such as climbing stairs and sitting on chairs in a single policy (w/ @redstone_hong @junyi42 @davidrmcall)
Junyi Zhang
@junyi42
Jun 11, 2025
Just arrived in Nashville for #CVPR25! 🥰 I'll present St4RTrack tomorrow morning (10:30–12:30) at the 4D Vision Workshop, poster #137 in Hall 104 B. Feel free to come and chat!
Junyi Zhang
@junyi42
Mar 21, 2024
🚀 Introducing “Telling Left from Right” at #CVPR2024
- 🔍 Identify the problem: geometry-aware (geo-aware) semantic correspondence (SC)
- 📐 Evaluate foundation model features’ geometric awareness
- 🏆 Achieve SOTA with a lightweight post-processor
🔗 (w/ code!): telling-left-from-right.github.io
Junyi Zhang
@junyi42
Jun 16, 2024
On my way to Seattle ✈️ for my first ever #CVPR! Excited to meet old and new friends. 😄 I'll be presenting our work telling-left-from-right.github.io on Wed. (19th) morning at #284. If you're interested in how a plug-in processor can enhance the Geo-aware SC of SD+DINO, please stop by.
Junyi Zhang
@junyi42
Apr 24, 2025
I'll be presenting MonST3R at ICLR! 🇸🇬 Friday 25th, 10am-12:30pm, Hall 3+2B #97. Come by if you are interested!
Junyi Zhang
@junyi42
Nov 28, 2024
The results are so cool! 4D reconstruction is a very challenging task - I tried to explore it before MonST3R but couldn't make it work. I'm thrilled to see MonST3R contributing a part to this reconstruction pipeline!
Rundi Wu
@ChrisWu6080
Nov 28, 2024
🚀 Introducing CAT4D! 🚀 CAT4D transforms any real or generated video into dynamic 3D scenes with a multi-view video diffusion model. The outputs are dynamic 3D models that we can freeze and look at from novel viewpoints, in real-time! Be sure to try our interactive viewer!
Junyi Zhang
@junyi42
Oct 7, 2024
Replying to @junyi42
Hard to see the details in the figure? Check it out for yourself 😍: monst3r-project.github.io/page1.html We’ve created an interesting 4D online demo that you can easily explore!
Junyi Zhang
@junyi42
Mar 31, 2025
Nice work! Very cool results from carefully designed generative inpainting on MonST3R's partial pointmaps. Glad to see MonST3R and dynamic 3D reconstruction playing an important role.
Tianqi Liu
@TianqiLiu664
Mar 30, 2025
🔥Free4D creates explicit 4D Gaussian scene representations from a single image, enabling high-quality, controllable, and real-time rendering. 👉Project (with interactive demo): free4d.github.io Paper: arxiv.org/abs/2503.20785 Code (open-sourced): github.com/TQTQliu/Free4D