Daniel Geng (@dangengdg) / X

145 posts

Daniel Geng

@dangengdg

Research @OpenAI, Sora videogen. Previously @UmichCSE, @GoogleDeepMind, @MetaAI, @berkeley_ai

Joined August 2016

Pinned
Daniel Geng
@dangengdg
Nov 30, 2023
Can you make a jigsaw puzzle with two different solutions? Or an image that changes appearance when flipped? We can do that, and a lot more, by using diffusion models to generate optical illusions! Continue reading for more illusions and method details 🧵
126K
Daniel Geng
@dangengdg
Dec 4, 2024
What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here are a few examples. Check out this thread 🧵 for more results!
95K
Daniel Geng
@dangengdg
Apr 18, 2024
What do you see in these images? These are called hybrid images, originally proposed by Aude Oliva et al. They change appearance depending on size or viewing distance, and are just one kind of perceptual illusion that our method, Factorized Diffusion, can make.
59K
Daniel Geng
@dangengdg
Jun 18, 2024
I'm at CVPR presenting "Visual Anagrams" on:
- Tuesday: 10am, Poster #429
- Friday: Oral6B @ 1pm, Poster #118 (pm)
Let me know if you want to chat! Also, we manufactured a bunch of these "jigsaws with two solutions." If you want one, just hunt me down in the conference hall :)
19K
Daniel Geng
@dangengdg
Jan 2, 2025
I had a lot of fun helping put this problem set together -- if you're teaching diffusion models + computer vision, consider using this homework for your course! (links at end of @ryan_tabrizi's thread!)
Ryan Tabrizi
@ryan_tabrizi
Jan 1, 2025
Teaching computer vision next semester? Hoping to finally learn about diffusion models in 2025? Check out this diffusion project that we designed and test-drove this past semester at Berkeley and Michigan!
15K
Daniel Geng
@dangengdg
Feb 2, 2024
Can we use motion to prompt diffusion models? Our #ICLR2024 paper does just that. We propose Motion Guidance, a technique that allows users to edit an image by specifying “where things should move.”
16K
Daniel Geng
@dangengdg
Oct 2, 2024
We will be presenting this tomorrow. Come say hi! Thursday 10:30am - 12:30pm, poster 78, at #ECCV2024
Daniel Geng
@dangengdg
Apr 18, 2024
What do you see in these images? These are called hybrid images, originally proposed by Aude Oliva et al. They change appearance depending on size or viewing distance, and are just one kind of perceptual illusion that our method, Factorized Diffusion, can make.
7.7K
Daniel Geng
@dangengdg
Dec 13, 2024
I'll be presenting "Images that Sound" today at #NeurIPS2024! East Exhibit Hall A-C #2710. Come say hi to me and @andrewhowens :) (@CzyangChen sadly could not make it, but will be there in spirit :') )
Ziyang Chen
@CzyangChen
May 21, 2024
These spectrograms look like images, but can also be played as a sound! We call these images that sound. How do we make them? Look and listen below to find out, and to see more examples!
5.8K
Daniel Geng
@dangengdg
Jan 10, 2025
Hey all, I'll be answering questions about our "Motion Prompting" paper on alphaXiv (@askalphaxiv) (it's like arXiv, but adds a discussion section, and I think is quite well built!): alphaxiv.org/abs/2412.02700…
5.4K
Daniel Geng
@dangengdg
Nov 30, 2023
Replying to @dangengdg
See our website, paper, and code for more details (and more illusions)!
Website: dangeng.github.io/visual_anagram…
arXiv: arxiv.org/abs/2311.17919
Code: github.com/dangeng/visual…
Colab notebook: colab.research.google.com/drive/1hCvJR5G…
Big thanks to my collaborators @invernopark and @andrewhowens!
2.3K
Daniel Geng
@dangengdg
May 21, 2024
This is an image of Corgis, but when played as a spectrogram sounds like dogs barking! Really thankful I got the chance to work on this super fun project with first author @CzyangChen. Check out his thread for many more examples, and to see how they're made!
Ziyang Chen
@CzyangChen
May 21, 2024
These spectrograms look like images, but can also be played as a sound! We call these images that sound. How do we make them? Look and listen below to find out, and to see more examples!
6.4K
Daniel Geng
@dangengdg
May 7, 2024
I'll be presenting this work at #ICLR2024 on Wednesday, 10:45am in Hall B, #81. Stop by if you're interested or reach out if you just want to chat!
Daniel Geng
@dangengdg
Feb 2, 2024
Can we use motion to prompt diffusion models? Our #ICLR2024 paper does just that. We propose Motion Guidance, a technique that allows users to edit an image by specifying “where things should move.”
3.4K
Daniel Geng
@dangengdg
Apr 18, 2024
Replying to @dangengdg
We can also make images that change appearance when viewed in grayscale. Since the human eye can't see color under dim lighting, there is actually a physical mechanism for this illusion: these images change appearance when taken from a bright room to a dim one!
1.9K
Daniel Geng
@dangengdg
Dec 4, 2024
Replying to @dangengdg
First, we train a video model to take a first frame, text, and **point trajectories** — a super flexible representation of motion. We then construct **motion prompts** to feed to the model. For example, a really simple motion prompt allows a user to “interact” with an image.
3.1K