Daniel Geng
145 posts
Daniel Geng
@dangengdg
Research @OpenAI, Sora videogen. Previously @UmichCSE, @GoogleDeepMind, @MetaAI, @berkeley_ai
dangeng.github.io
Joined August 2016
968 Following
1,511 Followers
  • Pinned
    Daniel Geng
    @dangengdg
    Nov 30, 2023
    Can you make a jigsaw puzzle with two different solutions? Or an image that changes appearance when flipped? We can do that, and a lot more, by using diffusion models to generate optical illusions! Continue reading for more illusions and method details 🧵
    126K
  • Daniel Geng
    @dangengdg
    Dec 4, 2024
    What happens when you train a video generation model to be conditioned on motion? Turns out you can perform "motion prompting," just like you might prompt an LLM! Doing so enables many different capabilities. Here’s a few examples – check out this thread 🧵 for more results!
    95K
  • Daniel Geng
    @dangengdg
    Apr 18, 2024
    What do you see in these images? These are called hybrid images, originally proposed by Aude Oliva et al. They change appearance depending on size or viewing distance, and are just one kind of perceptual illusion that our method, Factorized Diffusion, can make.
    59K
  • Daniel Geng
    @dangengdg
    Jun 18, 2024
    I'm at CVPR presenting "Visual Anagrams" on
    - Tuesday: 10am, Poster #429
    - Friday: Oral6B @ 1pm, Poster #118 (pm)
    Let me know if you want to chat! Also, we manufactured a bunch of these "jigsaws with two solutions." If you want one, just hunt me down in the conference hall :)
    19K
  • Daniel Geng
    @dangengdg
    Jan 2, 2025
    I had a lot of fun helping put this problem set together -- if you're teaching diffusion models + computer vision, consider using this homework for your course! (links at end of @ryan_tabrizi's thread!)
    Ryan Tabrizi
    @ryan_tabrizi
    Jan 1, 2025
    Teaching computer vision next semester? Hoping to finally learn about diffusion models in 2025? Check out this diffusion project that we designed and test-drove this past semester at Berkeley and Michigan!
    15K
  • Daniel Geng
    @dangengdg
    Feb 2, 2024
    Can we use motion to prompt diffusion models? Our #ICLR2024 paper does just that. We propose Motion Guidance, a technique that allows users to edit an image by specifying “where things should move.”
    16K
  • Daniel Geng
    @dangengdg
    Oct 2, 2024
    We will be presenting this tomorrow. Come say hi! Thursday 10:30am - 12:30pm, poster 78, at #ECCV2024
    Daniel Geng
    @dangengdg
    Apr 18, 2024
    What do you see in these images? These are called hybrid images, originally proposed by Aude Oliva et al. They change appearance depending on size or viewing distance, and are just one kind of perceptual illusion that our method, Factorized Diffusion, can make.
    7.7K
  • Daniel Geng
    @dangengdg
    Dec 13, 2024
    I'll be presenting "Images that Sound" today at #NeurIPS2024! East Exhibit Hall A-C #2710. Come say hi to me and @andrewhowens :) (@CzyangChen sadly could not make it, but will be there in spirit :') )
    Ziyang Chen
    Luma
    @CzyangChen
    May 21, 2024
    These spectrograms look like images, but can also be played as a sound! We call these images that sound. How do we make them? Look and listen below to find out, and to see more examples!
    5.8K
  • Daniel Geng
    @dangengdg
    Jan 10, 2025
    Hey all, I'll be answering questions about our "Motion Prompting" paper on alphaXiv (@askalphaxiv) (it's like arXiv, but adds a discussion section, and I think is quite well built!): alphaxiv.org/abs/2412.02700…
    5.4K
  • Daniel Geng
    @dangengdg
    Nov 30, 2023
    Replying to @dangengdg
    See our website, paper, and code for more details (and more illusions)!
    Website: dangeng.github.io/visual_anagram…
    arXiv: arxiv.org/abs/2311.17919
    Code: github.com/dangeng/visual…
    Colab notebook: colab.research.google.com/drive/1hCvJR5G…
    Big thanks to my collaborators @invernopark and @andrewhowens!
    2.3K
  • Daniel Geng
    @dangengdg
    May 21, 2024
    This is an image of Corgis, but when played as a spectrogram sounds like dogs barking! Really thankful I got the chance to work on this super fun project with first author @CzyangChen. Check out his thread for many more examples, and to see how they're made!
    Ziyang Chen
    Luma
    @CzyangChen
    May 21, 2024
    These spectrograms look like images, but can also be played as a sound! We call these images that sound. How do we make them? Look and listen below to find out, and to see more examples!
    6.4K
  • Daniel Geng
    @dangengdg
    May 7, 2024
    I'll be presenting this work at #ICLR2024 on Wednesday, 10:45am in Hall B, #81. Stop by if you're interested or reach out if you just want to chat!
    Daniel Geng
    @dangengdg
    Feb 2, 2024
    Can we use motion to prompt diffusion models? Our #ICLR2024 paper does just that. We propose Motion Guidance, a technique that allows users to edit an image by specifying “where things should move.”
    3.4K
  • Daniel Geng
    @dangengdg
    Apr 18, 2024
    Replying to @dangengdg
    We can also make these images that change when viewed in grayscale. Since the human eye can't see color under dim lighting, there is actually a physical mechanism for this illusion: these images change appearance when taken from a bright room to a dim one!
    1.9K
  • Daniel Geng
    @dangengdg
    Dec 4, 2024
    Replying to @dangengdg
    First, we train a video model to take a first frame, text, and **point trajectories** — a super flexible representation of motion. We then construct **motion prompts** to feed to the model. For example, a really simple motion prompt allows a user to “interact” with an image.
    3.1K
