Haotian Liu (@imhaotian) / X

291 posts

Haotian Liu

@imhaotian

prev. @xAI @grok imagine/omni/vision, creator of #LLaVA, @MSFTResearch @UWMadison

Cupertino, CA

Joined December 2013

Haotian Liu
@imhaotian
Oct 8, 2025
My cat Muffin: the new @grok imagine upgrade is great! 🐱👍
785K
Haotian Liu
@imhaotian
Oct 5, 2025
The team has been working hard bringing upgrades to Grok Imagine in the past month. We're rolling out the 1st model upgrade today, with more to come. Try it out and join us if you want to help accelerate!
Elon Musk
@elonmusk
Oct 5, 2025
Try the new Grok Imagine by downloading the latest app version
1.8M
Haotian Liu
@imhaotian
Oct 6, 2025
"Imagine we begin again tonight." - @grok imagine
923K
Haotian Liu
@imhaotian
Oct 6, 2025
subscribe to the new @grok imagine!
14M
Haotian Liu
@imhaotian
Jan 31, 2024
🚀We are thrilled to release LLaVA-1.6, with improved reasoning, OCR, and world knowledge. It supports higher-res inputs, more tasks, and exceeds Gemini Pro on several benchmarks! 🤯 It maintains the data efficiency of LLaVA-1.5, and LLaVA-1.6-34B is trained ~1 day with 32 A100s.
248K
Haotian Liu
@imhaotian
Oct 6, 2023
🚀 LLaVA-1.5 is out! Achieving SoTA on 11 benchmarks, with simple mods to original LLaVA! Utilizes merely 1.2M public data, trains in ~1 day on a single 8-A100 node, and surpasses methods that use billion-scale data. 🔗arxiv.org/abs/2310.03744 🧵1/5
269K
Haotian Liu
@imhaotian
May 6, 2023
🚀Exciting news! Thanks to LLaVA-Lightning, we're releasing LLaVA-MPT today, just a day after the release of (commercially usable) MPT from @MosaicML! Built on MPT-7B-Chat, it only takes 3 hours to open up the eyes of MPT models to grasp and reason about the visual world. 🧵
212K
Haotian Liu
@imhaotian
Jul 19, 2023
🧵1/ Exciting news! We've just released a major update for LLaVA, our open-source large multimodal model, with support for LLaMA-2, LoRA training with academia GPUs, higher resolution (336x336), 4-/8-bit inference, and more! 🚀🌋
203K
Haotian Liu
@imhaotian
Apr 13, 2024
Grok can see👀! Excited to share that I joined @xai last month, and it’s such a pleasure to work with a small, focused team and see how fast we can move! This is just the beginning.
xAI
@xai
Apr 13, 2024
👀 x.ai/blog/grok-1.5v
66K
Haotian Liu
@imhaotian
May 3, 2023
🚀Introducing LLaVA Lightning: Train a lite, multimodal GPT-4 with just $40 in 3 hours! With our newly introduced datasets and the efficient design of LLaVA, you can now turbocharge your language model with image reasoning capabilities, in an incredibly affordable way.🧵
Chunyuan Li
@ChunyuanLi
Apr 18, 2023
🔥Visual Instruction Tuning with GPT-4! We release LLaVA, a Language-and-Vision Assistant that exhibits some near multimodal GPT-4 level capabilities:
- 🤖Visual Chat: 85% relative score of GPT-4
- 🧪Science QA on reasoning: New SoTA 92.53%, beats multimodal chain-of-thoughts
302K
Haotian Liu
@imhaotian
Dec 14, 2024
grok is sharper, faster, and available for everyone to try :)
xAI
@xai
Dec 14, 2024
x.ai/blog/grok-1212
22K
Haotian Liu
@imhaotian
Oct 25, 2025
We’ve been working hard and with this rate of progress, we’ll likely have prototypes covering almost all aspects for video gen ready by end of this year. And next year would be even more fun! Join us!!
Guodong Zhang
@Guodzh
Oct 25, 2025
proud of both the companion and imagine team. We didn't have any usable video gen models 3 months ago and most efforts started 3 months ago.
46K
Haotian Liu
@imhaotian
Jun 15, 2024
I'll be in Seattle 6/17-21 for #CVPR2024. If you're interested in multimodal (image/video/audio; understanding/generation), let's chat! @xai is hiring and apply at x.ai/career if you want to build multimodal models on the largest GPU cluster ever built!
25K
Haotian Liu
@imhaotian
Oct 12, 2023
We've just released all training data/scripts & evaluation scripts for LLaVA-1.5! ✅Training: github.com/haotian-liu/LL… ✅Evaluation: github.com/haotian-liu/LL… You can also try LLaVA v1.5 now on 🤗Space: huggingface.co/spaces/badayve….
Haotian Liu
@imhaotian
Oct 6, 2023
🚀 LLaVA-1.5 is out! Achieving SoTA on 11 benchmarks, with simple mods to original LLaVA! Utilizes merely 1.2M public data, trains in ~1 day on a single 8-A100 node, and surpasses methods that use billion-scale data. 🔗arxiv.org/abs/2310.03744 🧵1/5
GitHub - haotian-liu/LLaVA: [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards...
From github.com
62K