bycloud
1,562 posts
bycloud
@bycloudai
I make youtube videos on cool AI research /// AI papers newsletter mail.bycloud.ai /// paper recap @TheAITimeline /// intuitiveai.academy
youtube.com/bycloudAI
Joined January 2020
797
Following
11.5K
Followers
  • bycloud
    @bycloudai
    Feb 27, 2025
    the grok-3 benchmark is pretty useful in comparing base models, so I added GPT-4.5
    Image
    8.6M
  • bycloud
    @bycloudai
    Jul 16, 2024
    I got a great trailer for yall
    Image
    Mistral AI
    @MistralAI
    Jul 16, 2024
    mistral.ai/news/mathstral/ mistral.ai/news/codestral…
    104K
  • bycloud
    @bycloudai
    Jun 22, 2025
    didn’t know gwern’s comment has the ability to predict the future too
    Image
    59K
  • bycloud
    @bycloudai
    Aug 31, 2025
    imagine getting beaten by a Chinese food delivery company at training an LLM (along with a banger tech report btw)
    Image
    79K
  • bycloud
    @bycloudai
    Feb 27, 2025
    Claude 3.7 is cool, but i still ended up using grok-3 somehow. something's off about claude 3.7 and I just can't pinpoint why
    159K
  • bycloud
    @bycloudai
    Jun 29, 2025
    many such cases
    Image
    33K
  • bycloud
    @bycloudai
    Jan 17, 2025
    someone has finally done it: test time compute + diffusion models. a really interesting one for sure 🧵
    Image
    57K
  • bycloud
    @bycloudai
    Nov 27, 2024
    the 4 horsemen of the OpenAI apocalypse have now been assembled
    Image
    44K
  • bycloud
    @bycloudai
    Apr 14, 2025
    no model is able to escape the 66% accuracy @ 120k tokens, except Gemini 2.5 Pro which sits at 90%. even the new GPT-4.1 with 1 mil ctx is stuck at 60%... (please tell us your secret gemini🥺)
    Fiction.live
    @ficlive
    Apr 14, 2025
    Long Context benchmark updated with GPT-4.1. Looks like it's the "optimus" version instead of the better performing original quasar. The smaller versions are not usable in long context.
    Image
    88K
  • bycloud
    @bycloudai
    Feb 27, 2025
    how does DeepSeek V3 win against GPT-4.5? (NOT R1 btw) OpenAI claimed that GPT-4.5 is a VERY big model, yet GPT-4.5 falls short compared to DeepSeek-V3. What.
    Image
    73K
  • bycloud
    @bycloudai
    Oct 21, 2024
    super interesting read. maybe we just need to find the rules that are class 4 equivalent when generating synthetic data to get better performance on reasoning. making a video on this now😳
    Image
    37K
  • bycloud
    @bycloudai
    Apr 6, 2025
    what also intrigued me about this is that @ 120k context window, 2.5 pro did a 90% accuracy while no one else crossed 66%. everyone else starts to fall off hard @ 4k. what new attention technique did google invent??? (and why is there a sudden dip at 16k???????)
    Image
    michelle
    @michellechen
    Apr 6, 2025
    Replying to @michellechen
    small tangent - people always ask about gemini context window, yeah it’s big, it probably uses some sliding window-like architecture too (don’t quote me). most notably though, google has its own proprietary accelerators called TPUs. much more GPU memory, so they can fit larger
    55K
  • bycloud
    @bycloudai
    Jun 17, 2025
    is this what anthropic did to make their non-reasoning models so good?
    Image
    43K
  • bycloud
    @bycloudai
    Jun 24, 2025
    after making diffusionLMs, you are telling me we are now adding U-Nets?
    Image
    39K
