Fiction.live
449 posts
Fiction.live
@ficlive
Read and control interactive stories Talk to writers. Suggest your own ideas and debate with other fans. Vote for what happens next.
fiction.live
Joined November 2012
36 Following
955 Followers
  • Fiction.live
    @ficlive
    Apr 17, 2025
    OpenAI Strikes Back
    Image
    346K
  • Fiction.live
    @ficlive
    Jun 5, 2025
    Wow, Google does it again! Gemini 2.5 Pro is super impressive. Amazing 192k result.
    Image
    53K
  • Fiction.live
    @ficlive
    Jul 10, 2025
    Grok 4 is at the SOTA on long context up to 192k. Gemini 2.5 Pro still edges out on 192k but Grok 4 was more consistent overall. Very very impressed, it's a GREAT model.
    Image
    51K
  • Fiction.live
    @ficlive
    Apr 6, 2025
    Updated Long context benchmark with Llama 4
    Image
    104K
  • Fiction.live
    @ficlive
    Jun 11, 2025
    o3-Pro is really good at not making mistakes at lower contexts, solid improvement overall. 192k still belongs to Gemini though.
    Image
    24K
  • Fiction.live
    @ficlive
    May 6, 2025
    Gemini 2.5 Pro Preview gives good results, but can't quite match the original experimental version.
    Image
    97K
  • Fiction.live
    @ficlive
    Apr 14, 2025
    Long Context benchmark updated with GPT-4.1. Looks like it's the "optimus" version instead of the better performing original quasar. The smaller versions are not usable in long context.
    Image
    93K
  • Fiction.live
    @ficlive
    Jun 21, 2025
    minimax-m1 tested on Fiction.liveBench Long Context. @minimax_ai They did it! Competitive with Gemini 2.5 Pro-preview 05-06.
    Image
    12K
  • Fiction.live
    @ficlive
    Jun 19, 2025
    Claude 4 Sonnet thinking is close to SOTA at the testable range (thinking tokens put it over the line for 192k). But strangely, Opus is significantly worse. Why? Anyone else see this?
    Image
    5.4K
  • Fiction.live
    @ficlive
    May 23, 2025
    Expanded context length to 192k for OpenAI models and Gemini. Gemini is still consistently decent even at that length; o3 falls off dramatically at 192k.
    Image
    12K
  • Fiction.live
    @ficlive
    Apr 17, 2025
    Replying to @ficlive
    fiction.live/stories/Fictio… Will be working on an even longer context and harder eval. DM me if you wanna sponsor.
    7.7K
  • Fiction.live
    @ficlive
    Apr 4, 2025
    New model on @openrouter tested on long context. Long context performance is legit!
    Image
    11K
  • Fiction.live
    @ficlive
    Apr 7, 2025
    Replying to @_arohan_
    Re-ran the bench, there was no real improvement.
    Image
    17K
  • Fiction.live
    @ficlive
    Jun 21, 2022
    As mentioned previously, migration started about 5 hours ago and was expected to take around 6 hours. Unfortunately, it's going to take longer than expected since the database import is only at around 50%. For some reason, the import today is slower than the test. 🤷‍♂️
