Kaggle
5,599 posts
Kaggle
@kaggle
Kaggle is the largest global AI community of developers, researchers, and enthusiasts who compete, collaborate, and benchmark what's next in AI.
San Francisco
kaggle.com
Joined October 2009
284 Following
316.8K Followers
  • Kaggle
    @kaggle
    Jun 17
    The Pokémon Trading Card Game AI Battle Challenge is now live - in partnership with @Pokemon_cojp 📢 Build AI Training Agents through two connected competitions focused on strategic gameplay in the Pokémon Trading Card Game environment. Develop systems that can adapt to complex
    22K
    Kaggle
    @kaggle
    Jun 17
    Prize pool: $240,000 (awarded through the strategy category track)
    Simulation entry deadline: August 9, 2026
    Strategy Category entry deadline: September 6, 2026
    4.9K
    Kaggle
    @kaggle
    Jun 17
    Simulation Category: kaggle.com/competitions/p…
    Strategy Category: kaggle.com/competitions/p…
    kaggle.com
    The Pokémon Company - PTCG AI Battle Challenge Simulation
    Build an AI Training Agent to play the Pokémon Trading Card Game
    3.4K
  • Kaggle
    @kaggle
    Jun 12
    The AI Agent Security - Multi-Step Tool Attacks simulation is now live! In partnership with @OpenAI, @Google and @IEEEorg, your challenge is to build an attack algorithm that stress-tests tool-using AI agents in a deterministic offline benchmark.
    9.1K
    Kaggle
    @kaggle
    Jun 12
    $50,000 prize pool
    Entry Deadline: August 25, 2026
    Learn More 👇
    kaggle.com
    AI Agent Security - Multi-Step Tool Attacks
    Develop attack algorithms to identify reproducible multi-step failures in tool-using AI agents.
    4.5K
  • Kaggle
    @kaggle
    Jun 10
    1H-VideoQA is now available on Kaggle Benchmarks! Developed by @GoogleDeepMind back in 2024 (@AntoineYang2) and now updated with latest SOTA models, 1H-VideoQA is a 101-prompt benchmark for long-context video comprehension and temporal episodic reasoning across hour-long YouTube
    6.7K
    Kaggle
    @kaggle
    Jun 10
    Top of the leaderboard:
    🥇 Gemini 3.5 Flash: 80.2%
    🥈 Gemini 3 Flash Preview: 79.2%
    🥉 Gemini 2.5 Pro: 78.9%
    Models must process raw frames as native tokens to locate seconds-long events hidden in an hour of video; accuracy scales logarithmically with frame density.
    3.9K
    Kaggle
    @kaggle
    Jun 10
    Check out the leaderboard here 👇
    A long-context multimodal benchmark to measure long-form video understanding.
    VideoQA Leaderboard | Kaggle
    From kaggle.com
    2.8K
  • Kaggle
    @kaggle
    Jun 10
    Last call to sign up! 📢 Registration closes on June 12, 11:59pm PT. Don't miss this no-cost course featuring Google expert-led theory sessions, hands-on labs, a capstone challenge, and a global community of learners. Register now: 👇
    Kaggle
    @kaggle
    Apr 21
    Registration is now open for the 5-Day AI Agents: Intensive Vibecoding Course with @Google 🚀 This no-cost course is designed to help builders learn how to design, build, and use AI agents using the latest concepts, technologies and skills.
    Promotional graphic for a "5-Day AI Agents Intensive Vibecoding Course with Google" by Kaggle and Google, scheduled for June 15 - 19, 2026. The image features playful illustrations of a laptop, coding icons, and abstract geometric shapes.
    kaggle.com
    5-Day AI Agents: Intensive Vibe Coding Course With Google
    June 15 - 19, 2026
    8.5K
  • Kaggle
    @kaggle
    Jun 9
    Can AI handle the fog of war? 🌫️ We just launched Dark Hex, a Game Arena benchmark for imperfect-information Hex, which evaluates strategic deduction, probing, and decision-making under uncertainty. Across 2,424 games, the first mover wins 61.6% of the time, and several models
    An infographic from Kaggle titled "Dark Hex Benchmark Top 5" features a leaderboard table comparing five AI models based on internal Game Arena Elo, average output tokens, and average total cost per request. GPT-5.5 ranks first with the highest Elo of 577, followed by Gemini 3.5 Flash, GPT-5.4, Gemini 3 Flash Preview, and Gemini 3.1 Pro Preview.
    4.9K
    Kaggle
    @kaggle
    Jun 9
    Check out the full 19-model breakdown and gameplay replays: 👇
    Kaggle
    @kaggle
    Jun 9
    Article
    Dark Hex: A New Game Arena Benchmark for LLM Reasoning
    Kaggle's Game Arena benchmark measures frontier model reasoning capabilities through dynamic game environments. Today we are adding a new game to the arena: Dark Hex. Dark Hex is an...
    3.7K
  • Kaggle
    @kaggle
    Jun 8
    Kagglers can now create DOIs (Digital Object Identifiers) for their competition solutions and project Writeups. 🔖 These Writeups often contain genuine scientific contributions, including novel methods, new benchmarks, results cited in papers. A DOI, registered through DataCite,
    17K
    Kaggle
    @kaggle
    Jun 8
    Learn more 👇
    kaggle.com
    DOIs for Competition and Project Writeups | Kaggle
    Hi Kagglers, We're happy to announce that Kaggle Writeups now supports DOIs (Digital Object Identifiers) registered through DataCite. Competition Writeups of...
    2.7K
  • Kaggle
    @kaggle
    Jun 4
    Show us how you'd take an idea and turn it into a working benchmark. We're picking 5 submissions to win exclusive swag and a social shoutout.
    How to enter:
    1️⃣ Build a task locally with the write-kaggle-benchmarks skill
    2️⃣ Push it to Kaggle Benchmarks and run it
    3️⃣ Post your Task
    6.8K
    Kaggle
    @kaggle
    Jun 4
    Get started 👇
    kaggle.com
    AI Benchmarks - Evaluate Models & Agents | Kaggle
    Build, run, and share benchmarks for evaluating AI models and agents. Crowdsourced by the AI research community on Kaggle.
    2.5K
  • Kaggle
    @kaggle
    Jun 4
    Earlier today we released local development for Kaggle Benchmarks. 🚀 You can now write, validate and run AI evaluation tasks directly from your preferred dev environment: VSCode, Antigravity, Claude Code, and more. Go from idea to working eval using natural language with the
    16K
    Kaggle
    @kaggle
    Jun 4
    Drop the skill into your agent to get started 👇
    kaggle-skills/write-kaggle-benchmarks/SKILL.md at main · Kaggle/kaggle-skills
    From github.com
    3K
  • Kaggle
    @kaggle
    Jun 3
    Gemma 4 12B is now on Kaggle Models! 🤖 Learn more: 👉
    Google Gemma
    Google for Developers
    @googlegemma
    Jun 3
    Meet Gemma 4 12B! A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license. Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇
    kaggle.com
    Google | Gemma 4 | Kaggle
    Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.
    9.1K
