PokerFace

Inspiration

We wanted to bring together two things we love: poker and AI agents. Most poker apps are either single-player vs a scripted bot or anonymous multiplayer with no personality. We asked: what if every player at the table could be a custom AI agent—different play styles, risk levels, and even different LLMs—so you’re not just playing the game, you’re designing and pitting agents against each other in real time?

We were also inspired by the idea of readable, fair game state: a backend that owns the rules and validates every action, so the game is deterministic and replayable while still feeling live. That led us to split the system into a trusted game engine and an orchestrator that handles agent config, prompts, and (optionally) LLM inference. The result is PokerFace—a real-time Texas Hold’em arena where you create agents, share a room code, and play full hands with your friends and your AIs.

What it does

PokerFace is a real-time Texas Hold’em game where you play against AI agents (and other humans) in shared rooms.

Custom AI agents — You create agents in the dashboard: name, play style (e.g. tight-aggressive, loose-passive), risk tolerance, deception level, and base LLM. Each agent is synced to the backend and can be seated at the table.
Rooms — Create a room and get a 6-character code, or join a friend’s room with their code. A lobby shows who’s in and who’s ready; the host starts the game when everyone is ready.
Live poker table — Full no-limit Hold’em: pot, community cards (flop, turn, river), hole cards (yours face-up, others face-down), and per-player “bet this round.” When it’s your turn you get Fold, Check, Call, Raise, and All-in; actions and board updates stream in over WebSockets.
Credits & leaderboard — Users have credits for creating agents and sending in-game “nudge” prompts to influence their agent. A leaderboard tracks agent performance across games.
Optional LLM inference — The backend is built so agents can be driven by real LLMs (e.g. via Modal GPUs); the game engine stays the single source of truth and only applies validated actions.

So in one sentence: create agents, share a room code, and play real-time Texas Hold’em with your crew and your AIs.
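
As an illustration of the room codes mentioned above, here is a minimal sketch of generating a unique 6-character code. The alphabet (uppercase letters and digits with look-alike characters removed), the function name, and the room dictionary shape are assumptions for the example, not the project's actual implementation.

```python
import secrets
import string

# Assumed alphabet: uppercase letters and digits, minus easily confused characters.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "O0I1"
)
CODE_LENGTH = 6


def generate_room_code(existing_codes):
    """Return a random 6-character code that is not already in existing_codes."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(CODE_LENGTH))
        if code not in existing_codes:
            return code


# Hypothetical in-memory room store keyed by code.
rooms = {}
code = generate_room_code(rooms)
rooms[code] = {"host": "alice", "players": ["alice"], "ready": set()}
```

With 32 symbols and 6 positions there are about a billion possible codes, so the retry loop should almost never run more than once for a small number of live rooms.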

How we built it

We split the system into a Next.js frontend, a Flask + Socket.IO backend, and an optional finetuning pipeline.

Frontend (Next.js 16, React 19, TypeScript)

Auth: NextAuth with GitHub OAuth so you sign in with your GitHub account.
Data: Prisma + PostgreSQL for users, agents, rooms, and credits.
Real-time: Socket.IO client for the poker room. It emits join-room, start-game, and player-action, and listens for the game events (game-started, game:player_acted, game:phase_changed, game:community_cards_revealed, game:hand_result).
UI: Tailwind + Framer Motion for the lobby and the poker table (seats, community cards, pot, action bar, hand result modal). We use a single game-state store (Zustand) that the socket hook updates so the table always reflects the latest hand.
API routes: Next.js API routes call the Flask backend for rooms and agents (create room, join room, sync agent). If the backend has restarted and lost in-memory state, the join flow can rehydrate the room from Prisma and retry so joining still works.

Backend (Flask, Flask-SocketIO, Python)

Game engine: A deterministic No-Limit Hold’em state machine (init → blinds → hole cards → preflop → flop → turn → river → showdown). It validates every action (fold, check, call, raise, all-in), computes legal moves and min/max raise sizes, handles side pots, and evaluates hands. The engine never trusts the client—it’s the single source of truth.
Rooms & tables: In-memory rooms keyed by code; when the host starts the game we create a table, seat all players, post blinds, and deal. We bind the room code to that table so all Socket.IO events for that room use the same game state.
Socket.IO: One namespace for the poker flow: join-room, leave-room, toggle-ready, start-game, player-action. We emit game-started (with per-viewer state so each client sees their own hole cards), game:player_acted (with pot and currentBet), game:phase_changed, game:community_cards_revealed, game:hand_result, and round-end.
Orchestrator: Agent CRUD, validation (e.g. allowed play styles and model keys), prompt assembly (personality + risk + templates), and optional integration points for Modal/Hugging Face for LLM-driven agents.
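
To make the engine description above concrete, here is a minimal sketch of the phase progression and of computing legal moves with min/max raise sizes. It assumes the common no-limit rule that a raise must be at least as large as the previous raise (or the big blind if there has been none). All names and the return shape are hypothetical, and amounts are expressed as "bet this round" totals.

```python
from enum import Enum


class Phase(Enum):
    INIT = "init"
    BLINDS = "blinds"
    HOLE_CARDS = "hole_cards"
    PREFLOP = "preflop"
    FLOP = "flop"
    TURN = "turn"
    RIVER = "river"
    SHOWDOWN = "showdown"


ORDER = list(Phase)  # Enum iteration follows definition order


def next_phase(phase):
    """Advance one step through the hand; showdown is terminal."""
    if phase is Phase.SHOWDOWN:
        raise ValueError("hand is already at showdown")
    return ORDER[ORDER.index(phase) + 1]


def legal_actions(current_bet, player_bet, stack, last_raise_size):
    """Return the actions open to the player to act.

    current_bet: highest "bet this round" at the table
    player_bet: this player's "bet this round" so far
    stack: chips the player still has behind
    last_raise_size: size of the last raise (big blind if none yet)
    """
    if stack <= 0:
        return {}  # already all-in, nothing to decide
    to_call = current_bet - player_bet
    max_total = player_bet + stack
    actions = {"fold": None, "all_in": max_total}
    if to_call == 0:
        actions["check"] = None
    elif stack > to_call:
        actions["call"] = current_bet
    min_total = current_bet + last_raise_size
    if max_total >= min_total:
        actions["raise"] = (min_total, max_total)
    return actions


def validate_action(action, amount, legal):
    """Reject anything the engine did not offer; never trust the client."""
    if action not in legal:
        raise ValueError(f"illegal action: {action}")
    if action == "raise":
        low, high = legal["raise"]
        if not low <= amount <= high:
            raise ValueError(f"raise must be between {low} and {high}")


# Blinds 10/20, first player to act preflop with 1000 chips.
facing_blind = legal_actions(current_bet=20, player_bet=0, stack=1000, last_raise_size=20)
# Short stack facing a bet larger than the remaining chips.
short_stack = legal_actions(current_bet=100, player_bet=0, stack=60, last_raise_size=80)
```

In this sketch a player who cannot cover the call is offered only fold or all-in, which is what creates the side pots the engine has to handle.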

Finetuning (optional)

We have a Modal-based pipeline to fine-tune Qwen on the PokerBench dataset and push to Hugging Face, so we can plug in a poker-tuned model later.

Deployment

Frontend and backend can both run locally (Node + Python). In production we run the backend on Render, using eventlet for Socket.IO. The frontend reaches it through the NEXT_PUBLIC_API_URL and NEXT_PUBLIC_WS_URL environment variables.

Challenges we ran into

Keeping game state in sync — The backend owns the truth, but the frontend needs to show the right bets and pot at all times. We had to fix the engine so it didn’t reset blind bets when setting up the preflop round, and we started sending currentBet in the game:player_acted payload so the UI always shows “bet this round” and the pot adds up correctly.
Room rehydration — Rooms and tables live in memory on the backend. If the backend restarts, join requests would get “Room not found.” We added a restore flow: when join returns 404, the frontend loads the room from Prisma, syncs the host’s agent to the backend, calls a new backend POST /rooms/restore with the same code, then retries join so the room “comes back” and the joiner gets in.
Agent validation vs frontend models — The frontend initially stored canonical model keys (e.g. mistral-7b-instruct) while the backend only accepted frontend aliases (e.g. gpt-4o-mini). Sync failed with 422. We extended backend validation to accept both alias and canonical keys so existing and new agents work.
Real-time UX — Making the table feel live meant handling game-started per viewer (so each client gets their own hole cards), showing face-down cards for other players and for community slots before reveal, and a clear action bar (Fold / Check / Call / Raise / All-in) only when it’s your turn. We iterated on the poker components and socket handlers until the flow felt right.
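
The room restore flow described above can be sketched as follows. The real flow spans the Next.js API routes, Prisma, and the Flask backend. Here, plain dictionaries stand in for the Prisma tables and a small class stands in for the backend, so every name and data shape below is a hypothetical stand-in rather than the project's code.

```python
class RoomNotFound(Exception):
    """Stand-in for the backend's 404 "Room not found" response."""


class InMemoryBackend:
    """Hypothetical model of the backend's in-memory rooms and agents."""

    def __init__(self):
        self.rooms = {}
        self.agents = {}

    def sync_agent(self, agent):
        self.agents[agent["id"]] = agent

    def restore_room(self, code, host_id):
        # Models POST /rooms/restore: recreate the room under the same code.
        self.rooms.setdefault(code, {"host": host_id, "players": [host_id]})

    def join_room(self, code, player_id):
        if code not in self.rooms:
            raise RoomNotFound(code)
        room = self.rooms[code]
        if player_id not in room["players"]:
            room["players"].append(player_id)
        return room


def join_with_restore(backend, db_rooms, db_agents, code, player_id):
    """Try to join; on "not found", restore from the database and retry once."""
    try:
        return backend.join_room(code, player_id)
    except RoomNotFound:
        saved = db_rooms.get(code)
        if saved is None:
            raise  # the room never existed, so there is nothing to restore
        backend.sync_agent(db_agents[saved["host_agent_id"]])
        backend.restore_room(code, saved["host_id"])
        return backend.join_room(code, player_id)


# A freshly constructed backend models one that has just restarted.
db_rooms = {"ABC123": {"host_id": "alice", "host_agent_id": "agent-1"}}
db_agents = {"agent-1": {"id": "agent-1", "name": "Shark"}}
backend = InMemoryBackend()
room = join_with_restore(backend, db_rooms, db_agents, "ABC123", "bob")
```

Retrying only once keeps a genuinely missing room from looping: if the code is not in the database either, the original error reaches the caller.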

Accomplishments that we're proud of

Full hand from lobby to showdown — One flow: create agent → create room → share code → join → ready up → start game → play preflop through river with real-time updates and a working pot and bet display.
Trusted, replayable engine — Every action is validated; hand history and seeds could support replay and debugging. Side pots and hand evaluation are in place for multi-way all-ins.
Room rehydration — Joining still works after a backend restart by restoring the room from the database and re-syncing the host’s agent.
Clean separation — Frontend for UX and persistence (Prisma); backend for rules and real-time; orchestrator for agent config and future LLM calls. No game logic in the client.
Polished table UI — Card backs (including a custom image), community cards that reveal with a flip, per-seat “bet this round,” and an action bar that only enables when it’s your turn.

What we learned

Socket.IO + Flask: We run Flask-SocketIO with eventlet in production and with threading locally. Making sure every client gets the right view (e.g. per-viewer game-started) required careful use of rooms and emit targets.
State machines for games — Modeling a full Hold’em hand as a strict state machine (phases, betting rounds, legal actions) made validation and testing much easier than ad-hoc conditionals.
Sync between two backends — The Next.js API and the Flask backend both need a consistent view of agents and rooms; we learned to design sync points (e.g. sync agent before create room, restore room when join fails) and to validate on both sides where it matters.
UX details matter — Showing “$0” vs hiding the bet, face-down vs empty slots for community cards, and when to show the action bar had a big impact on how “real” the game felt.
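
The per-viewer game-started idea can be sketched as a pure function that redacts other players' hole cards before the state is sent to a given client. The state shape and function name are hypothetical. The commented-out emit at the end assumes a Flask-SocketIO server object and a mapping from player id to Socket.IO session id, neither of which is defined here.

```python
import copy


def view_for(state, viewer_id, showdown=False):
    """Return a copy of the table state that is safe to send to viewer_id.

    Other players' hole cards are replaced with None placeholders
    (rendered face-down) unless the hand has reached showdown.
    """
    view = copy.deepcopy(state)  # never mutate the engine's own state
    for player in view["players"]:
        if player["id"] != viewer_id and not showdown:
            player["hole_cards"] = [None] * len(player["hole_cards"])
    return view


state = {
    "pot": 30,
    "community_cards": [],
    "players": [
        {"id": "alice", "hole_cards": ["As", "Kd"], "bet": 10},
        {"id": "bob", "hole_cards": ["7h", "7c"], "bet": 20},
    ],
}

alice_view = view_for(state, "alice")

# Assumed wiring, one targeted emit per connected client:
# for player_id, sid in sid_by_player.items():
#     socketio.emit("game-started", view_for(state, player_id), to=sid)
```

Building the view on the server means hidden cards never reach a client that should not see them, which fits the rule that no game logic lives in the client.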

What's next for PokerFace

Wire real LLM inference — Replace stubs with actual Modal (or similar) GPU workers so agents make decisions via an LLM from the current game state and prompt.
Supermemory / cross-game context — Use optional “previous games history” and opponent notes so agents can adapt over a session or across games.
Persistent rooms and tables — Move from in-memory to Postgres (or similar) for rooms and tables so games survive restarts and we can support longer-running or scheduled games.
Tournaments — Round-robin or elimination brackets with multiple tables and a leaderboard.
Mobile and accessibility — Responsive table and actions, and better keyboard/screen-reader support so more people can play.
Agent code sandbox — Let advanced users plug in custom logic (e.g. via Modal Sandboxes) for “BYO” agents while keeping the engine trusted and safe.