Voice is the least used interface on X - despite being the most magical! The scariest part of voice interactions with X is the big blank search bar that stares back into the user's soul. If we could solve this "cold start" problem for voice interactions, we could open the door to exponential growth in User Active Seconds in Grok Voice mode.

We made Last Week on X to get users into Grok Voice mode instantly. When a user opens X, we imagine our helpful assistant Ara walking them through what they missed over the last week on X.

What

Last Week on X collects the posts from the last 7 days of your timeline and synthesizes them into a lively, coherent podcast transcript. Ara then voices this transcript, giving you a personalized narration of the world events most interesting to you this week.

How we built it

We used the Amazon Strands SOP framework, Bun/Next.js, and the Grok and X APIs.

Tech Stack

  • Frontend: Next.js 16 (App Router) with TypeScript and React, run with Bun
  • Backend: Python Flask server
  • AI: Grok (xAI), Podcast transcript engine via Amazon Strands SOP framework
  • Authentication: OAuth 2.0 with PKCE
  • APIs: X API v2, Tweepy, Web Speech API

Architecture and Component Functions

  1. Authentication Layer (app/api/login/route.ts, app/api/callback/route.ts)

    • Implements OAuth 2.0 with PKCE to securely obtain X API access tokens
    • Next.js API routes handle token exchange and validation server-side
    • Stores access tokens in httpOnly cookies to prevent XSS attacks
    • app/api/user/route.ts fetches authenticated user profile data
  2. Data Collection (twitter_search.py, utils.py, Tweepy.ipynb)

    • twitter_search.py: Queries the X API v2 for the user's home timeline tweets
    • Tweepy: Python library wrapper for X API authentication and rate limit handling
    • Extracts tweet text, author information, timestamps, and engagement metrics
    • utils.py: Contains helper functions for data parsing and preprocessing
  3. Content Generation Pipeline

    • xai_strands_runner.py: Manages Grok API calls and prompt engineering
      • Sends preprocessed tweet data to Grok with conversation structure instructions
      • Handles API authentication and response parsing
    • strands_pipeline.py: Orchestrates the full generation workflow
      • Groups tweets by theme and by time (temporal clustering)
      • Structures the input so Grok can generate a cohesive narrative
      • Formats output into dialogue with speaker transitions
    • Grok (xAI): Generates conversational podcast script from tweet data
      • Analyzes tweet content, sentiment, and relationships
      • Creates natural dialogue flow between two speakers
      • Maintains context across multiple tweet topics
  4. Audio Synthesis (tts.py, app/podcast/page.tsx)

    • tts.py: Python TTS integration for server-side audio generation
    • Web Speech API: Browser-based fallback for client-side synthesis
    • Converts generated script into spoken audio with voice selection and rate control
  5. Web Application (app/)

    • page.tsx: Landing page with OAuth login flow
    • podcast/page.tsx: Main interface for podcast generation and playback
      • Triggers backend pipeline via API call
      • Displays generation progress with real-time status updates
      • Provides audio playback controls and regeneration option
    • globals.css: UI styling with animated backgrounds and glassmorphism effects
  6. API Integration (app/api/generate-podcast/route.ts)

    • Next.js API route bridges frontend requests to Python backend
    • Manages session state and user authentication verification
    • Returns generated podcast text and audio URL to client
  7. Chrome Extension (chrome-extension/)

    • Provides in-browser access to podcast generation
    • Injects UI elements directly into X platform pages
    • Uses the same backend API for consistent functionality
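
To make the data collection and aggregation steps more concrete, here is a minimal sketch of how timeline tweets might be limited to the last 7 days and bucketed by day before being handed to Grok. The function name `bucket_recent_tweets` and the dict shape (`text`, `created_at` as an ISO 8601 string) are assumptions for illustration, not the actual code in twitter_search.py or strands_pipeline.py, and grouping by calendar day is only a simple stand-in for the temporal clustering described above.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def bucket_recent_tweets(tweets, now, days=7):
    """Keep tweets from the last `days` days and group their text by UTC day.

    Hypothetical helper: `tweets` is assumed to be a list of dicts with
    "text" and an ISO 8601 "created_at" field.
    """
    cutoff = now - timedelta(days=days)
    buckets = defaultdict(list)
    for tweet in tweets:
        # Normalise a trailing "Z" so fromisoformat works on older Pythons.
        created = datetime.fromisoformat(tweet["created_at"].replace("Z", "+00:00"))
        if cutoff <= created <= now:
            buckets[created.date().isoformat()].append(tweet["text"])
    # Sort by day so the script can walk through the week in order.
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    sample = [
        {"text": "Launch day!", "created_at": "2025-01-07T09:00:00Z"},
        {"text": "Very old post", "created_at": "2024-12-25T08:00:00Z"},
    ]
    now = datetime(2025, 1, 8, 12, 0, tzinfo=timezone.utc)
    print(bucket_recent_tweets(sample, now))
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the 7-day window easy to test.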

Data Flow

User Login → OAuth Token → X API Timeline Fetch → 
Tweet Preprocessing → Grok Analysis → Script Generation → 
TTS Conversion → Audio Playback
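
The flow above can be read as a chain of small functions. The sketch below is one way to express it, assuming the script generator (Grok) and the TTS engine are passed in as plain callables. All names here (`preprocess`, `build_prompt`, `run_pipeline`) and the prompt wording are hypothetical and do not reflect the real interfaces of xai_strands_runner.py or tts.py.

```python
from typing import Callable, List


def preprocess(tweets: List[dict]) -> List[str]:
    """Collapse whitespace in each tweet and drop empty ones."""
    cleaned = (" ".join(t.get("text", "").split()) for t in tweets)
    return [text for text in cleaned if text]


def build_prompt(texts: List[str]) -> str:
    """Turn cleaned tweets into a single prompt for the script generator."""
    bullet_list = "\n".join(f"- {text}" for text in texts)
    return "Write a short podcast script covering these posts:\n" + bullet_list


def run_pipeline(
    tweets: List[dict],
    generate_script: Callable[[str], str],  # stand-in for the Grok call
    synthesize: Callable[[str], bytes],     # stand-in for the TTS step
) -> bytes:
    """Preprocess -> script generation -> TTS, mirroring the data flow."""
    prompt = build_prompt(preprocess(tweets))
    script = generate_script(prompt)
    return synthesize(script)


if __name__ == "__main__":
    # Fake stages so the sketch runs without any network access.
    audio = run_pipeline(
        [{"text": "  Grok   Voice is live "}],
        generate_script=lambda prompt: "ARA: " + prompt,
        synthesize=lambda script: script.encode("utf-8"),
    )
    print(len(audio), "bytes of (fake) audio")
```

Injecting the Grok and TTS stages as callables keeps each layer swappable, which matches the dual TTS setup (server-side or browser) described in this writeup.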

Key Technical Decisions

  • Next.js: Unified server and client code; built-in API routes eliminate the need for a separate backend server for authentication
  • Python Backend: Required for X API libraries (Tweepy) and for integration with the existing Python AI/ML ecosystem
  • Grok API: Chosen because, in our experience, it produced more natural conversational dialogue and retained context better across tweet threads than generic language models
  • OAuth 2.0 PKCE: Necessary for secure public client authentication without client secret exposure
  • Dual TTS: Python TTS for higher quality voices, browser fallback (Web Speech API) for quick testing without a server round trip
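
To illustrate why PKCE removes the need for a client secret, here is a minimal sketch of generating the code verifier and its S256 code challenge as defined in RFC 7636. The project's login route is written in TypeScript; this Python version is shown only for illustration, and the function names are assumptions rather than the project's own.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def challenge_for(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def make_pkce_pair():
    """Return (code_verifier, code_challenge) for one login attempt."""
    verifier = _b64url(secrets.token_bytes(32))  # 43 URL-safe characters
    return verifier, challenge_for(verifier)


if __name__ == "__main__":
    verifier, challenge = make_pkce_pair()
    # The challenge goes in the authorization URL; the verifier stays
    # server-side and is sent only during the token exchange.
    print("code_challenge:", challenge)
```

The authorization server recomputes the challenge from the verifier at token exchange, so an intercepted authorization code is useless without the verifier.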

The architecture is modular: Next.js handles authentication and user interface, Python processes data and AI generation, with APIs connecting the layers.
