Inspiration

Voyagent addresses the pain points people face when planning and taking trips abroad, particularly in groups. The main challenges are coordinating among friends, overcoming language barriers, gathering information about costs, weather, and bookings, and avoiding tourist traps. Our solution is a Telegram chatbot that can be added to group chats to provide relevant information, assist with bookings, and communicate with external agents.

What it does

Our solution uses a Telegram bot framework with a hub-and-spoke architecture:

  • Main Bot: Central interface for user interactions in Telegram
  • Tool Integration: Specialized modules for each sponsor tool
  • Agent Runner: Core decision-making system to determine which tools to use
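
As a rough illustration of the hub-and-spoke idea, the sketch below registers a few tools with a central runner that picks one per query. It is a minimal stand-in, not the project's code: the names `Tool` and `AgentRunner` and the keyword lists are hypothetical, and keyword matching substitutes for the decision that the language model makes in the real system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Tool:
    """One spoke: a named capability the hub can call (hypothetical type)."""
    name: str
    keywords: Tuple[str, ...]
    run: Callable[[str], str]


class AgentRunner:
    """The hub: holds the registered tools and chooses one per query."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def select(self, query: str) -> Optional[Tool]:
        # Assumption: count keyword hits as a placeholder for the
        # model-driven tool choice described above.
        text = query.lower()
        best, best_hits = None, 0
        for tool in self._tools.values():
            hits = sum(1 for kw in tool.keywords if kw in text)
            if hits > best_hits:
                best, best_hits = tool, hits
        return best

    def handle(self, query: str) -> str:
        tool = self.select(query)
        if tool is None:
            return "No tool matched; answer directly with the language model."
        return tool.run(query)


# Placeholder tools that only echo the query; real ones would call the APIs.
runner = AgentRunner()
runner.register(Tool("search", ("weather", "cost", "price"), lambda q: f"[search] {q}"))
runner.register(Tool("translate", ("translate", "translation"), lambda q: f"[translate] {q}"))
runner.register(Tool("flights", ("flight",), lambda q: f"[flights] {q}"))
```

Keeping each tool behind the same small interface means a new spoke can be added without touching the hub's selection logic.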

The Telegram bot is built on these components:

  1. Telegram Bot API: Interface with users in group chats
  2. LangChain Framework: For agent orchestration and tool usage
  3. Google Gemini Integration: For natural language understanding and generation
  4. Sponsor Tool APIs: Direct integration with Perplexity, DeepL, Rime/Vapi, and Apify

Key Features

  1. Group Chat Integration: Bot can be added to group chats to assist with trip planning
  2. Multimodal Input: Support for text messages, images, and voice messages
  3. Autonomous Tool Selection: Bot decides which tool to use based on user queries
  4. Context Awareness: Maintains context of the conversation for more relevant responses
  5. Translation Support: Provides translations when needed for language barriers
  6. Reservation Assistance: Can make phone calls for bookings and reservations
  7. Travel Information: Provides up-to-date information about destinations, events, and costs
  8. Flight Search: Helps users find flight options and deals
  9. Points of Interest: Recommends attractions and activities at destinations
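
Two of the features above, multimodal input and context awareness, can be sketched in a few lines. The example assumes incoming messages arrive as dictionaries shaped like Telegram Bot API `Message` objects, where the `text`, `photo`, and `voice` fields indicate the content type. The helper `message_kind`, the class `ChatContext`, and the 20-turn limit are hypothetical choices for illustration.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple


def message_kind(message: dict) -> str:
    """Classify a Telegram-style message dict by the fields it carries."""
    if "voice" in message:
        return "voice"
    if "photo" in message:
        return "image"
    if "text" in message:
        return "text"
    return "unsupported"


class ChatContext:
    """Keeps the most recent turns of each chat, keyed by chat id."""

    def __init__(self, max_turns: int = 20) -> None:
        # A bounded deque drops the oldest turn once the limit is reached.
        self._history: Dict[int, Deque[Tuple[str, str]]] = defaultdict(
            lambda: deque(maxlen=max_turns)
        )

    def add(self, chat_id: int, sender: str, text: str) -> None:
        self._history[chat_id].append((sender, text))

    def recent(self, chat_id: int) -> List[Tuple[str, str]]:
        # Use .get so that reading an unknown chat does not create an entry.
        return list(self._history.get(chat_id, ()))
```

Keying the history by chat id keeps separate group chats from leaking context into each other.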

How we built it

We built Voyagent using a modular hub-and-spoke architecture. We started by setting up a Telegram bot with a Flask webhook server to receive user input. We integrated LangChain as the core agent framework to orchestrate reasoning and tool usage. Google Gemini powers the natural language understanding and generation tasks. We connected several sponsor APIs as tools:

  • Perplexity Sonar for real-time search queries
  • DeepL for translations
  • Apify Actors for flight and points-of-interest searches
  • Rime Voice Agent for making real-world reservation calls

We designed the system to autonomously select the right tool based on user queries and maintain conversational context across interactions.
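
The webhook step can be sketched without committing to a particular server setup. The function below takes a decoded Telegram update and returns a `sendMessage` reply payload, which the Bot API accepts as the body of the webhook response. This is a simplified sketch: `handle_update` and the `answer` callback are hypothetical names, and in a Flask route the `update` argument would typically come from `request.get_json()`.

```python
from typing import Callable, Optional


def handle_update(update: dict, answer: Callable[[str], str]) -> Optional[dict]:
    """Turn one Telegram update into a sendMessage payload, or None to skip it.

    `answer` stands in for the agent: it maps the user's text to a reply.
    """
    message = update.get("message")
    if not message or "text" not in message:
        # Ignore updates this sketch does not cover (edits, photos, voice, ...).
        return None
    return {
        "method": "sendMessage",
        "chat_id": message["chat"]["id"],
        "text": answer(message["text"]),
    }


# Example update, trimmed to the fields used above.
example_update = {
    "update_id": 1,
    "message": {
        "message_id": 10,
        "chat": {"id": -1001, "type": "group"},
        "text": "What is the weather in Rome?",
    },
}
reply = handle_update(example_update, lambda text: f"Looking into: {text}")
```

Separating the update handling from the web framework keeps the logic easy to test with plain dictionaries.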
