UTA Agentic chatbot

Inspiration

Our inspiration came from observing the daily digital hurdles students face at the University of Texas at Arlington. Information is often scattered across multiple websites, and simple tasks like booking a study room, finding directions, or sending a formal email require navigating different interfaces. We wanted to streamline these processes. Our goal was to create a single, intelligent, and conversational assistant that acts as a central hub for campus life—an AI companion that doesn't just answer questions but actively helps students get things done, saving them time and effort so they can focus on their studies.

What it does

The UTA Agentic Chatbot is a multi-talented assistant that takes action on a student's behalf. It's more than a simple Q&A bot; it's an agentic system that automates tasks. Its core capabilities include:

Answering University Questions: It uses the Google Custom Search API to find and summarize information directly from official UTA websites.

Automating Study Room Bookings: It guides the user to provide all necessary details (name, date, time, number of people) and then automatically fills out a booking form to reserve the room.

Sending Emails: It can draft and send emails for the user. It shows a draft for approval before sending it via Gmail, which is useful for contacting professors or departments.

Calculating Distances: Using the Google Maps API, it can provide driving times and distances to and from campus or any other location.

Understanding Voice Commands: It accepts both typed and spoken input, using Whisper for accurate speech-to-text transcription.
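The study room booking flow described above depends on collecting four details before the form is filled. A minimal sketch of that check might look like the following. The class name, field names, and prompt wording are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class BookingRequest:
    """Details collected before filling the form (field names are hypothetical)."""
    name: Optional[str] = None
    date: Optional[str] = None
    time: Optional[str] = None
    num_people: Optional[int] = None


def missing_fields(request: BookingRequest) -> list[str]:
    """Return the names of the details the user has not supplied yet."""
    return [f.name for f in fields(request) if getattr(request, f.name) is None]


def next_prompt(request: BookingRequest) -> str:
    """Ask for the first missing detail, or report that the form can be filled."""
    missing = missing_fields(request)
    if missing:
        return f"Please provide your {missing[0].replace('_', ' ')}."
    return "All details collected. Submitting the booking form."


# Example: the user has given a name and a date so far (made-up values).
partial = BookingRequest(name="Alex", date="2025-03-01")
print(missing_fields(partial))  # ['time', 'num_people']
print(next_prompt(partial))
```

Keeping the required details in one structure makes it easy to tell when the agent should keep asking questions and when it can hand off to the form automation.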

How we built it

We built this project on a modern agentic AI stack, with Google's technologies at the core.

The Brain (LLM): We used Google's Gemini (gemini-1.5-flash) as the central reasoning engine. It interprets the user's request and decides which action to take.

The Framework (Orchestration): LangChain was used to structure the application. We created a "Tool-Calling Agent" that connects the Gemini model to a set of custom tools.

The Tools (Actions): Each of the agent's skills is a "tool":

Web Automation for booking was built with Playwright.

Q&A was powered by the Google Custom Search API and Beautiful Soup.

Navigation was enabled by the Google Maps Directions API.

Emailing was handled using Python's built-in smtplib library connected to a Gmail account.
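As a rough illustration of the email tool, the draft-then-approve flow could be sketched with smtplib as follows. The function names and addresses are hypothetical, and the sketch assumes a Gmail app password is available, since Gmail typically rejects a regular account password over SMTP.

```python
import smtplib
from email.message import EmailMessage


def build_draft(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Create the message without sending it, so the user can review it first."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def preview(msg: EmailMessage) -> str:
    """Render the draft as plain text for display in the chat."""
    return f"To: {msg['To']}\nSubject: {msg['Subject']}\n\n{msg.get_content()}"


def send_if_approved(msg: EmailMessage, approved: bool, app_password: str) -> bool:
    """Send through Gmail's SMTP server only after explicit user approval."""
    if not approved:
        return False
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as server:
        server.login(msg["From"], app_password)
        server.send_message(msg)
    return True


# Example addresses and text are placeholders.
draft = build_draft(
    "student@example.com",
    "professor@example.com",
    "Office hours",
    "Hello Professor, could we meet this week?",
)
print(preview(draft))
```

Separating drafting from sending keeps the approval step explicit: nothing reaches the SMTP server unless the user has confirmed the preview.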

The Interface: We used Gradio to build the interactive UI, complete with a chatbot display, text input, and a voice recording button. The application is deployed on Hugging Face Spaces.
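To make the tool-calling idea concrete, here is a framework-free sketch of how a model's chosen tool name and arguments can be routed to a Python function. It is not LangChain's API; the registry, the decorator, the tool names, and the stub return values are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Registry of available tools and the descriptions the model would be shown.
TOOLS: Dict[str, Callable[..., str]] = {}
DESCRIPTIONS: Dict[str, str] = {}


def tool(description: str):
    """Register a function as a tool under its own name."""
    def register(func: Callable[..., str]) -> Callable[..., str]:
        TOOLS[func.__name__] = func
        DESCRIPTIONS[func.__name__] = description
        return func
    return register


@tool("Estimate the driving distance between two places.")
def get_distance(origin: str, destination: str) -> str:
    # Stub: a real tool would call a maps service here.
    return f"(stub) distance from {origin} to {destination}"


@tool("Search official UTA pages for an answer.")
def search_uta(query: str) -> str:
    # Stub: a real tool would call a search service here.
    return f"(stub) results for {query!r}"


def dispatch(tool_call: dict) -> str:
    """Run the tool the model selected, passing along its arguments."""
    name = tool_call.get("tool")
    if name not in TOOLS:
        return f"Unknown tool: {name}"
    return TOOLS[name](**tool_call.get("args", {}))


# A tool call in the shape a model might produce (hypothetical format).
call = {"tool": "get_distance", "args": {"origin": "Dallas", "destination": "UTA"}}
print(dispatch(call))
```

In this picture the LLM only chooses a tool and supplies arguments. The surrounding code performs the action, which is why clear tool descriptions matter so much.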

Challenges we ran into

Our development process was a valuable learning experience, with two major challenges:

Web Scraping Firewalls: Our initial plan was to automate the official UTA library website for study room bookings. However, we quickly discovered that the university's web servers have robust security measures and firewalls. Our automated Playwright requests were consistently blocked, so we could not access the booking portal directly. To work around this, we created a Google Form that mimicked the official booking process, which let us demonstrate the agent's web automation in a controlled environment.

LLM Resource Restrictions: Working with a state-of-the-art model like Gemini during a hackathon meant we had to be mindful of API rate limits and resource constraints. During heavy testing we sometimes hit those limits, which forced us to think critically about efficiency. We tightened our system prompt and tool-calling logic so the agent calls the LLM only when needed, which made the application more robust under those limits.
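A common complement to this kind of optimization is retrying rate-limited calls with exponential backoff. The write-up does not say the project used it, so the sketch below is only an illustration. RateLimitError is a hypothetical stand-in for whatever exception the LLM client raises, and the delay values are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for the client's rate-limit exception."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, waiting longer after each rate-limit error before retrying."""
    for attempt in range(max_attempts):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Double the wait each time and add jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")


# Simulated call that is rate limited twice before succeeding.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError()
    return "ok"


delays: list = []
result = call_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, len(delays))
```

Passing the sleep function as a parameter keeps the helper easy to test without real waiting.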

Accomplishments that we're proud of

Building a True Agent, Not Just a Chatbot: We are most proud of creating a system that performs actions. It doesn't just chat; it books rooms, sends emails, and actively searches for information, which is a significant step beyond traditional Q&A bots.

Successful Multi-Tool Integration: We successfully integrated a diverse set of APIs and technologies (Google Search, Google Maps, Playwright, Gmail) into a single, cohesive agent that can intelligently switch between tools based on the user's needs.

Problem-Solving and Adaptability: When faced with the firewall roadblock, we didn't give up. We quickly pivoted to a functional proof-of-concept using Google Forms, demonstrating our ability to adapt and deliver a working solution under pressure.

Accessible User Interface: We created an intuitive and accessible UI with Gradio that supports both text and voice, making our powerful AI agent easy for anyone to use.

What we learned

The Reality of System Integration: We learned that connecting AI to real-world, legacy web systems is a major challenge. Security firewalls are a practical barrier that requires either official API access or creative, ethical workarounds.

The Power of Agentic Frameworks: LangChain proved to be a powerful framework for building complex AI applications. It let us focus on the logic of our tools while it handled the orchestration of prompts, tools, and the LLM.

Effective Prompt Engineering: We learned that the quality of the agent's performance is directly tied to the clarity of its system prompt and tool descriptions. Writing precise instructions for the LLM is critical for ensuring it behaves as expected.

What's next for UTA Agentic chatbot

Official University Integration: Our top priority is to move beyond the proof-of-concept. We plan to present our project to the UTA IT department to pursue official API access to campus systems. This would allow for a seamless and robust booking experience.

Expansion of Tools: We want to add more tools to make the assistant even more useful, such as integrating with the university's event calendar, the course registration system, and campus dining menus.

Personalization with Memory: We plan to add a persistent memory feature using a vector database. This would allow the agent to remember user preferences and past conversations for a more personalized and context-aware experience.