Gcunhaa/hackhaton

Hackathon Implementation Task

An AI-powered networking assistant that extracts entities from meeting transcripts, builds a knowledge graph, and suggests valuable connections with personalized email drafts.

Project Structure

.
├── extraction-pipeline/    # Entity extraction and Neo4j loading service
│   ├── src/               # Source code for extraction pipeline
│   ├── scripts/           # Utility scripts
│   ├── api.py             # FastAPI server for extraction (port 8000)
│   ├── docker-compose.yml # Neo4j database configuration
│   ├── requirements.txt   # Python dependencies
│   └── venv/              # Virtual environment (created by setup.sh)
│
├── agent/                 # AI networking agent service
│   ├── main.py            # FastAPI server for agent (port 8001)
│   ├── requirements.txt   # Python dependencies
│   ├── conversations.db   # SQLite database for conversation history
│   └── venv/              # Virtual environment (created by setup.sh)
│
├── data/                  # Data directory (optional)
├── logs/                  # Application logs (created by start.sh)
├── pids/                  # Process IDs (created by start.sh)
├── setup.sh               # Setup script - installs dependencies
├── start.sh               # Start script - runs all services
├── stop.sh                # Stop script - stops all services
└── README.md              # This file

Components

1. Extraction Pipeline (extraction-pipeline/)

Purpose: Extracts entities and relationships from meeting transcripts and populates a Neo4j knowledge graph.

Features:

  • LLM-based entity extraction (People, Organizations, Skills, Topics, Events, Projects)
  • Relationship extraction (WORKS_AT, KNOWS, EXPERT_IN, etc.)
  • Multi-pass extraction for improved accuracy
  • Neo4j integration with automatic schema creation
  • REST API for programmatic access

API Endpoints:

  • GET / - API information
  • GET /health - Health check (OpenAI + Neo4j status)
  • POST /extract - Extract entities from markdown transcript

Tech Stack: FastAPI, LangExtract, Neo4j, OpenAI, Pydantic

2. Agent (agent/)

Purpose: Analyzes meeting transcripts to identify networking opportunities and generate personalized email drafts.

Features:

  • AI agent workflow with OpenAI function calling
  • Cypher query generation to search knowledge graph
  • Need/goal/pain point extraction from transcripts
  • Automated connection matching
  • Personalized email draft generation
  • SQLite conversation history

API Endpoints:

  • GET / - API information
  • GET /health - Health check (Neo4j + OpenAI + SQLite status)
  • POST /analyze - Analyze transcript and find connections
  • GET /history - Get conversation history
  • GET /conversations/{id} - Get specific conversation details

Tech Stack: FastAPI, OpenAI, Neo4j, SQLite, Pydantic

3. Neo4j Database

Purpose: Knowledge graph database storing entities and relationships.

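Access:

  • Neo4j Browser: http://localhost:7474
  • Bolt: bolt://localhost:7687 (credentials as set in the .env files)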

Schema: See extraction-pipeline/SCHEMA.md for detailed node and relationship types.

Installation

Prerequisites

  • Python 3.8 or higher
  • Docker and Docker Compose
  • OpenAI API key

Step 1: Clone and Navigate

# Replace with the path to your local checkout
cd /path/to/hackhaton/implementation-task

Step 2: Run Setup Script

chmod +x setup.sh
./setup.sh

This script will:

  1. Create virtual environments in both extraction-pipeline/ and agent/
  2. Install all Python dependencies
  3. Create a .env.template file

Step 3: Configure Environment

Create .env files in both directories:

extraction-pipeline/.env:

# OpenAI API Configuration
OPENAI_API_KEY=sk-your-actual-api-key-here

# Neo4j Configuration
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=knowledge123

# API Configuration
API_HOST=0.0.0.0
API_PORT=8000

agent/.env:

# OpenAI API Configuration
OPENAI_API_KEY=sk-your-actual-api-key-here

# Neo4j Configuration
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=knowledge123

# Agent Configuration
OPENAI_MODEL=gpt-4o
SQLITE_DB_PATH=conversations.db
MAX_AGENT_ITERATIONS=10
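
Before starting the services, it can help to confirm that each .env file is complete. The following is a minimal sketch, not part of the repository: `parse_env`, `find_problems`, and the `REQUIRED_KEYS` list are hypothetical names, and which variables each service actually requires at startup is an assumption based on the examples above.

```python
from typing import Dict, List

# Keys copied from the .env examples above. Treating all four as mandatory
# is an assumption; the services may apply their own defaults.
REQUIRED_KEYS = ["OPENAI_API_KEY", "NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD"]
PLACEHOLDER_KEY = "sk-your-actual-api-key-here"


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_problems(values: Dict[str, str]) -> List[str]:
    """Return human-readable problems; an empty list means nothing was found."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not values.get(key)]
    if values.get("OPENAI_API_KEY") == PLACEHOLDER_KEY:
        problems.append("OPENAI_API_KEY is still the placeholder value")
    return problems
```

For example, `find_problems(parse_env(open("agent/.env").read()))` returns an empty list when all four keys are set and the API key has been replaced.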

Running the Application

Start All Services

chmod +x start.sh
./start.sh

This will:

  1. Start Neo4j database in Docker
  2. Start Extraction Pipeline API on port 8000
  3. Start Agent API on port 8001
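
Because the services start in sequence, a client may want to wait until both APIs respond before sending requests. The sketch below polls the two /health endpoints listed under API Endpoints. `wait_until_healthy` and `http_ok` are hypothetical helpers, and treating any HTTP 200 as "healthy" is an assumption, since the response body of /health is not documented here.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable, Dict

# Ports are the ones documented in this README.
HEALTH_URLS = {
    "extraction": "http://localhost:8000/health",
    "agent": "http://localhost:8001/health",
}


def http_ok(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False


def wait_until_healthy(
    urls: Dict[str, str],
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 10,
    delay: float = 1.0,
) -> Dict[str, bool]:
    """Probe each service until all respond or the attempts run out."""
    status = {name: False for name in urls}
    for attempt in range(attempts):
        for name, url in urls.items():
            if not status[name]:  # do not re-probe services already up
                status[name] = probe(url)
        if all(status.values()):
            break
        if attempt < attempts - 1:
            time.sleep(delay)
    return status
```

Calling `wait_until_healthy(HEALTH_URLS)` after `./start.sh` returns a dictionary such as `{"extraction": True, "agent": False}`, which shows which log file to inspect first.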

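Access the Services

  • Extraction Pipeline API: http://localhost:8000
  • Agent API: http://localhost:8001
  • Neo4j Browser: http://localhost:7474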

View Logs

# View all logs
tail -f logs/*.log

# View extraction API logs only
tail -f logs/extraction-api.log

# View agent API logs only
tail -f logs/agent-api.log

Stop All Services

chmod +x stop.sh
./stop.sh

This will stop both APIs and the Neo4j container.

Usage Examples

1. Extract Entities from Transcript

curl -X POST "http://localhost:8000/extract" \
  -H "Content-Type: application/json" \
  -d '{
    "markdown": "# Meeting with Sarah Chen\n\nSarah works at Quantum Analytics as Head of Data Science...",
    "meeting_id": "meeting-001",
    "meeting_title": "Networking Chat",
    "extraction_passes": 2
  }'
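
The same request can be issued from Python using only the standard library. This is a sketch under stated assumptions: the field names mirror the curl payload above, while `build_extract_request` and `send` are hypothetical helpers, and the assumption that the endpoint replies with a JSON body is not confirmed by this README.

```python
import json
import urllib.request

EXTRACT_URL = "http://localhost:8000/extract"


def build_extract_request(
    markdown: str,
    meeting_id: str,
    meeting_title: str,
    extraction_passes: int = 2,
) -> urllib.request.Request:
    """Build (but do not send) a POST /extract request."""
    payload = {
        "markdown": markdown,
        "meeting_id": meeting_id,
        "meeting_title": meeting_title,
        "extraction_passes": extraction_passes,
    }
    return urllib.request.Request(
        EXTRACT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0):
    """Send the request and decode the reply, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the services running, `send(build_extract_request("# Meeting with Sarah Chen\n\n...", "meeting-001", "Networking Chat"))` performs the extraction. The generous timeout reflects that LLM-based extraction can take a while; the value itself is a guess.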

2. Analyze Transcript for Connections

curl -X POST "http://localhost:8001/analyze" \
  -H "Content-Type: application/json" \
  -d '{
    "transcript": "I discussed my project with Sarah...",
    "user_id": "user123"
  }'
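
A Python client for the agent can cover the analysis call together with the history endpoints listed earlier. The sketch below only builds the requests; the function names are hypothetical, the paths come from the API Endpoints list, and the structure of the responses is not documented here, so `fetch_json` simply assumes a JSON body.

```python
import json
import urllib.request
from urllib.parse import quote

AGENT_BASE_URL = "http://localhost:8001"


def analyze_request(transcript: str, user_id: str) -> urllib.request.Request:
    """Build a POST /analyze request with the fields used in the curl example."""
    body = json.dumps({"transcript": transcript, "user_id": user_id}).encode("utf-8")
    return urllib.request.Request(
        f"{AGENT_BASE_URL}/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def history_request() -> urllib.request.Request:
    """Build a GET /history request."""
    return urllib.request.Request(f"{AGENT_BASE_URL}/history", method="GET")


def conversation_request(conversation_id) -> urllib.request.Request:
    """Build a GET /conversations/{id} request, escaping the id for the path."""
    safe_id = quote(str(conversation_id), safe="")
    return urllib.request.Request(
        f"{AGENT_BASE_URL}/conversations/{safe_id}", method="GET"
    )


def fetch_json(request: urllib.request.Request, timeout: float = 120.0):
    """Send any of the requests above and decode the reply, assuming JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical session would call `fetch_json(analyze_request(text, "user123"))` and later `fetch_json(history_request())` to review past analyses.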

3. Query Neo4j Directly

Open http://localhost:7474 and run:

// Find all people
MATCH (p:Person) RETURN p LIMIT 10

// Find people working at specific company
MATCH (p:Person)-[:WORKS_AT]->(o:Organization {name: "Quantum Analytics"})
RETURN p.name, p.role

// Find expertise connections
MATCH (p:Person)-[r:EXPERT_IN]->(s:Skill)
WHERE r.expertise_level IN ['expert', 'authority']
RETURN p.name, s.name, r.expertise_level
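
The same queries can be run from Python with the official `neo4j` driver. The sketch below passes values as Cypher parameters (`$org_name`, `$levels`) instead of embedding them in the query text, which avoids quoting problems. The function names are hypothetical, and the labels and properties are taken from the example queries above rather than from SCHEMA.md.

```python
from typing import Any, Dict, List, Sequence, Tuple

Query = Tuple[str, Dict[str, Any]]


def people_at_query(org_name: str) -> Query:
    """Parameterized version of the 'people working at a company' query."""
    query = (
        "MATCH (p:Person)-[:WORKS_AT]->(o:Organization {name: $org_name}) "
        "RETURN p.name AS name, p.role AS role"
    )
    return query, {"org_name": org_name}


def experts_query(levels: Sequence[str] = ("expert", "authority")) -> Query:
    """Parameterized version of the 'expertise connections' query."""
    query = (
        "MATCH (p:Person)-[r:EXPERT_IN]->(s:Skill) "
        "WHERE r.expertise_level IN $levels "
        "RETURN p.name AS person, s.name AS skill, r.expertise_level AS level"
    )
    return query, {"levels": list(levels)}


def run_query(uri: str, user: str, password: str, query: Query) -> List[Dict[str, Any]]:
    """Execute a query and return each record as a plain dictionary."""
    # Imported lazily so the query builders work without the driver installed
    # (pip install neo4j).
    from neo4j import GraphDatabase

    text, params = query
    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            return [record.data() for record in session.run(text, params)]
```

For example, `run_query("bolt://localhost:7687", "neo4j", "knowledge123", people_at_query("Quantum Analytics"))` uses the connection settings from the .env examples.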

Troubleshooting

Port Already in Use

If ports 7474, 7687, 8000, or 8001 are already in use:

  1. Check what's using the port:

    lsof -i :8000
  2. Kill the process or change the port in the configuration
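
As a cross-platform alternative to `lsof`, the check above can be sketched in Python. `port_in_use` is a hypothetical helper, and the service labels are inferred from the ports mentioned elsewhere in this README. It reports whether something is listening on a port, not which process owns it.

```python
import socket

# Ports used by this project, as listed above.
SERVICE_PORTS = {
    7474: "Neo4j Browser (HTTP)",
    7687: "Neo4j Bolt",
    8000: "Extraction API",
    8001: "Agent API",
}


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def report() -> None:
    """Print the status of every port this project needs."""
    for port, label in SERVICE_PORTS.items():
        state = "in use" if port_in_use(port) else "free"
        print(f"{port:>5}  {label:<22} {state}")
```

Running `report()` before `./start.sh` shows which ports are already taken; running it afterwards should show all four in use.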

Docker Issues

# Check Docker is running
docker ps

# Restart Docker service
# On macOS: Restart Docker Desktop
# On Linux: sudo systemctl restart docker

# View Neo4j logs
docker logs knowledge-graph-neo4j

Virtual Environment Issues

# Remove and recreate venv
cd extraction-pipeline
rm -rf venv
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
deactivate

API Not Starting

  1. Check logs: tail -f logs/extraction-api.log
  2. Ensure .env files exist with valid OPENAI_API_KEY
  3. Ensure Neo4j is running: docker ps | grep neo4j

Development

Running Services Manually

Neo4j:

cd extraction-pipeline
docker-compose up -d

Extraction API:

cd extraction-pipeline
source venv/bin/activate
python api.py

Agent API:

cd agent
source venv/bin/activate
python main.py

Testing

Test Extraction API:

cd extraction-pipeline
source venv/bin/activate
python example_api_client.py

Test Agent API:

cd agent
source venv/bin/activate
python test_agent.py

Documentation

  • Extraction Pipeline API: See extraction-pipeline/API_USAGE.md
  • Agent API: See agent/API_DOCUMENTATION.md
  • Neo4j Schema: See extraction-pipeline/SCHEMA.md

License

This is a hackathon project. See individual components for licensing information.

Support

For issues or questions:

  1. Check the logs in the logs/ directory
  2. Verify all services are running: run docker ps and check the API /health endpoints
  3. Ensure environment variables are correctly set in .env files
