graffiti-creator

Contribute »

Inspiration

We want to address loneliness and mental health by creating online accountability through an AI wellness recommendation engine. This platform guides users in self-care while connecting them in real life to a supportive community of like-minded individuals, fostering in-person accountability and mutual growth.

What it does

Seahorse is a decentralized wellness AI with privacy as our core ideology. Here is how it works:

Decentralization of Data Sources

  • Privacy and Control: Reduces dependency on centralized servers, allowing users to retain control over personal data like emails, calendars, and documents.
  • Enhanced Security: Decentralized nodes increase resilience and data security.

Open-Source Language Model Integration

  • Transparency: Enables customizable, transparent models that avoid reliance on third-party, black-box solutions.
  • Privacy-By-Design: Aligns with privacy-first principles, ensuring model processing occurs securely on-device.

Blockchain-Based Compensation with NEAR Protocol

  • Automated Fair Compensation: Data providers receive payment based on data relevance and usage, fostering transparency and trust.
  • Real-Time Accountability: Tracks data interactions on-chain, ensuring tamper-proof, accountable compensation.

User Privacy and Data Ownership

  • Access Control: Users retain granular control over shared data (e.g., email, calendar).
  • On-Device Processing: Minimizes exposure to external systems, enhancing compliance with data privacy regulations.
  • Empowerment Through Privacy: Privacy-by-design principles encourage user trust by ensuring data ownership and control.

Ethical AI Development

  • Fair Compensation and Data Ownership: Sets a standard for ethical AI by respecting data ownership and fairly compensating providers.
  • User Privacy and Functionality: Balances AI functionality with strict data privacy, promoting responsible AI use.
  • Check it out yourself if you don't believe us! https://github.com/KonferCA/Seahorse

Scalability and Flexibility

  • Modular Design: Accommodates various data sources and modalities, including multimodal capabilities.
  • Scalability: Designed to scale with more extensive datasets and data types, increasing versatility.

Foundation for Future Innovations

  • Decentralized AI Proof of Concept: Paves the way for future decentralized AI applications centered around data privacy and ownership.
  • Open Infrastructure Development: Provides a foundation for projects focused on user privacy, data control, and decentralized compute.

How we built it

A decentralized Retrieval-Augmented Generation (RAG) system with a blockchain-based compensation model that prioritizes data ownership and privacy. The system processes personal data, such as Google Calendar entries, emails, and transcriptions, entirely on-device using advanced similarity search with all-MiniLM-L6-v2 embeddings. A secure, wallet-based routing system enables two users to exchange decryption keys, allowing access to shared calendar and event data stored encrypted on NEAR Protocol’s blockchain. Compensation for data providers is managed through NEAR smart contracts, which distribute payments based on data relevance scores. Initially centralized for simplicity, the routing system is designed with future decentralization in mind.

This project aims to set new standards in ethical AI by combining privacy-focused, decentralized data processing with a fair, automated compensation structure for data contributors.

On-Device System

  • On-Device RAG System: An entirely on-device Retrieval-Augmented Generation (RAG) system pulls relevant information from diverse sources.
    • Data Sources: Includes Google Calendar, Gmail emails, uploaded documents, voice transcriptions, and more.
    • Vector Embeddings: Uses all-MiniLM-L6-v2 model to map text into a dense vector space for similarity searches.
    • Flexibility: Designed for plug-and-play functionality to accommodate new information as needed.
  • Response Generation: Uses WebLLM to run Phi 3.5 Vision Instruct on-device for responses, pulling only relevant data.
    • Smart Contract Integration: Relevancy scores and data usage are logged for fair compensation calculation. This information is sent to NEAR Protocol, which calculates and distributes payments based on data relevance.
  • Voice Transcription: Uses Web Speech API for on-device transcription, storing notes locally for user retrieval and RAG system reference.

Smart Contract System

  • Data Management: Data providers manage their information directly on NEAR’s blockchain.
    • Direct Updates: Providers can update, add, or remove data on the blockchain, formatted specifically for the RAG system.
  • Compensation Calculation: Smart contract computes compensation based on data type and relevance scores, immediately distributing NEAR tokens to providers’ wallets.
  • Encrypted User Data Storage: Stores user information (calendar, documents, etc.) in encrypted form. Decryption occurs solely on-device, enabling private data sharing within the RAG system.
    • Shared Events: Includes multi-user interactions like shared calendar events, enabling better decision-making between parties.

Secure Routing System for Key Exchange

  • Allows two users to establish a secure key exchange to decrypt calendar or event data stored on the NEAR blockchain.
  • Mechanism:
    • NEAR Wallet-Based Identification: Each user is identified by their NEAR wallet, establishing a unique route for connection.
    • Key Exchange: Users initiate a key exchange over a secure, temporary routing channel that connects their wallets. Once exchanged, each user holds a key to access shared calendar or event data.
    • Encrypted Storage and Access: After the key exchange, calendar or event data stored on-chain remains encrypted. Only the involved users, who possess the corresponding keys, can decrypt the shared data on-device.
  • Centralized Routing with Future Decentralization Potential:
    • Current Setup: The routing system is centrally managed to streamline initial development, with key exchanges handled through secure server channels linked to NEAR wallets.
    • Decentralization Plan: In future iterations, the routing can transition to decentralized protocols (e.g., peer-to-peer or DHT-based systems), allowing users to securely exchange keys without centralized routing.

Challenges we ran into

  • Running LLMs on device: this was our first challenge we stumbled across early into development. Running LLMs on device was tricky, especially on a browser. We explored multiple options from Hugging Face. However, the models would either be too small to properly utilize the given data or too heavy to run on device. We settled down with using LangChainJS with WebLLM running Phi3.5 vision for a good middle ground.
  • Integrating RAG on device: during the early stages of development, we just had a good enough working RAG system, that would satisfy the storage and retrieval of unstructured data. When more data is fed into the system, it would give return unreliable results. To tackle this problem, we instead used a library called voy-search.
  • Fair compensation: this might be one of the trickiest problems. How much should they be compensated? How do you value the data? Is data A more valuable than data B? How do we know what pieces of data belongs to who? We used smart contracts and the blockchain to verify the authenticity of each data provider. This approach not only ensures transparency in data ownership but also facilitates a more equitable distribution of compensation based on the unique value each data set contributes. By leveraging smart contracts, we can automate the compensation process, allowing for real-time payments as data is utilized, thus reflecting its true market value.

Accomplishments that we're proud of

  • Local Operation of Seahorse: One of our standout achievements is developing Seahorse to run entirely locally, eliminating the need for centralized servers. This design enhances privacy and security for users. By decentralizing the infrastructure, we empower users to maintain control over their data, fostering a trust-based relationship with the application.
  • Rapid Development of a Decentralized Wellness AI: In under 10 days, we successfully launched a web application focused on wellness, demonstrating our team's agility and expertise. This application is uniquely designed to prioritize user privacy, ensuring that sensitive health data remains confidential and secure.

What we learned

  • Now a days, people have access to powerful devices on hand. That power can help you connect with others and building a trust-based relationship while keeping your data secure on device.
  • There are lots of LLM models out there and each has its own use case thanks to the big community.
  • NEAR has a developer-friendly environment and robust documentation which significantly helped with the development and deployment of smart contracts for Seahorse.
  • RUST > TS
  • NEOVIM > VSCODE

What's next for Seahorse

  • Adaptive Compensation Algorithms: Develop sophisticated algorithms for determining data relevance and compensation, incorporating factors like data freshness and user feedback to create a dynamic model that reflects the true value of contributions.

  • Enhanced Connectivity: By creating a network where users can directly connect and share data (peer-to-peer), we promote a sense of community and collaboration. Users can learn from each other, share experiences, and build relationships based on mutual interests, which can lead to greater engagement with the platform.

  • Offload Computation: Leverage the power of decentralized compute to offload computation from on device by implementing advanced encryption techniques to further protect user data during processing, ensuring sensitive information remains secure even in a decentralized compute ecosystem, this ensuring that data ownership remains with the user.

Built With

+ 122 more
Share this project:

Updates