levalencia/README.md

Luis Valencia

Microsoft MVP AI (2015-2025) | Author: "Mastering Scikit-Learn and PyTorch" | Senior AI Engineer | Multi-Agent Systems



🔥 Currently Working On

🧠 Production AI Systems (Professional Work)

| Project Type | Business Problem Solved | Tech Stack |
|---|---|---|
| Hybrid Redaction Engine | Privacy compliance: automatically detect and redact PII, legal, financial entities from documents with 99%+ accuracy using combined Regex + LLM reasoning | Azure OpenAI, Regex, Docker, Azure DevOps, CI/CD |
| Agentic RAG Orchestrator | Enterprise knowledge access: autonomous multi-agent system that retrieves, ranks, and synthesizes answers from private knowledge bases | OpenAI Agents SDK, FastAPI, Azure AI Search, Redis, Semver |
| Multi-Agent PDF Pipeline | Data extraction from complex documents: transforms unstructured biotech PDFs (clinical trials, regulatory filings) into structured datasets for analysis | Azure Document Intelligence, Azure OpenAI, Databricks, Docker |
| Editorial AI Platform | Publishing automation: real-time document redaction for editorial teams with sub-second responsiveness | SvelteKit, FastAPI, Docker, Azure DevOps, CI/CD |
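
As a rough illustration of the "Regex + LLM" combination in the first row, here is a minimal sketch of a two-stage redaction pass. The patterns, the `redact` function, and the `llm_pass` hook are hypothetical names invented for this example; the production engine's rules and model calls are not shown in this README.

```python
import re
from typing import Callable, Optional

# Hypothetical patterns for illustration only; not the engine's actual rules.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str, llm_pass: Optional[Callable[[str], str]] = None) -> str:
    """Run the deterministic regex pass first, then an optional model pass."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    # llm_pass stands in for a model call that handles context-dependent
    # entities (names, legal references) that fixed patterns cannot describe.
    return llm_pass(text) if llm_pass else text
```

Running the cheap, deterministic stage first means the model stage only ever sees text that has already had the pattern-matchable entities removed.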

Engineering Practices Across All Projects:

  • ✅ CI/CD Pipelines (Azure DevOps)
  • ✅ Semantic Versioning with automated git tagging
  • ✅ Docker containerization (ACR deployment)
  • ✅ Infrastructure-as-Code
  • ✅ Clean Architecture (Domain-first, SOLID principles)
  • ✅ OpenTelemetry observability

🛠️ Personal Projects

| Project | Problem Solved | Tech |
|---|---|---|
| HarmoniqHub | Music organization for DJs: manual playlist building is time-consuming, track ordering is inconsistent, and there are no intelligent suggestions. Solves: AI-powered playlist generation, automatic track ordering (energy/key compatibility), wave visualization, set management, duplicate detection via acoustic fingerprinting | SwiftUI, SwiftData, CoreML, Chromaprint, Azure Table Storage |
| SuperTradingGodMode | Strategy optimization is to trading what hyperparameter tuning is to ML: manual backtesting is slow and prone to overfitting. Solves: Parameterized strategy definition, automated walk-forward optimization, anti-lookahead validation, IS/OOS regime detection | React, TypeScript, FastAPI, Redis, RQ, Parquet, Pytest, Docker |
| apple-silicon-llm-stack | Running LLaMA/Mistral locally on a Mac is slow and context windows are limited. Solves: Custom Metal GPU shaders for 8x speedup, Q4 quantization for 70B models in 24GB RAM, LoRA fine-tuning via MLX | Python, MLX, C++, Metal, Go, SvelteKit |
| DidListen | Existing apps for tracking what you listen to during the day don't capture speaker context. Solves: Real-time speech-to-text, speaker identification, turn detection for meeting notes, voice activity detection | Swift 6, WhisperKit, ShazamKit, Clean Architecture |

🛠️ Tech Stack

Python | PyTorch | MLX | LangChain | HuggingFace | vLLM | Go | React | TypeScript | FastAPI | Swift | CoreML | Azure ML | Redis


📂 Featured Projects

🏆 apple-silicon-llm-stack

Problem: Cloud LLM inference is expensive (billed per hour), local inference on a Mac is slow, context windows are limited, and 70B models normally require expensive GPUs.

Solution: Hardware/software co-design for Apple Silicon that runs 70B models in 24GB RAM through aggressive optimization.

Technical Implementation

  • Custom Metal Shaders — CUDA-equivalent GPU compute kernels for 8x inference speedup
  • Q4 Quantization — compresses 70B model to fit in 24GB unified memory
  • LoRA/QLoRA Fine-tuning — 99% memory reduction via MLX
  • Go API Gateway — sub-millisecond latency with CGO zero-copy bridge
  • SvelteKit Telemetry UI — real-time SSE streaming dashboard
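
To make the Q4 idea concrete, here is a minimal sketch of symmetric 4-bit quantization for one group of weights. It is a generic textbook scheme written in plain Python, not the project's actual storage format or Metal kernel; `quantize_q4` and `dequantize_q4` are hypothetical names.

```python
from typing import List, Tuple


def quantize_q4(weights: List[float]) -> Tuple[List[int], float]:
    """Map one group of floats to 4-bit integers in [-8, 7] plus a shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Assumption: one scale per group, chosen so the largest weight maps to 7.
    scale = max_abs / 7 if max_abs else 1.0
    quantized = [max(-8, min(7, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_q4(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats; the error is at most half a quantization step."""
    return [q * scale for q in quantized]
```

Each weight then needs only 4 bits of storage, with one floating-point scale shared by the whole group.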

🧠 HarmoniqHub (macOS App)

Problem: DJs waste hours manually organizing playlists, track ordering is inconsistent, there are no intelligent suggestions, and there is no way to visualize energy or waveforms for a set.

Solution: AI-powered playlist generation with energy/key compatibility, automatic track ordering, and wave visualization.

Technical Implementation

  • AI Playlist Generation — intelligent curation based on energy, key (Camelot wheel), and mood
  • CoreML Classification — genre, mood, theme detection trained on 500K+ tracks
  • Wave Visualization — real-time audio waveform display for set planning
  • ShazamKit + Chromaprint — acoustic fingerprinting for duplicate detection
  • Meta-tagging Automation — automatic album/artist/label cleaning
  • Azure Table Storage — <100ms cache response times
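
The Camelot wheel mentioned above assigns each key a position from 1 to 12 and a ring letter (A for minor, B for major). A minimal sketch of the commonly used compatibility rule follows; the app's own ordering logic may weigh energy and mood as well, and `camelot_compatible` is a hypothetical name.

```python
def camelot_compatible(a: str, b: str) -> bool:
    """True if two Camelot codes (e.g. '8A', '9A', '8B') are adjacent on the wheel."""
    num_a, ring_a = int(a[:-1]), a[-1].upper()
    num_b, ring_b = int(b[:-1]), b[-1].upper()
    if ring_a == ring_b:
        # Same ring: identical key, or one step either way around the 12 positions.
        return (num_a - num_b) % 12 in (0, 1, 11)
    # Different ring: only the relative major/minor at the same position.
    return num_a == num_b
```

For example, `camelot_compatible("12A", "1A")` is true because the wheel wraps around, while `camelot_compatible("8A", "10A")` is false.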

⚡ SuperTradingGodMode

Problem: Strategy optimization is to trading what hyperparameter tuning is to ML, yet manual backtesting is slow, prone to overfitting, and lacks proper in-sample/out-of-sample (IS/OOS) validation.

Solution: Parameterized strategy definition with automated walk-forward optimization and anti-lookahead validation.

Technical Implementation

  • Frontend: React · TypeScript · Vite · lightweight-charts · TanStack Query · Zustand
  • Backend: FastAPI · Pydantic v2 · SQLAlchemy
  • Data: Parquet · Redis · RQ worker
  • Infrastructure: Docker Compose · Pytest · Vitest
  • Architecture: Clean Architecture (Domain-first, SOLID principles)
  • Validation: Anti-lookahead backtesting, walk-forward IS/OOS sweep mode, 36+ passing tests
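
Walk-forward validation can be sketched as a generator of rolling in-sample/out-of-sample windows over bar indices. This is a minimal illustration under the assumption of fixed window lengths; `walk_forward_windows` and its parameters are hypothetical and not taken from the repository.

```python
from typing import Iterator, Tuple


def walk_forward_windows(
    n_bars: int, is_len: int, oos_len: int
) -> Iterator[Tuple[range, range]]:
    """Yield (in_sample, out_of_sample) index ranges that roll forward in time."""
    start = 0
    while start + is_len + oos_len <= n_bars:
        in_sample = range(start, start + is_len)
        # The OOS window starts strictly after the IS window ends, so parameters
        # tuned on in_sample never see the data they are later scored on.
        out_of_sample = range(start + is_len, start + is_len + oos_len)
        yield in_sample, out_of_sample
        start += oos_len  # advance by one OOS block per step
```

With 10 bars, a 4-bar IS window, and a 2-bar OOS window, this yields three steps whose OOS blocks cover bars 4 through 9 without overlap.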

🎧 DidListen (iOS App)

Problem: Existing "what you listened to" apps capture only audio, not speaker context. The goal is to know WHO spoke, WHEN, and WHAT.

Solution: Real-time speaker identification with turn detection, built on a hybrid on-device STT pipeline.

Technical Implementation

  • Speech-to-Text Pipeline (3 backends, switchable at runtime):

    • Local: Whisper (Tiny 39MB / Base 150MB / Medium 500MB via WhisperKit)
    • Cloud: Azure AI Speech API
    • On-device: Apple SFSpeechRecognizer
  • Voice AI Features:

    • VAD — RMS energy threshold with state machine
    • Turn Detection — SILENCE ↔ SPEAKING transition detection
    • Speaker Recognition — 256-dim deterministic embeddings + cosine matching
    • Pre-roll Buffer — 1-second circular buffer captures utterance start
  • Architecture:

    • Swift 6 Strict Concurrency (async/await, @MainActor, Sendable)
    • Clean Architecture (Domain/Data/Presentation layers)
    • MVVM + ObservableObject pattern
    • SwiftData persistence
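
The VAD and turn-detection items above can be illustrated with a small two-state machine driven by RMS energy. The app itself is written in Swift; this Python sketch only shows the logic, and the `threshold` and `hangover` values, as well as the function names, are assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one audio frame."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def detect_turns(
    frames: Sequence[Sequence[float]], threshold: float = 0.1, hangover: int = 2
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of speech turns, end exclusive."""
    turns: List[Tuple[int, int]] = []
    speaking = False
    start = 0
    quiet = 0  # consecutive quiet frames seen while in the SPEAKING state
    for i, frame in enumerate(frames):
        loud = rms(frame) >= threshold
        if not speaking and loud:
            speaking, start, quiet = True, i, 0  # SILENCE -> SPEAKING
        elif speaking:
            if loud:
                quiet = 0
            else:
                quiet += 1
                if quiet >= hangover:  # SPEAKING -> SILENCE
                    turns.append((start, i - quiet + 1))
                    speaking = False
    if speaking:
        # Audio ended mid-turn: close it, trimming any trailing quiet frames.
        turns.append((start, len(frames) - quiet))
    return turns
```

The `hangover` count keeps short pauses inside a sentence from splitting one turn into several.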

🧩 agent-god-mode

Problem: Loading 2,300+ AI skills into an agent causes "Death by a Thousand Skills": 30K+ tokens before you type anything, skyrocketing API costs, and confusing responses.

Solution: Local RAG with just-in-time skill injection: the agent searches the vault, retrieves only what it needs, and saves 30K+ tokens per session.

Technical Implementation

  • Local Embeddings: @xenova/transformers (CPU-only, no API keys required)
  • RAG Architecture: Isolated skill vault hidden from agent's default prompt
  • Search: Background Node.js worker calculates cosine similarity outside sandbox
  • Just-in-Time Injection: Agent searches → gets top 3 matches → reads only relevant SKILL.md
  • Integration: OpenCode and Claude Code compatible
  • Scale: 2,300+ curated skills (Azure, ML, DevOps, etc.)
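
The just-in-time retrieval step amounts to ranking skill embeddings by cosine similarity against the query embedding and keeping the top three. The project does this in a Node.js worker; the Python sketch below only illustrates the ranking, with toy two-dimensional vectors and hypothetical names (`cosine`, `top_skills`).

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_skills(
    query: Sequence[float], vault: Dict[str, Sequence[float]], k: int = 3
) -> List[str]:
    """Return the names of the k skills whose embeddings best match the query."""
    ranked = sorted(vault, key=lambda name: cosine(query, vault[name]), reverse=True)
    return ranked[:k]
```

Only the returned names need to be resolved to their SKILL.md files, so the rest of the vault never enters the prompt.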

🔍 DuplicateFinder

Problem: Simple filename matching misses true duplicates: renamed tracks, resized images, and recompressed files are invisible.

Solution: Multi-modal AI fingerprinting that detects acoustic and visual similarity beyond filenames.

Technical Implementation

  • Audio: Chromaprint acoustic fingerprinting (matches even with different filenames/bitrates)
  • Images:
    • pHash — perceptual hashing for near-identical images
    • CLIP (ViT-L/14) — vision-language model for semantic similarity ("this photo looks like that one")
  • Vector Search: FAISS for billion-scale similarity in milliseconds
  • UI: Streamlit interactive dashboard with side-by-side preview
  • Safety: One-click undo, trash manager, exportable reports
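
Perceptual hashes are compared by Hamming distance: near-identical images produce hashes that differ in only a few bits. The sketch below uses an average hash, a simpler stand-in for the DCT-based pHash named above, on a tiny grayscale grid. It is an illustration of the comparison step, not the project's code, and the function names are hypothetical.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> int:
    """Hash a small grayscale grid: one bit per pixel, set when above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | int(p > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")
```

A small distance between two hashes flags the pair as a candidate duplicate; the cutoff is a tuning choice that depends on hash length.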

📫 Contact

📝 Blog: medium.com/@luisevalencia
💼 Career: linkedin.com/in/levalencia
