TYPHON

Hierarchical memory research harness for long-context and cross-episode LLM evaluation.

TYPHON is currently a structured research substrate, not yet a learned state-of-the-art memory model. The repo already supports:

benchmark registration and local benchmark-pack ingestion
a heuristic typhon_v0 memory-selection pipeline
a local exact baseline plus a WSL-backed Gated DeltaNet baseline
model-backed evaluation through LM Studio and OpenAI-compatible servers
reproducible configs, runbooks, and experiment tracking
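
To make the idea of a heuristic memory-selection pipeline concrete, here is a minimal sketch. It is not the typhon_v0 implementation: it splits a token list into fixed-size chunks, always keeps a local window of the most recent tokens, and picks earlier chunks by word overlap with the query. The function names, the overlap scoring rule, and the top_k parameter are all assumptions made for illustration.

```python
def chunk(tokens, size):
    """Split a token list into consecutive chunks of at most `size` tokens."""
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]


def select_memory(tokens, query_tokens, chunk_size=24, local_window=24, top_k=2):
    """Hypothetical heuristic selector: top-k overlapping chunks plus a local window."""
    # The most recent `local_window` tokens are always kept verbatim.
    split = max(len(tokens) - local_window, 0)
    history, recent = tokens[:split], tokens[split:]

    query = set(query_tokens)
    chunks = chunk(history, chunk_size)

    # Rank earlier chunks by how many distinct query tokens they contain;
    # ties are broken in favor of the earlier chunk.
    ranked = sorted(enumerate(chunks),
                    key=lambda pair: (-len(query & set(pair[1])), pair[0]))

    # Restore document order so the selected context reads chronologically.
    chosen = sorted(ranked[:top_k], key=lambda pair: pair[0])
    selected = [tok for _, piece in chosen for tok in piece]
    return selected + recent


if __name__ == "__main__":
    toks = ["a", "a", "a", "a", "x", "y", "b", "b", "b", "b", "z"]
    print(select_memory(toks, ["x"], chunk_size=2, local_window=1, top_k=1))
```

The defaults of 24 mirror the --chunk-size and --local-window-tokens values used in the evaluation command in this README; whether the real pipeline interprets them this way is not confirmed here.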

Start Here

Repo Layout

src/typhon: package code, CLI, benchmarks, baselines, memory logic, inference, evaluation
configs: benchmark, runtime, baseline, and live-eval configuration
data: benchmark packs, imports, and normalized local sample assets
results: generated artifacts and evaluation outputs
scripts: wrapper scripts, mostly evaluation and WSL runtime helpers
docs: architecture docs, ADRs, project state, runbooks, research notes, archive
third_party: isolated external code or cloned repos
notebooks: exploratory analysis only

Common Commands

uv run typhon list-benchmarks
uv run typhon list-baselines
uv run typhon validate-benchmark-pack --benchmark longbench
uv run typhon evaluate-memory-suite --backend lmstudio_local --model qwen3.5-9b-vlm --benchmark longbench_v2 --benchmark locomo --benchmark locomo_plus --benchmark memorybench --benchmark evo_memory --sample-source local --chunk-size 24 --local-window-tokens 24 --request-timeout-seconds 600
uv run typhon run-baseline --baseline gated_deltanet_fla --benchmark longbench --sample-source local --sample-limit 1
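
The multi-benchmark evaluation command above is long enough that it can help to assemble it programmatically. The following is a minimal sketch that uses only the flags shown in the commands above; the helper name build_eval_command is hypothetical and not part of the typhon package.

```python
import shlex


def build_eval_command(benchmarks, backend="lmstudio_local", model="qwen3.5-9b-vlm",
                       chunk_size=24, local_window_tokens=24, timeout_seconds=600):
    """Assemble the argument list for `uv run typhon evaluate-memory-suite`.

    Hypothetical helper: only flags that appear in this README are used.
    """
    args = ["uv", "run", "typhon", "evaluate-memory-suite",
            "--backend", backend, "--model", model]
    # The README example repeats --benchmark once per benchmark name.
    for name in benchmarks:
        args += ["--benchmark", name]
    args += ["--sample-source", "local",
             "--chunk-size", str(chunk_size),
             "--local-window-tokens", str(local_window_tokens),
             "--request-timeout-seconds", str(timeout_seconds)]
    return args


if __name__ == "__main__":
    cmd = build_eval_command(["longbench_v2", "locomo"])
    # Print a copy-pasteable shell line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string keeps each argument separate, so the same value can be handed to subprocess.run without shell quoting concerns.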

Working Conventions

Use uv run ... for repo commands and uv lock after dependency changes.
Keep major architectural or process decisions in ADRs, not in ad hoc notes.
Put new docs into the structured docs tree. Do not add new flat markdown files at the repo root or docs/ root unless they are indexes.
Treat results/ as generated output, not hand-maintained source material.

Current Status

The repo has one upstream benchmark adapter live today:

LongBench English-task import via Hugging Face into manifest-backed local packs

The current implemented baselines are:

attention_baseline
gated_deltanet_fla

The current primary live model path is:

lmstudio_local with qwen3.5-9b-vlm

The canonical project state trackers are:
