PlugMem

PlugMem is a plug-and-play long-term memory system for LLM agents. Instead of storing and retrieving raw interaction histories, PlugMem organizes experience into compact, reusable knowledge units, allowing agents to recall what matters with minimal context overhead.

The module is task-agnostic by design and can be integrated into existing agent pipelines with minimal effort, serving as a general memory backbone for diverse environments such as dialogue agents, knowledge-intensive QA, and web automation.

For more details, please see the full paper: https://arxiv.org/abs/2603.03296

Features

Memory

Three Memory Types:
- Semantic (facts, concepts): User preferences, factual information
- Procedural (workflows, procedures): How-to knowledge, step-by-step processes
- Episodic (interaction sequences): Long interaction sessions stored on disk, referenced by ID
Graph Structure: Maintain hierarchical knowledge units to illustrate the relationship between memories.
LLM Enhancement: Use LLMs for intelligent knowledge extraction, memory retrieval, and reasoning
Memory Compression and Evolution: Naively support updating and evolving the memory graph.

Plug-in

Enhance your agent with 6 lines of code

# init PlugMem memory graph
mg = MemoryGraph()
# init memory sequence
mem = Memory(...)
mem.append(...)
mem.close()
# insert memory sequence into memory graph
mg.insert(mem)
# retrieve memory and perform reasoning on retrieved nodes
mg.retrieve_and_reason(...)

Easy to modify: Apply adaptive strategies by defining different value functions and reasoning prompts.

Installation

Install benchmarks in src/ and follow their installation docs to set up the environment.
Install/upgrade openai==2.6.1.
Additional modifications:

WebArena

# under src/
cd src
# clone modified AgentOccam
git clone https://github.com/jizej/AgentOccam
# clone 
git clone https://github.com/web-arena-x/webarena
# Enable Scriptbrowserenv to run under async loop (if needed)
cp src/webarena_patch/envs.py src/webarena/browser_env/envs.py
# Enable OPENAI_API_KEY + AZURE_ENDPOINT for trajectory evaluation (if needed)
cp src/webarena_patch/openai_utils.py src/webarena/llms/providers/openai_utils.py

Quick Start

Export Parameters

export OPENAI_API_KEY=<your_openai_api_key>
export AZURE_ENDPOINT=<your_azure_endpoint>
export DIR_PATH="/<your_path_to_PlugMem>/data"
export QWEN_BASE_URL="http://<your_qwen_host>:8000/v1"
export EMBEDDING_BASE_URL="http://<your_embedding_host>:8001/v1/embeddings"

Host local inference servers (Qwen + Embedding)

cd host_local_inference
# Qwen (vLLM) server
bash vllm_deploy.sh
# NV-Embed-v2 server
bash nv_embed_v2_deploy.sh

Make needed directory

mkdir -p "$DIR_PATH/episodic_memory" \
         "$DIR_PATH/semantic_memory" \
         "$DIR_PATH/procedural_memory" \
         "$DIR_PATH/tag" \
         "$DIR_PATH/subgoal"

Run examples for different benchmarks
WebArena
```
cd src/eval/webarena
python eval_agentoccam.py
```
Options for eval_agentoccam.py:
- --config: Path to the YAML config file (required).
- --replay-trajectory/--no-replay-trajectory: Replay a saved trajectory before evaluation.
- --trajectory-dir: Directory containing trajectory JSON files for replay.
- --load_memory_graph/--no-load_memory_graph: Load a persisted memory graph from disk.
- --refresh-embeddings/--no-refresh-embeddings: Refresh embeddings when loading the memory graph.
- --read-only-memory/--no-read-only-memory: Use the memory graph without inserting new memories.
- --disable-memory-graph/--no-disable-memory-graph: Turn off all memory-graph operations.
LongMemEval
```
cd src/eval/longmemeval
python eval_longmemeval_all.py
```
HotpotQA(HippoRAG 2)
```
cd src/eval/hotpotqa
# It may take several hours to structure memory for hotpotqa_corpus.json.
python build.py
#Rebuild the memory graph from structuring result and run test
python eval_hotpotqa_all.py
```

Reproducibility

We release agent trajectories and memory graph artifacts for all three tasks.
We release human demonstrations used for WebArena (Under License CC BY 4.0).
Data available in Google Drive: https://drive.google.com/drive/folders/15feC6xYsONJhJAb2n1kPjGrjSt0weHXi?usp=sharing

Citation

If you use our code or data, or otherwise found our work helpful, please cite our paper:

@misc{yang2026plugmemtaskagnosticpluginmemory,
      title={PlugMem: A Task-Agnostic Plugin Memory Module for LLM Agents}, 
      author={Ke Yang and Zixi Chen and Xuan He and Jize Jiang and Michel Galley and Chenglong Wang and Jianfeng Gao and Jiawei Han and ChengXiang Zhai},
      year={2026},
      eprint={2603.03296},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2603.03296}, 
}

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
assets		assets
host_local_inference		host_local_inference
src		src
.DS_Store		.DS_Store
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PlugMem

Features

Memory

Plug-in

Installation

Quick Start

WebArena

LongMemEval

HotpotQA(HippoRAG 2)

Reproducibility

Citation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Languages

Folders and files

Latest commit

History

Repository files navigation

PlugMem

Features

Memory

Plug-in

Installation

Quick Start

WebArena

LongMemEval

HotpotQA(HippoRAG 2)

Reproducibility

Citation

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Languages

Packages