Plotnik


Plotnik is a query language for source code.
Queries extract relevant structured data.
Transactions allow granular, atomic edits.




⚠️ ALPHA STAGE: not for production use ⚠️



The problem

Tree-sitter solved parsing. It powers syntax highlighting and code navigation at GitHub and drives the editing experience in Zed, Helix, and Neovim. It gives you a fast, accurate, incremental syntax tree for virtually any language.

The hard problem now is what comes after parsing: extracting meaning from the tree and safely transforming it back into source. Done by hand, extraction looks like this:

function extractFunction(node: SyntaxNode): FunctionInfo | null {
  if (node.type !== "function_declaration") {
    return null;
  }
  const name = node.childForFieldName("name");
  const body = node.childForFieldName("body");
  if (!name || !body) {
    return null;
  }
  return {
    name: name.text,
    body,
  };
}

Every extraction requires a new function, each one a potential source of bugs that won't surface until production. And once you've extracted what you need, applying changes back to the source requires careful span tracking, validation, and error handling—another layer of brittle code.
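To make the span-tracking burden concrete, here is a minimal sketch of the kind of hand-written edit layer the paragraph above describes. The `Edit` shape and the `applyEdits` helper are hypothetical names chosen for illustration. This is not Plotnik's transaction API, which this README does not specify.

```typescript
// Hypothetical edit shape: replace source[start, end) with `text`.
interface Edit {
  start: number; // inclusive character offset
  end: number; // exclusive character offset
  text: string;
}

// Apply non-overlapping edits to a string, validating every span first-come.
function applyEdits(source: string, edits: Edit[]): string {
  // Process edits from the end of the file backwards, so offsets of
  // edits that come earlier in the source stay valid while we splice.
  const sorted = [...edits].sort((a, b) => b.start - a.start);
  let result = source;
  let prevStart = source.length; // start of the edit applied just before
  for (const edit of sorted) {
    if (edit.start < 0 || edit.end < edit.start || edit.end > source.length) {
      throw new RangeError(`Invalid span [${edit.start}, ${edit.end})`);
    }
    if (edit.end > prevStart) {
      throw new Error(`Overlapping edits at offset ${edit.start}`);
    }
    result = result.slice(0, edit.start) + edit.text + result.slice(edit.end);
    prevStart = edit.start;
  }
  return result;
}

// Rename `foo` to `bar` and insert a body between the braces.
const edited = applyEdits("function foo() {}", [
  { start: 9, end: 12, text: "bar" },
  { start: 16, end: 16, text: " return 1; " },
]);
console.log(edited);
```

Even this small helper has to handle ordering, bounds, and overlap, and it still knows nothing about whether the result parses.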

The solution

Plotnik extends Tree-sitter queries with type annotations:

(function_declaration
  name: (identifier) @name :: string
  body: (statement_block) @body
) @func :: FunctionInfo

The query describes structure, and Plotnik infers the output type:

interface FunctionInfo {
  name: string;
  body: SyntaxNode;
}

This structure is guaranteed by the query engine. No defensive programming needed.

Transactions are planned as the step after extraction: they will apply validated changes back to the source. The same typed nodes used for extraction become targets for transformation, completing the loop from source to structured data and back to source.

But what about Tree-sitter queries?

Tree-sitter already has queries:

(function_declaration
  name: (identifier) @name
  body: (statement_block) @body)

The result is a flat capture list:

query.matches(tree.rootNode);
// → [{ captures: [{ name: "name", node }, { name: "body", node }] }, ...]

The assembly layer is up to you:

const name = match.captures.find((c) => c.name === "name")?.node;
const body = match.captures.find((c) => c.name === "body")?.node;
if (!name || !body) throw new Error("Missing capture");
return { name: name.text, body };

This means string-based lookup, null checks, and manual type definitions kept in sync by convention.
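The assembly step can be factored into a reusable helper, but the underlying problem remains. The sketch below uses minimal stand-in types rather than the real Tree-sitter binding types, and `assemble` is a hypothetical helper name. Capture names are still plain strings that must match the query text by convention.

```typescript
// Minimal stand-ins for the binding types (assumed shapes, for illustration).
interface SyntaxNode {
  type: string;
  text: string;
}
interface Capture {
  name: string;
  node: SyntaxNode;
}
interface QueryMatch {
  captures: Capture[];
}

// Collect the named captures into a record, failing loudly if one is missing.
function assemble<K extends string>(
  match: QueryMatch,
  names: readonly K[],
): Record<K, SyntaxNode> {
  const out = {} as Record<K, SyntaxNode>;
  for (const name of names) {
    const capture = match.captures.find((c) => c.name === name);
    if (!capture) throw new Error(`Missing capture: @${name}`);
    out[name] = capture.node;
  }
  return out;
}

// Hand-built match standing in for one element of a query result.
const sampleMatch: QueryMatch = {
  captures: [
    { name: "name", node: { type: "identifier", text: "greet" } },
    { name: "body", node: { type: "statement_block", text: "{}" } },
  ],
};

const fn = assemble(sampleMatch, ["name", "body"] as const);
console.log(fn.name.text, fn.body.type);
```

Nothing ties `["name", "body"]` to the captures the query actually declares: a renamed capture compiles fine and fails only at runtime.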

Tree-sitter queries are designed for matching. Plotnik adds the typing layer: the query is the type definition.

Why Plotnik?

| Hand-written extraction    | Plotnik                       |
|----------------------------|-------------------------------|
| Manual navigation          | Declarative pattern matching  |
| Runtime type errors        | Compile-time type inference   |
| Repetitive extraction code | Single-query extraction       |
| Ad-hoc data structures     | Generated structs/interfaces  |

Plotnik extends Tree-sitter's query syntax with:

  • Named expressions for composition and reuse
  • Recursion for arbitrarily nested structures
  • Type annotations for precise output shapes
  • Alternations: untagged for simplicity, tagged for precision (discriminated unions)

Use cases

  • Scripting: Count patterns, extract metrics, audit dependencies
  • Custom linters: Encode your business rules and architecture constraints
  • LLM pipelines: Extract signatures and types as structured data for RAG
  • Code intelligence: Outline views, navigation, symbol extraction across grammars
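As a rough illustration of the scripting use case, the sketch below tallies results by variant. It assumes results arrive as objects carrying a `$tag` discriminant, the shape shown under Language design. The `Tagged` interface and `countByTag` helper are hypothetical names, and the input is hand-written sample data rather than real query output.

```typescript
// Assumed result shape: any object carrying a `$tag` discriminant.
interface Tagged {
  $tag: string;
}

// Count how many results fall under each tag.
function countByTag(items: readonly Tagged[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const item of items) {
    counts.set(item.$tag, (counts.get(item.$tag) ?? 0) + 1);
  }
  return counts;
}

// Hand-written sample standing in for query results.
const tagCounts = countByTag([
  { $tag: "Call" },
  { $tag: "Assign" },
  { $tag: "Call" },
]);
console.log(Object.fromEntries(tagCounts));
```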

Language design

Plotnik builds on Tree-sitter's query syntax, extending it with the features needed for typed extraction:

Statement = [
  Assign: (assignment_expression
    left: (identifier) @target :: string
    right: (Expression) @value)
  Call: (call_expression
    function: (identifier) @func :: string
    arguments: (arguments (Expression)* @args))
]

Expression = [
  Ident: (identifier) @name :: string
  Num: (number) @value :: string
]

TopDefinitions = (program (Statement)+ @statements)

This produces:

type Statement =
  | { $tag: "Assign"; $data: { target: string; value: Expression } }
  | { $tag: "Call"; $data: { func: string; args: Expression[] } };

type Expression =
  | { $tag: "Ident"; $data: { name: string } }
  | { $tag: "Num"; $data: { value: string } };

type TopDefinitions = {
  statements: [Statement, ...Statement[]];
};

Then process the results:

for (const stmt of result.statements) {
  switch (stmt.$tag) {
    case "Assign":
      console.log(`Assignment to ${stmt.$data.target}`);
      break;
    case "Call":
      console.log(
        `Call to ${stmt.$data.func} with ${stmt.$data.args.length} args`,
      );
      break;
  }
}
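Because the generated types are discriminated unions, the TypeScript compiler can also check that every variant is handled. The following sketch repeats the types from above so it stands alone. `assertNever`, `showExpression`, and `showStatement` are illustrative helper names, and the sample data is written by hand rather than produced by Plotnik.

```typescript
type Expression =
  | { $tag: "Ident"; $data: { name: string } }
  | { $tag: "Num"; $data: { value: string } };

type Statement =
  | { $tag: "Assign"; $data: { target: string; value: Expression } }
  | { $tag: "Call"; $data: { func: string; args: Expression[] } };

// If a new variant is added to a union and a switch below does not handle it,
// the call to assertNever stops type-checking.
function assertNever(value: never): never {
  throw new Error(`Unhandled variant: ${JSON.stringify(value)}`);
}

function showExpression(expr: Expression): string {
  switch (expr.$tag) {
    case "Ident":
      return expr.$data.name;
    case "Num":
      return expr.$data.value;
    default:
      return assertNever(expr);
  }
}

function showStatement(stmt: Statement): string {
  switch (stmt.$tag) {
    case "Assign":
      return `${stmt.$data.target} = ${showExpression(stmt.$data.value)}`;
    case "Call":
      return `${stmt.$data.func}(${stmt.$data.args.map(showExpression).join(", ")})`;
    default:
      return assertNever(stmt);
  }
}

// Hand-written sample data in the shape of the types above.
const sampleStatements: Statement[] = [
  { $tag: "Assign", $data: { target: "x", value: { $tag: "Num", $data: { value: "1" } } } },
  { $tag: "Call", $data: { func: "print", args: [{ $tag: "Ident", $data: { name: "x" } }] } },
];

const rendered = sampleStatements.map(showStatement);
console.log(rendered.join("\n"));
```

The `default` branches never run for well-typed data. They exist so that extending a query with a new tagged alternative surfaces every unhandled switch at compile time.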

For the detailed specification, see the Language Reference.

Supported languages

Plotnik uses arborium, a batteries-included Tree-sitter grammar collection with 60+ permissively licensed languages out of the box.

Roadmap

Ignition: the parser ✓

The foundation is complete: a resilient parser that recovers from errors and keeps going.

  • Resilient parser
  • Rich diagnostics
  • Name resolution
  • Recursion validation
  • Structural validation

Liftoff: type inference

The schema infrastructure is built. Type inference is next.

  • node-types.json parsing and schema representation
  • Proc macro for compile-time schema embedding
  • Statically bundled languages with node type info
  • Query validation against language schemas (unstable)
  • Type inference

Acceleration: query engine

  • Runtime execution with backtracking cursor walker
  • Query IR
  • Advanced validation powered by grammar.json (production rules, precedence)
  • Match result API with typed accessors

Orbit: developer experience

The CLI foundation exists. The full developer experience is ahead.

  • CLI framework with debug, docs, langs, exec, types commands
  • Query inspection: AST dump, symbol table, node arities, spans, transition graph, inferred types
  • Source inspection: Tree-sitter parse tree visualization
  • Execute queries against source code and output JSON (exec)
  • Generate TypeScript types from queries (types)
  • CLI distribution: Homebrew, cargo-binstall, npm wrapper
  • Compiled queries via Rust proc macros (zero-cost: query → native code)
  • Language bindings: TypeScript (WASM), Python, Ruby
  • LSP server: diagnostics, completions, hover, go-to-definition
  • Editor extensions: VS Code, Zed, Neovim

Acknowledgments

Max Brunsfeld created Tree-sitter; Amaan Qureshi and other contributors maintain the parser ecosystem that makes this project possible.

License

This project is licensed under the Apache License (Version 2.0).
