[5.x] Performance Optimizations for Stache and Query Operations #12894

hastinbe · 2025-10-27T20:29:47Z

I've implemented performance optimizations for Statamic's Stache and query operations. During our ongoing proprietary CMS migration, the team needs to load our full dataset, and that created a severe bottleneck: Stache warm-up was taking close to 6 minutes, and we were running it frequently during daily development. These changes focus on reducing redundant operations and improving algorithmic efficiency, resulting in a ~6x faster warm time and a ~50% memory reduction on a dataset of roughly 16k files.

Issues Addressed

Stache Warming

Items were being loaded and parsed multiple times during warming
File traversal could be more memory efficient
Stores were warmed one at a time; parallel processing is feasible for CLI operations (using the 'fork' driver)

Query Operations

in_array() calls in filterWhereIn() and filterWhereNotIn() were doing linear searches
Some repository methods weren't handling empty inputs efficiently
Dictionary search operations could benefit from hash lookups

Changes Made

Stache Optimizations

Load items once per store instead of once per index
Cache items during paths() resolution to avoid double parsing
Use RecursiveDirectoryIterator for more efficient file traversal
Add parallel processing support for CLI operations (fork driver)
Add new warming configuration options in config/stache.php
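
To illustrate the iterator-based traversal listed above, here is a minimal sketch using PHP's SPL iterators. The function name `iterateFiles()` and the hidden-file rule are assumptions for illustration; this is not the PR's Traverser code.

```php
<?php

declare(strict_types=1);

/**
 * Lazily yield the path of every visible regular file under $directory.
 * Hypothetical helper for illustration only.
 *
 * @return Generator<int, string>
 */
function iterateFiles(string $directory): Generator
{
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($directory, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        // $file is an SplFileInfo; skip hidden files such as .DS_Store.
        if ($file->isFile() && ! str_starts_with($file->getFilename(), '.')) {
            yield $file->getPathname();
        }
    }
}
```

Because the function yields one path at a time, a caller can loop over it with foreach without first building an array of every path in memory. The iteration order is whatever the filesystem returns, so callers that depend on a stable order would need to sort.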

Query/Repository Optimizations

Replace in_array() with an isset() lookup on an array_flip()ped array in filterWhereIn() and filterWhereNotIn()
Add early returns for empty inputs in repository methods
Optimize array operations to avoid unnecessary intermediate collections
Improve BasicDictionary::matchesSearchQuery() with hash lookups
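
The hash-lookup and early-return ideas above can be sketched as follows. `filterWhereInSketch()` and its row shape are hypothetical and only stand in for a whereIn-style filter; they are not the query builder's real code.

```php
<?php

declare(strict_types=1);

/**
 * Keep only the rows whose $column value appears in $allowed.
 * Hypothetical stand-in for a whereIn-style filter.
 */
function filterWhereInSketch(array $rows, string $column, array $allowed): array
{
    // Early return: nothing can match an empty list.
    if ($allowed === [] || $rows === []) {
        return [];
    }

    // Flip once so each allowed value becomes a key; membership is then a
    // hash lookup instead of a linear in_array() scan per row.
    // Assumes $allowed holds only ints and strings, which is all array_flip() accepts.
    $lookup = array_flip($allowed);

    $matches = [];
    foreach ($rows as $row) {
        $value = $row[$column] ?? null;
        // Only ints and strings are valid array keys for the lookup.
        if ((is_int($value) || is_string($value)) && isset($lookup[$value])) {
            $matches[] = $row;
        }
    }

    return $matches;
}
```

One design caveat: array_flip() only accepts integer and string values and warns about anything else, so nulls and nested arrays need separate handling. Array keys also normalize numeric strings, so '1' and 1 are treated as the same key.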

Performance Results

I ran benchmarks on a production dataset with 15,996 files to measure the impact:

Baseline:

Warm Time: 328.45s
Peak Memory: 829.77MB
Cleared: 40.02s

After Optimizations:

Warm Time: 53.07s (~6x faster)
Peak Memory: 416.92MB (~50% reduction)
Cleared: 517.46ms (~77x faster)

Query Operations:

filterWhereIn() with 1000 items: ~78x faster with hash lookups
BasicDictionary::matchesSearchQuery(): 1.29-1.63x faster depending on dataset size

The parallel processing improvement mainly applies to CLI operations using the 'fork' driver, as web requests can't fork processes.

Detailed Benchmark Breakdown

| Metric | Baseline | Optimized | Improvement |
| --- | --- | --- | --- |
| Warm Time | 328.45s | 53.07s | ~6x faster |
| Peak Memory | 829.77MB | 416.92MB | ~50% reduction |
| Cleared | 40.02s | 517.46ms | ~77x faster |
| Files | 15,996 | 15,996 | Same dataset |

Optimization Progression

| Step | Warm Time | Improvement vs. baseline |
| --- | --- | --- |
| Baseline | 328.45s | - |
| Single Item Load | 270.79s | 17.6% faster |
| Eliminate Double Parsing | 232.15s | 29.3% faster |
| Optimized File Traversal | 172.81s | 47.4% faster |
| Process-based + Redis | 202.81s | 38.3% faster |
| Concurrency + File | 42.05s | 87.2% faster |

Total Improvement: 286.40s reduction in warm time
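
Each percentage in the progression is measured against the 328.45s baseline, not against the previous step. A small sketch of that arithmetic, with hypothetical helper names:

```php
<?php

declare(strict_types=1);

/** Percentage of the baseline time saved, rounded to one decimal place. */
function percentFaster(float $baselineSeconds, float $optimizedSeconds): float
{
    return round((1 - $optimizedSeconds / $baselineSeconds) * 100, 1);
}

/** Absolute time saved in seconds, rounded to two decimal places. */
function secondsSaved(float $baselineSeconds, float $optimizedSeconds): float
{
    return round($baselineSeconds - $optimizedSeconds, 2);
}

$singleItemLoad = percentFaster(328.45, 270.79); // the "Single Item Load" row
$finalStep = percentFaster(328.45, 42.05);       // the "Concurrency + File" row
$totalSaved = secondsSaved(328.45, 42.05);       // the "Total Improvement" line
```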

Configuration Options

New warming configuration section added to config/stache.php:

'warming' => [
    // Enable parallel store processing for faster warming on multi-core systems
    'parallel_processing' => env('STATAMIC_STACHE_PARALLEL_WARMING', false),

    // Maximum number of parallel processes (0 = auto-detect CPU cores)
    'max_processes' => env('STATAMIC_STACHE_MAX_PROCESSES', 0),

    // Minimum number of stores required to enable parallel processing
    'min_stores_for_parallel' => env('STATAMIC_STACHE_MIN_STORES_PARALLEL', 3),

    // Concurrency driver: 'process', 'fork', or 'sync'
    'concurrency_driver' => env('STATAMIC_STACHE_CONCURRENCY_DRIVER', 'process'),
],

These options allow fine-tuning of the parallel processing behavior based on your server environment and requirements.

Note: The fork driver is not available on Windows systems. On Windows, the system will automatically fall back to the process driver or sequential processing.
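
The fallback described in the note could look roughly like the sketch below. `resolveConcurrencyDriver()` is a hypothetical helper, the pcntl check is an assumption about what forking needs in PHP, and falling back to 'sync' for unknown values is also an assumption; the PR's actual logic may differ.

```php
<?php

declare(strict_types=1);

/**
 * Pick the concurrency driver to use, falling back when 'fork' is unavailable.
 * Hypothetical helper for illustration only.
 */
function resolveConcurrencyDriver(string $configured, string $osFamily, bool $hasPcntl): string
{
    $known = ['process', 'fork', 'sync'];

    // Assumption: unknown values degrade to sequential processing.
    if (! in_array($configured, $known, true)) {
        return 'sync';
    }

    // Forking is not possible on Windows or without the pcntl extension.
    if ($configured === 'fork' && ($osFamily === 'Windows' || ! $hasPcntl)) {
        return 'process';
    }

    return $configured;
}

// Resolve for the current machine.
$driver = resolveConcurrencyDriver('fork', PHP_OS_FAMILY, function_exists('pcntl_fork'));
```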

Implementation Notes

All changes are backwards compatible with no breaking changes
Existing tests pass without modification
Proper fallbacks are in place for error conditions

- Load items once per store instead of once per index (45% faster warming)
- Eliminate double parsing by caching items during paths() resolution
- Replace Filesystem::allFiles() with RecursiveDirectoryIterator for better memory efficiency
- Add parallel store processing with Laravel Concurrency facade
- Fix AggregateStore key format issues and hidden file filtering
- Add early returns and filtering optimizations

Combined improvements:
- 45% faster warming (328s → 181s on 15,996 files)
- 83% fewer file operations
- Better memory efficiency for large directories
- 40-60% faster warming on multi-core systems
- Backwards compatible, no breaking changes

Signed-off-by: Beau Hastings <beau@saweet.net>

- Replace in_array() with isset(array_flip()) for O(1) hash lookups in filterWhereIn/filterWhereNotIn
- Add early returns to Repository methods for empty inputs and null checks
- Optimize array filtering operations with direct loops and early exits
- Optimize BasicDictionary::matchesSearchQuery() with isset() lookup (1.29-1.63x faster)

Key improvements:
- O(1) hash table lookups instead of O(n) linear searches
- Avoids unnecessary database queries for empty inputs
- Reduces memory allocation in array operations
- Better scalability for large datasets

Most beneficial for:
- Bulk actions in Control Panel (publish/delete 50+ entries)
- ID-based queries and search result processing
- Dictionary searches with many searchable fields
- Large dataset filtering operations

hastinbe · 2025-10-27T20:33:38Z

Test script https://gist.github.com/hastinbe/d6ec020c019131c7490623ea217d6755

Edit: I accidentally cancelled the checks below. Can someone re-run them?

duncanmcclean · 2025-10-27T20:39:03Z

Edit: I accidentally cancelled the checks below. Can someone re-run them?

Done

Signed-off-by: Beau Hastings <beau@saweet.net>

1) Tests\Stache\Repositories\EntryRepositoryTest::it_gets_entries_by_ids with data set "missing" (['numeric-one', 'unknown', 'numeric-three'], ['One', 'Three'])
Failed asserting that two arrays are equal.
--- Expected
+++ Actual
@@ @@
 Array (
     0 => 'One'
-    1 => 'Three'
+    2 => 'Three'
 )

Signed-off-by: Beau Hastings <beau@saweet.net>
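
The failure above shows keys 0 and 2 where 0 and 1 were expected. One common way to end up with that shape in PHP is filtering an array without reindexing it. The sketch below reproduces the pattern with made-up data that mirrors the test's data set; it is an assumption about the kind of bug involved, not a reading of the actual diff.

```php
<?php

declare(strict_types=1);

// Illustrative data mirroring the "missing" data set in the failing test.
$titlesById = ['numeric-one' => 'One', 'numeric-three' => 'Three'];
$requested = ['numeric-one', 'unknown', 'numeric-three'];

// Look up each requested ID; unknown IDs become null.
$found = array_map(fn (string $id) => $titlesById[$id] ?? null, $requested);

// array_filter() drops the null but keeps the original keys: [0 => 'One', 2 => 'Three'].
$filtered = array_filter($found, fn ($title) => $title !== null);

// array_values() reindexes to [0 => 'One', 1 => 'Three'], the shape the test expects.
$reindexed = array_values($filtered);
```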

src/Stache/Traverser.php

godismyjudge95 · 2025-10-28T04:11:47Z

First of all this is exactly what I've been wanting to work on but haven't had time, so thank you for getting a functional PR submitted!

I am going to leave a few comments with things I noted/implemented from my testing - feel free to disregard them.

Also, did you do any testing with a generator in the Traverser, passing it all the way down? https://www.php.net/manual/en/language.generators.overview.php
I was thinking this would essentially turn the whole thing into one big loop rather than looping over the paths multiple times. Let me know and I can provide the code I already have for it.

Thanks again for submitting this.
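
For readers unfamiliar with the generator idea raised in the comment above, here is a minimal sketch of chaining generators so that each path flows through every stage in a single pass. The stage names `pathsStage()` and `itemsStage()` and the item shape are hypothetical; this is not the code the commenter refers to.

```php
<?php

declare(strict_types=1);

/** Stage 1: lazily yield only the paths we care about (stands in for a traverser). */
function pathsStage(iterable $source): Generator
{
    foreach ($source as $path) {
        if (str_ends_with($path, '.md')) {
            yield $path;
        }
    }
}

/** Stage 2: turn each path into an item as it arrives, keyed by its path. */
function itemsStage(iterable $paths): Generator
{
    foreach ($paths as $path) {
        yield $path => ['id' => basename($path, '.md')];
    }
}

// The outer foreach pulls one path at a time through both stages,
// so no intermediate array of all paths or all items is built.
$ids = [];
foreach (itemsStage(pathsStage(['content/a.md', 'content/notes.txt', 'content/b.md'])) as $path => $item) {
    $ids[] = $item['id'];
}
```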

ryanmitchell · 2025-10-28T09:12:43Z

love this, thanks for your work

jasonvarga · 2025-10-28T14:07:46Z

Amazing 🤗

hastinbe · 2025-10-29T20:51:14Z

Revision

Loading all items once upfront broke parent-child relationships, so $item->parent() returned null.
RecursiveDirectoryIterator doesn't maintain the display order. Instead of reverting back to Filesystem::allFiles(), I used Symfony's Finder directly to preserve the memory efficiency of an iterator.
If desired, this also means we can drop the Filesystem dependency for the Traverser (which I did)

Even without the Single Item Load improvement, performance is roughly on par with the earlier results:

| Metric | Baseline | Optimized (Fixed) | Improvement |
| --- | --- | --- | --- |
| Warm Time | 328.45s | 46.00s | ~7.1x faster |
| Peak Memory | 829.77MB | 401.5MB | ~52% reduction |
| Clear Time | 40.02s | 562ms | ~71x faster |
| Files | 15,996 | 15,996 | Same dataset |

This fixes two bugs introduced by performance optimizations:

1. Parent-child relationships: Reverted Store::warm() optimization that loaded all items upfront, which broke parent relationships because items were loaded before the structure tree was built. Restored original behavior where each index loads items individually when parent context is available.
2. Entry ordering: Reverted Traverser to use Finder directly instead of either RecursiveDirectoryIterator or Filesystem::allFiles() to maintain original file traversal order, which affects entry display order in the UI.
3. Cleanup: Removed unused updateFromItems() method from Index class.

src/Stache/Indexes/Index.php

src/Stache/Stache.php

jasonvarga · 2025-11-06T21:09:03Z

Thanks for all this. Are you still working on it?

hastinbe · 2025-11-06T21:22:21Z

Thanks for all this. Are you still working on it?

Finished if there are no further suggestions from anyone

hastinbe added 2 commits October 27, 2025 15:18

hastinbe added 2 commits October 27, 2025 20:08

fix: linting issues

a0a6924

Signed-off-by: Beau Hastings <beau@saweet.net>

godismyjudge95 reviewed Oct 28, 2025

src/Stache/Traverser.php (outdated, resolved)

perf: optimize file traversal to reduce memory overhead

88f630e

Avoids using iterator_to_array() to save memory and increase iteration speed. Co-authored-by: Daniel Weaver <godismyjudge95@users.noreply.github.com> Signed-off-by: Beau Hastings <beau@saweet.net>

hastinbe marked this pull request as draft October 29, 2025 17:23

hastinbe force-pushed the feature/stache-performance-optimizations branch from cd64746 to 386f51a Compare October 29, 2025 21:08

hastinbe marked this pull request as ready for review November 2, 2025 14:29

jasonvarga reviewed Nov 5, 2025

src/Stache/Indexes/Index.php (outdated, resolved)

jasonvarga reviewed Nov 5, 2025

src/Stache/Stache.php (outdated, resolved)

src/Stache/Stache.php (outdated, resolved)

jasonvarga added 2 commits November 5, 2025 15:27

handle windows and mac

922f839

guard shell_exec

8cf2356

jasonvarga reviewed Nov 5, 2025

src/Stache/Stache.php (resolved)

hastinbe added 2 commits November 6, 2025 14:37

Fix closure serialization for process driver in parallel warming

f1f485d

Replace nested arrow functions with traditional closures using explicit use clauses to fix ArgumentCountError when using the 'process' concurrency driver.

cleanup: remove getItemValue from Index base class

c241001

Following the removal of ResolveValue usage in 386f51a, the getItemValue method in the Index base class is no longer needed.

jasonvarga merged commit 29f52ad into statamic:5.x Nov 10, 2025
24 checks passed

hastinbe mentioned this pull request Dec 5, 2025

[5.x] _indexes files are incorrect when parallel Stache warming #13265

Closed

jasonvarga mentioned this pull request Dec 5, 2025

[5.x] Fix whereNotIn error with nulls #13266

Merged

o1y mentioned this pull request Dec 5, 2025

Parallel Stache warming produces inconsistent Stache _indexes #13261

Open

aerni mentioned this pull request Jan 12, 2026

[5.x] Fix filterWhere with arrays #13507

Merged


5 participants
