@hastinbe
Contributor

I've implemented performance optimizations for Statamic's Stache and query operations. Our ongoing proprietary CMS migration, combined with the team's need to load our full dataset, created a severe bottleneck. The Stache warm-up time was reaching 6 minutes, a process we were executing frequently during daily development. These changes focus on reducing redundant operations and improving algorithmic efficiency, resulting in a ~6x faster warm time and a ~50% memory reduction across 16k files.

Issues Addressed

Stache Warming

  • Items were being loaded and parsed multiple times during warming
  • File traversal could be more memory efficient
  • Stores were warmed sequentially; parallel processing is now available for CLI operations (using the 'fork' driver)

Query Operations

  • in_array() calls in filterWhereIn() and filterWhereNotIn() were doing linear searches
  • Some repository methods weren't handling empty inputs efficiently
  • Dictionary search operations could benefit from hash lookups

Changes Made

Stache Optimizations

  • Load items once per store instead of once per index
  • Cache items during paths() resolution to avoid double parsing
  • Use RecursiveDirectoryIterator for more efficient file traversal
  • Add parallel processing support for CLI operations (fork driver)
  • Add new warming configuration options in config/stache.php
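
The iterator-based traversal mentioned above can be sketched as follows. This is a minimal illustration using PHP's built-in SPL iterators, not the PR's actual Traverser code; the function name `iterateFiles` and the hidden-file rule are assumptions made for the example.

```php
<?php

/**
 * Lazily yield file paths under $dir.
 * Hypothetical helper for illustration; not Statamic's Traverser.
 *
 * @return Generator<int, string>
 */
function iterateFiles(string $dir): Generator
{
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $file) {
        // $file is an SplFileInfo; skip hidden files such as .gitkeep
        if ($file->isFile() && $file->getFilename()[0] !== '.') {
            yield $file->getPathname();
        }
    }
}
```

Because paths are yielded one at a time, the full file list never has to sit in memory at once. Note that the iteration order depends on the filesystem, so callers that need a stable order have to sort the results themselves.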

Query/Repository Optimizations

  • Replace in_array() with an array_flip() lookup table checked via isset() in filterWhereIn() and filterWhereNotIn()
  • Add early returns for empty inputs in repository methods
  • Optimize array operations to avoid unnecessary intermediate collections
  • Improve BasicDictionary::matchesSearchQuery() with hash lookups
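
The hash-lookup and early-return ideas in this list can be sketched as follows. The function name `whereInFast` and its signature are hypothetical; this is not the actual filterWhereIn() implementation, only an illustration of the technique.

```php
<?php

/**
 * Keep the items whose value appears in $allowed.
 * Hypothetical helper; not the real filterWhereIn() signature.
 */
function whereInFast(array $items, array $allowed): array
{
    if ($items === [] || $allowed === []) {
        return []; // early return for empty input
    }

    // Build the lookup once: values become keys, so isset() is a hash lookup.
    // array_flip() only accepts int and string values.
    $lookup = array_flip($allowed);

    $matches = [];
    foreach ($items as $key => $value) {
        if (isset($lookup[$value])) {
            $matches[$key] = $value;
        }
    }

    return $matches;
}
```

With in_array() each check scans $allowed, so filtering n items against m values costs on the order of n × m comparisons. The flipped array costs one pass to build and then a constant-time check per item. One caveat: values are compared as array keys, so the integer 1 and the string '1' are treated as the same value.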

Performance Results

I ran benchmarks on a production dataset with 15,996 files to measure the impact:

Baseline:

  • Warm Time: 328.45s
  • Peak Memory: 829.77MB
  • Cleared: 40.02s

After Optimizations:

  • Warm Time: 53.07s (~6x faster)
  • Peak Memory: 416.92MB (~50% reduction)
  • Cleared: 517.46ms (~77x faster)

Query Operations:

  • filterWhereIn() with 1000 items: ~78x faster with hash lookups
  • BasicDictionary::matchesSearchQuery(): 1.29-1.63x faster depending on dataset size

The parallel processing improvement mainly applies to CLI operations using the 'fork' driver, as web requests can't fork processes.

Detailed Benchmark Breakdown

| Metric | Baseline | Optimized | Improvement |
| --- | --- | --- | --- |
| Warm Time | 328.45s | 53.07s | ~6x faster |
| Peak Memory | 829.77MB | 416.92MB | ~50% reduction |
| Cleared | 40.02s | 517.46ms | ~77x faster |
| Files | 15,996 | 15,996 | Same dataset |

Optimization Progression

| Step | Warm Time | Improvement |
| --- | --- | --- |
| Baseline | 328.45s | - |
| Single Item Load | 270.79s | 17.6% faster |
| Eliminate Double Parsing | 232.15s | 29.3% faster |
| Optimized File Traversal | 172.81s | 47.4% faster |
| Process-based + Redis | 202.81s | 38.3% faster |
| Concurrency + File | 42.05s | 87.2% faster |

Total Improvement: 286.40s reduction in warm time
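
The percentages in the progression figures are reductions in warm time relative to the 328.45s baseline, while the "x faster" figures elsewhere are baseline divided by optimized time. A small sketch of both calculations (the function names are made up for this example):

```php
<?php

/** Percentage of the baseline time saved, rounded to one decimal place. */
function percentReduction(float $baseline, float $optimized): float
{
    return round(($baseline - $optimized) / $baseline * 100, 1);
}

/** How many times faster the optimized run is, rounded to one decimal place. */
function speedup(float $baseline, float $optimized): float
{
    return round($baseline / $optimized, 1);
}

$singleItemLoad = percentReduction(328.45, 270.79); // 17.6
$concurrency    = percentReduction(328.45, 42.05);  // 87.2
$overall        = speedup(328.45, 53.07);           // 6.2
```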

Configuration Options

New warming configuration section added to config/stache.php:

'warming' => [
    // Enable parallel store processing for faster warming on multi-core systems
    'parallel_processing' => env('STATAMIC_STACHE_PARALLEL_WARMING', false),

    // Maximum number of parallel processes (0 = auto-detect CPU cores)
    'max_processes' => env('STATAMIC_STACHE_MAX_PROCESSES', 0),

    // Minimum number of stores required to enable parallel processing
    'min_stores_for_parallel' => env('STATAMIC_STACHE_MIN_STORES_PARALLEL', 3),

    // Concurrency driver: 'process', 'fork', or 'sync'
    'concurrency_driver' => env('STATAMIC_STACHE_CONCURRENCY_DRIVER', 'process'),
],

These options allow fine-tuning of the parallel processing behavior based on your server environment and requirements.

Note: The fork driver is not available on Windows systems. On Windows, the system will automatically fall back to the process driver or sequential processing.
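
A fallback like the one described in the note could look roughly like this. The helper `resolveWarmingDriver` is hypothetical, and the PR's actual fallback logic may differ; the sketch only assumes that forking requires the pcntl extension, which is unavailable on Windows.

```php
<?php

/**
 * Pick a usable concurrency driver.
 * Hypothetical helper; the PR's real fallback logic may differ.
 */
function resolveWarmingDriver(string $configured, string $osFamily, bool $hasPcntl): string
{
    // Forking needs the pcntl extension, which does not exist on Windows.
    if ($configured === 'fork' && ($osFamily === 'Windows' || ! $hasPcntl)) {
        return 'process';
    }

    // Unknown values fall back to sequential processing.
    return in_array($configured, ['process', 'fork', 'sync'], true) ? $configured : 'sync';
}

// Usage with the real runtime values:
$driver = resolveWarmingDriver('fork', PHP_OS_FAMILY, function_exists('pcntl_fork'));
```

Passing the OS family and extension check in as arguments keeps the decision easy to test on any platform.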

Implementation Notes

  • All changes are backwards compatible
  • Existing tests pass without modification
  • Proper fallbacks are in place for error conditions

- Load items once per store instead of once per index (45% faster warming)
- Eliminate double parsing by caching items during paths() resolution
- Replace Filesystem::allFiles() with RecursiveDirectoryIterator for better memory efficiency
- Add parallel store processing with Laravel Concurrency facade
- Fix AggregateStore key format issues and hidden file filtering
- Add early returns and filtering optimizations

Combined improvements:
- 45% faster warming (328s → 181s on 15,996 files)
- 83% fewer file operations
- Better memory efficiency for large directories
- 40-60% faster warming on multi-core systems
- Backwards compatible, no breaking changes

Signed-off-by: Beau Hastings <beau@saweet.net>
- Replace in_array() with isset(array_flip()) for O(1) hash lookups in filterWhereIn/filterWhereNotIn
- Add early returns to Repository methods for empty inputs and null checks
- Optimize array filtering operations with direct loops and early exits
- Optimize BasicDictionary::matchesSearchQuery() with isset() lookup (1.29-1.63x faster)

Key improvements:
- O(1) hash table lookups instead of O(n) linear searches
- Avoids unnecessary database queries for empty inputs
- Reduces memory allocation in array operations
- Better scalability for large datasets

Most beneficial for:
- Bulk actions in Control Panel (publish/delete 50+ entries)
- ID-based queries and search result processing
- Dictionary searches with many searchable fields
- Large dataset filtering operations
@hastinbe
Contributor Author

hastinbe commented Oct 27, 2025

Test script https://gist.github.com/hastinbe/d6ec020c019131c7490623ea217d6755

Edit: I accidentally cancelled the checks below. Can someone re-run them?

@duncanmcclean
Member

Edit: I accidentally cancelled the checks below. Can someone re-run them?

Done

Signed-off-by: Beau Hastings <beau@saweet.net>
1) Tests\Stache\Repositories\EntryRepositoryTest::it_gets_entries_by_ids with data set "missing" (['numeric-one', 'unknown', 'numeric-three'], ['One', 'Three'])
Failed asserting that two arrays are equal.
--- Expected
+++ Actual
@@ @@
 Array (
     0 => 'One'
-    1 => 'Three'
+    2 => 'Three'
 )
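
A failure of this shape (key 2 where key 1 was expected) typically appears when missing items are skipped while the original positions are kept as keys. The sketch below reproduces that pattern under this assumption; the data and variable names are stand-ins, and the actual fix in the PR is not shown in this conversation.

```php
<?php

// Hypothetical stand-in for a store lookup: 'unknown' has no matching entry.
$titlesById = ['numeric-one' => 'One', 'numeric-three' => 'Three'];
$ids = ['numeric-one', 'unknown', 'numeric-three'];

$found = [];
foreach ($ids as $position => $id) {
    if (isset($titlesById[$id])) {
        $found[$position] = $titlesById[$id]; // keeps the original position as key
    }
}
// $found is [0 => 'One', 2 => 'Three'], which no longer equals ['One', 'Three'].

$reindexed = array_values($found); // [0 => 'One', 1 => 'Three']
```

Reindexing with array_values() after filtering restores sequential keys.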

Signed-off-by: Beau Hastings <beau@saweet.net>
@godismyjudge95
Contributor

First of all this is exactly what I've been wanting to work on but haven't had time, so thank you for getting a functional PR submitted!

I am going to leave a few comments with things I noted/implemented from my testing - feel free to disregard them.

Also, did you do any testing in regards to using a generator in the Traverser and then passing that all the way down? https://www.php.net/manual/en/language.generators.overview.php
I was thinking this would essentially make the entire thing one big loop rather than looping over the paths multiple times. Let me know and I can provide the code I already have for it.
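
To illustrate the generator idea raised here, the following sketch chains three generators so that each path flows through every stage before the next path is read. All function names and the fake "parsing" step are assumptions for the example; this is not the commenter's code or Statamic's Traverser.

```php
<?php

/** Yield paths one at a time (stand-in for a traverser). */
function paths(array $source): Generator
{
    foreach ($source as $path) {
        yield $path;
    }
}

/** Lazily keep only paths with the given extension. */
function withExtension(iterable $paths, string $extension): Generator
{
    foreach ($paths as $path) {
        if (pathinfo($path, PATHINFO_EXTENSION) === $extension) {
            yield $path;
        }
    }
}

/** Lazily turn each path into an item array; parsing is faked here. */
function toItems(iterable $paths): Generator
{
    foreach ($paths as $path) {
        yield ['path' => $path, 'slug' => pathinfo($path, PATHINFO_FILENAME)];
    }
}

// Nothing runs until the pipeline is iterated.
$pipeline = toItems(withExtension(paths(['a.md', 'b.yaml', 'c.md']), 'md'));
```

Iterating `$pipeline` with a single foreach drives all three stages together, so the path list is walked once and no intermediate arrays are built.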

Thanks again for submitting this.

@ryanmitchell
Contributor

love this, thanks for your work

Avoids using iterator_to_array() to save memory and increase iteration speed.

Co-authored-by: Daniel Weaver <godismyjudge95@users.noreply.github.com>
Signed-off-by: Beau Hastings <beau@saweet.net>
@jasonvarga
Member

Amazing 🤗

@hastinbe hastinbe marked this pull request as draft October 29, 2025 17:23
@hastinbe
Contributor Author

Revision

  • Loading all items once upfront broke parent-child relationships, so $item->parent() returned null.
  • RecursiveDirectoryIterator doesn't preserve the display order. Instead of reverting to Filesystem::allFiles(), I used Symfony's Finder directly, which keeps the memory efficiency of an iterator.
  • If desired, this also means we can drop the Filesystem dependency from the Traverser (which I did).

Even without the Single Item Load improvement, performance is about the same:

| Metric | Baseline | Optimized (Fixed) | Improvement |
| --- | --- | --- | --- |
| Warm Time | 328.45s | 46.00s | ~7.1x faster |
| Peak Memory | 829.77MB | 401.5MB | ~52% reduction |
| Clear Time | 40.02s | 562ms | ~71x faster |
| Files | 15,996 | 15,996 | Same dataset |

This fixes two bugs introduced by the performance optimizations and removes one unused method:

1. Parent-child relationships: Reverted Store::warm() optimization that
   loaded all items upfront, which broke parent relationships because items
   were loaded before the structure tree was built. Restored original
   behavior where each index loads items individually when parent context
   is available.

2. Entry ordering: Reverted Traverser to use Finder directly instead of
   either RecursiveDirectoryIterator or Filesystem::allFiles() to maintain
   original file traversal order, which affects entry display order in the UI.

3. Cleanup: Removed unused updateFromItems() method from Index class.
@hastinbe hastinbe force-pushed the feature/stache-performance-optimizations branch from cd64746 to 386f51a Compare October 29, 2025 21:08
@hastinbe hastinbe marked this pull request as ready for review November 2, 2025 14:29
Replace nested arrow functions with traditional closures using explicit use clauses to fix ArgumentCountError when using the 'process' concurrency driver.
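
The kind of rewrite this commit describes can be sketched in plain PHP as below. The variable names are invented for the example. Both forms behave identically when called in the same process; the ArgumentCountError reported above was specific to the 'process' driver, presumably because that driver has to serialize the closure to run it in a child process (an assumption, not something stated in this conversation).

```php
<?php

$handles = ['blog', 'pages'];
$prefix = 'store:';

// Before: an arrow function nested inside another arrow function.
// Both capture $prefix implicitly from the enclosing scope.
$nested = array_map(
    fn (string $handle) => fn () => $prefix . $handle,
    $handles
);

// After: traditional closures that list every captured variable.
$explicit = array_map(
    function (string $handle) use ($prefix) {
        return function () use ($prefix, $handle) {
            return $prefix . $handle;
        };
    },
    $handles
);
```

The explicit `use` clauses make every captured variable visible at each nesting level.
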
Following the removal of ResolveValue usage in 386f51a, the getItemValue
method in the Index base class is no longer needed.
@jasonvarga
Member

Thanks for all this. Are you still working on it?

@hastinbe
Contributor Author

hastinbe commented Nov 6, 2025

Thanks for all this. Are you still working on it?

Finished if there are no further suggestions from anyone

@jasonvarga jasonvarga merged commit 29f52ad into statamic:5.x Nov 10, 2025
24 checks passed

5 participants