Skip to content

CLI: Add log rotation for process manager children logs#3132

Merged
bcotrim merged 5 commits into
trunkfrom
stu-1419-implement-log-rotation-for-pm-children-logs
Apr 21, 2026
Merged

CLI: Add log rotation for process manager children logs#3132
bcotrim merged 5 commits into
trunkfrom
stu-1419-implement-log-rotation-for-pm-children-logs

Conversation

@bcotrim

@bcotrim bcotrim commented Apr 17, 2026

Copy link
Copy Markdown
Contributor

Related issues

How AI was used in this PR

Built collaboratively via Claude Code. I iterated with the assistant on the design (migration mechanism vs. daemon-side logic, mtime vs. date-tag as the pruning signal, legacy file handling), reviewed each proposed change, and verified the implementation and tests before each iteration. Tests, eslint, and typecheck were run after each change.

Proposed Changes

The PM daemon was appending stdout/stderr from Playground CLI child processes to ~/.studio/daemon/logs/{name}-{out|error}.log forever, with no rotation or cleanup. This change bounds the log dir via two cooperating pieces:

1. Daemon — dated log filenames

  • New log streams are opened at {name}-{out|error}-YYYYMMDD.log, where YYYYMMDD is the date at session start. Same-day restarts append to the existing file; next-day starts create a fresh file, leaving yesterday's file on disk to age out naturally.

2. prunePmLogs() sweep (apps/cli/lib/prune-pm-logs.ts)

  • Called from the CLI's yargs middleware (same place migrations run) on every studio command.
  • Prunes any log-pattern file ({name}-{out|error}(-YYYYMMDD)?.log) whose mtime is older than 14 days. Uniform rule for both legacy undated files and dated files.
  • Uses mtime (not the filename date tag) as the prune signal, so actively-written files (e.g. a long-running session that spans multiple days) are never deleted — safe on Windows, safe for continuously-running sites.
  • Files that don't match the log pattern are left alone.

Files touched

  • apps/cli/process-manager-daemon.ts — dated log filenames
  • apps/cli/lib/prune-pm-logs.ts — single prunePmLogs() function
  • apps/cli/lib/tests/prune-pm-logs.test.ts — unit tests (6 cases)
  • apps/cli/index.ts — invoke prunePmLogs() in the yargs middleware

Review changes

  • Moved prune out of migrations/ and into apps/cli/lib/ as a plain function — it's really a cron-shaped task, not a one-shot migration.
  • Dropped the daemon-side legacy-file adoption (renameLegacyLogIfPresent) — the mtime-based sweep already handles legacy files correctly, and that code path becomes dead within 14 days of merge.
  • Collapsed the two-phase needsToRun/run split into a single pass (one readdir + stats, one unlink loop).

Known limitations / follow-ups

  • No mid-session daily rollover. A site running continuously for weeks stays in its session-start-date file. Safe (mtime gate prevents deletion) but not pretty. Could add a daemon timer in a follow-up if we find this matters in practice.
  • Retention is hardcoded at 14 days. Easy to expose via env/config if we ever need tuning.
  • Cronjob interface generalization. prunePmLogs() and the existing updateServerFiles() are both cron-shaped tasks. A follow-up could extract a shared interface with per-task frequency and last-run timestamps in cli.json.

Testing Instructions

Automated

npm test -- apps/cli/lib/tests/prune-pm-logs.test.ts  # 6/6 pass
npm test -- apps/cli/tests/daemon.test.ts             # passes
npm --workspace apps/cli run typecheck                # clean

Manual smoke tests

Build the CLI first:

npm run cli:build

Use an isolated logs dir to keep your real ~/.studio untouched:

export STUDIO_PROCESS_MANAGER_HOME=/tmp/studio-smoketest
rm -rf "$STUDIO_PROCESS_MANAGER_HOME"
mkdir -p "$STUDIO_PROCESS_MANAGER_HOME/logs"

1. Dated filenames on fresh start

node apps/cli/dist/cli/main.mjs site start <some-site>
ls -la "$STUDIO_PROCESS_MANAGER_HOME/logs/" | grep <site-uuid>

Expect: *-out-YYYYMMDD.log and *-error-YYYYMMDD.log with today's date. Each log line is prefixed with an ISO timestamp.

2. Pruning sweep (mixed stale + fresh + non-log)

# Seed the dir with a mix of files
echo stale  > "$STUDIO_PROCESS_MANAGER_HOME/logs/site-out-20230101.log"
touch -t $(date -v-30d +%Y%m%d)0000 "$STUDIO_PROCESS_MANAGER_HOME/logs/site-out-20230101.log"
echo legacy > "$STUDIO_PROCESS_MANAGER_HOME/logs/old-site-error.log"
touch -t $(date -v-30d +%Y%m%d)0000 "$STUDIO_PROCESS_MANAGER_HOME/logs/old-site-error.log"
echo active > "$STUDIO_PROCESS_MANAGER_HOME/logs/active-site-out-20250101.log"
touch -t $(date -v-1d  +%Y%m%d)0000 "$STUDIO_PROCESS_MANAGER_HOME/logs/active-site-out-20250101.log"
echo keep   > "$STUDIO_PROCESS_MANAGER_HOME/logs/README.txt"
touch -t $(date -v-30d +%Y%m%d)0000 "$STUDIO_PROCESS_MANAGER_HOME/logs/README.txt"

# Trigger the middleware via any real subcommand (--version/--help short-circuit yargs and skip middleware)
node apps/cli/dist/cli/main.mjs site list > /dev/null

ls "$STUDIO_PROCESS_MANAGER_HOME/logs/"

Expect after the run:

  • site-out-20230101.logdeleted (stale dated)
  • old-site-error.logdeleted (stale legacy undated)
  • active-site-out-20250101.logpreserved (fresh mtime despite ancient filename tag)
  • README.txtpreserved (not a log-pattern file)

3. Same-day restart appends

node apps/cli/dist/cli/main.mjs site stop  <site>
node apps/cli/dist/cli/main.mjs site start <site>
ls "$STUDIO_PROCESS_MANAGER_HOME/logs/" | grep <uuid>

Expect: one *-out-YYYYMMDD.log per stream; file size grew. No -001 suffix.

4. Next-day start creates a fresh file
Simulate by creating a file with yesterday's tag, then start a site today:

touch "$STUDIO_PROCESS_MANAGER_HOME/logs/<uuid>-out-$(date -v-1d +%Y%m%d).log"
node apps/cli/dist/cli/main.mjs site start <site>
ls "$STUDIO_PROCESS_MANAGER_HOME/logs/" | grep <uuid>

Expect: both yesterday's and today's files coexist; today's was just created.

Cleanup: rm -rf $STUDIO_PROCESS_MANAGER_HOME

Pre-merge Checklist

  • Have you checked for TypeScript, React or other console errors?

@bcotrim bcotrim self-assigned this Apr 18, 2026
@bcotrim bcotrim marked this pull request as ready for review April 20, 2026 07:06
@bcotrim bcotrim requested a review from a team April 20, 2026 07:06
@wpmobilebot

wpmobilebot commented Apr 20, 2026

Copy link
Copy Markdown
Collaborator

📊 Performance Test Results

Comparing 774b913 vs trunk

app-size

Metric trunk 774b913 Diff Change
App Size (Mac) 1490.95 MB 1490.96 MB +0.00 MB ⚪ 0.0%

site-editor

Metric trunk 774b913 Diff Change
load 1905 ms 1582 ms 323 ms 🟢 -17.0%

site-startup

Metric trunk 774b913 Diff Change
siteCreation 8138 ms 8101 ms 37 ms ⚪ 0.0%
siteStartup 4963 ms 4956 ms 7 ms ⚪ 0.0%

Results are median values from multiple test runs.

Legend: 🟢 Improvement (faster) | 🔴 Regression (slower) | ⚪ No change (<50ms diff)

@fredrikekelund fredrikekelund left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fundamentals LGTM. I still left some comments that would be good to address before merging, though

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure I'd count this as a migration, since it's expected to run continuously… I think cron job is a better analogy.

I think we should just put this somewhere other than the migrations dir.

Comment thread apps/cli/process-manager-daemon.ts Outdated
Comment on lines +63 to +86
async function renameLegacyLogIfPresent(
processName: string,
stream: 'out' | 'error'
): Promise< void > {
const legacyPath = path.join( PROCESS_MANAGER_LOGS_DIR, `${ processName }-${ stream }.log` );
let stats: fs.Stats;
try {
stats = await fs.promises.stat( legacyPath );
} catch {
return;
}
const targetPath = path.join(
PROCESS_MANAGER_LOGS_DIR,
`${ processName }-${ stream }-${ formatLogDateTag( stats.mtime ) }.log`
);
if ( fs.existsSync( targetPath ) ) {
return;
}
try {
await fs.promises.rename( legacyPath, targetPath );
} catch {
// Best-effort; the migration will prune the legacy file eventually by mtime.
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In one sense, this is a nice addition, but in another, it will address a non-issue 14 days after we merge the other parts of this PR. Do you still think it's still a worthwhile addition, @bcotrim?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not, it made sense in one of the earlier explorations, but not in the final state.
Good catch


export const pruneOldPmLogs: Migration = {
async needsToRun() {
const stale = await findStalePmLogs( Date.now() );

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we'll keep using the migration interface for this file, but if we do, we should call findStalePmLogs only once. However minor the performance impact, there's really no good reason to call it twice.

@bcotrim bcotrim merged commit 33cb565 into trunk Apr 21, 2026
11 checks passed
@bcotrim bcotrim deleted the stu-1419-implement-log-rotation-for-pm-children-logs branch April 21, 2026 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants