CLI: Add log rotation for process manager children logs#3132
Conversation
📊 Performance Test ResultsComparing 774b913 vs trunk app-size
site-editor
site-startup
Results are median values from multiple test runs. Legend: 🟢 Improvement (faster) | 🔴 Regression (slower) | ⚪ No change (<50ms diff) |
fredrikekelund
left a comment
There was a problem hiding this comment.
The fundamentals LGTM. I still left some comments that would be good to address before merging, though
There was a problem hiding this comment.
I'm not so sure I'd count this as a migration, since it's expected to run continuously… I think cron job is a better analogy.
I think we should just put this somewhere other than the migrations dir.
| async function renameLegacyLogIfPresent( | ||
| processName: string, | ||
| stream: 'out' | 'error' | ||
| ): Promise< void > { | ||
| const legacyPath = path.join( PROCESS_MANAGER_LOGS_DIR, `${ processName }-${ stream }.log` ); | ||
| let stats: fs.Stats; | ||
| try { | ||
| stats = await fs.promises.stat( legacyPath ); | ||
| } catch { | ||
| return; | ||
| } | ||
| const targetPath = path.join( | ||
| PROCESS_MANAGER_LOGS_DIR, | ||
| `${ processName }-${ stream }-${ formatLogDateTag( stats.mtime ) }.log` | ||
| ); | ||
| if ( fs.existsSync( targetPath ) ) { | ||
| return; | ||
| } | ||
| try { | ||
| await fs.promises.rename( legacyPath, targetPath ); | ||
| } catch { | ||
| // Best-effort; the migration will prune the legacy file eventually by mtime. | ||
| } | ||
| } |
There was a problem hiding this comment.
In one sense, this is a nice addition, but in another, it will address a non-issue 14 days after we merge the other parts of this PR. Do you still think it's still a worthwhile addition, @bcotrim?
There was a problem hiding this comment.
Probably not, it made sense in one of the earlier explorations, but not in the final state.
Good catch
|
|
||
| export const pruneOldPmLogs: Migration = { | ||
| async needsToRun() { | ||
| const stale = await findStalePmLogs( Date.now() ); |
There was a problem hiding this comment.
I'm not sure we'll keep using the migration interface for this file, but if we do, we should call findStalePmLogs only once. However minor the performance impact, there's really no good reason to call it twice.
…og-rotation-for-pm-children-logs
Related issues
How AI was used in this PR
Built collaboratively via Claude Code. I iterated with the assistant on the design (migration mechanism vs. daemon-side logic, mtime vs. date-tag as the pruning signal, legacy file handling), reviewed each proposed change, and verified the implementation and tests before each iteration. Tests, eslint, and typecheck were run after each change.
Proposed Changes
The PM daemon was appending stdout/stderr from Playground CLI child processes to
~/.studio/daemon/logs/{name}-{out|error}.logforever, with no rotation or cleanup. This change bounds the log dir via two cooperating pieces:1. Daemon — dated log filenames
{name}-{out|error}-YYYYMMDD.log, whereYYYYMMDDis the date at session start. Same-day restarts append to the existing file; next-day starts create a fresh file, leaving yesterday's file on disk to age out naturally.2.
prunePmLogs()sweep (apps/cli/lib/prune-pm-logs.ts)studiocommand.{name}-{out|error}(-YYYYMMDD)?.log) whosemtimeis older than 14 days. Uniform rule for both legacy undated files and dated files.mtime(not the filename date tag) as the prune signal, so actively-written files (e.g. a long-running session that spans multiple days) are never deleted — safe on Windows, safe for continuously-running sites.Files touched
apps/cli/process-manager-daemon.ts— dated log filenamesapps/cli/lib/prune-pm-logs.ts— singleprunePmLogs()functionapps/cli/lib/tests/prune-pm-logs.test.ts— unit tests (6 cases)apps/cli/index.ts— invokeprunePmLogs()in the yargs middlewareReview changes
migrations/and intoapps/cli/lib/as a plain function — it's really a cron-shaped task, not a one-shot migration.renameLegacyLogIfPresent) — the mtime-based sweep already handles legacy files correctly, and that code path becomes dead within 14 days of merge.needsToRun/runsplit into a single pass (onereaddir+stats, one unlink loop).Known limitations / follow-ups
prunePmLogs()and the existingupdateServerFiles()are both cron-shaped tasks. A follow-up could extract a shared interface with per-task frequency and last-run timestamps incli.json.Testing Instructions
Automated
Manual smoke tests
Build the CLI first:
Use an isolated logs dir to keep your real
~/.studiountouched:1. Dated filenames on fresh start
Expect:
*-out-YYYYMMDD.logand*-error-YYYYMMDD.logwith today's date. Each log line is prefixed with an ISO timestamp.2. Pruning sweep (mixed stale + fresh + non-log)
Expect after the run:
site-out-20230101.log— deleted (stale dated)old-site-error.log— deleted (stale legacy undated)active-site-out-20250101.log— preserved (fresh mtime despite ancient filename tag)README.txt— preserved (not a log-pattern file)3. Same-day restart appends
Expect: one
*-out-YYYYMMDD.logper stream; file size grew. No-001suffix.4. Next-day start creates a fresh file
Simulate by creating a file with yesterday's tag, then start a site today:
Expect: both yesterday's and today's files coexist; today's was just created.
Cleanup:
rm -rf $STUDIO_PROCESS_MANAGER_HOMEPre-merge Checklist