feat(pivotp): Smarter pivotp with moarstats #3606
Merged
When moarstats has been run on input data, pivotp's `--agg smart` mode now uses advanced statistics (kurtosis, bimodality, outlier profile, entropy, Gini coefficient) for better aggregation choices. Also makes moarstats regenerate the `.stats.csv.data.jsonl` cache so downstream commands can see the enriched statistics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- pivotp: differentiate string branch when normalized_entropy is available; high-cardinality and low-entropy cases now select First instead of duplicating Len across all branches
- moarstats: guard against non-UTF-8 output paths with early return instead of silently using an empty string
- tests: clarify bimodal test assertion with comment explaining when each aggregation (Len vs Median) fires

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pull request overview
This PR extends `pivotp --agg smart` to leverage additional statistics produced by `moarstats`, by plumbing moarstats-generated columns through the stats JSONL cache and adding targeted integration tests.
Changes:
- Add moarstats-derived fields (e.g., kurtosis, bimodality coefficient, entropy, outlier profile) to `StatsData` and the stats-to-JSONL typing map.
- Update `pivotp` smart aggregation selection logic to consider bimodality, outliers, kurtosis, Gini, and entropy (when available).
- Regenerate the `.stats.csv.data.jsonl` cache at the end of `moarstats` so other “smart” commands can read the new columns; add pivotp tests for the new behavior.
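
As a rough illustration of the selection logic described above, the following sketch shows how optional moarstats fields could steer the choice of aggregation. It is not the code from `src/cmd/pivotp.rs`: the `ColStats` struct, the `Agg` enum, the `pick_agg` function, and every threshold are hypothetical. Only the string-branch behavior (First for high-cardinality or low-entropy columns when `normalized_entropy` is available, Len otherwise) follows the commit message; the numeric branch is an assumption based on the common practice of preferring the median for heavy-tailed data.

```rust
/// Hypothetical subset of aggregations; the real pivotp supports more.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Agg {
    First,
    Len,
    Median,
    Mean,
}

/// Hypothetical per-column statistics. The optional fields are `None`
/// when moarstats has not been run on the input.
#[derive(Default)]
pub struct ColStats {
    pub is_string: bool,
    pub cardinality: u64,
    pub row_count: u64,
    pub normalized_entropy: Option<f64>,
    pub kurtosis: Option<f64>,
}

/// Pick an aggregation, falling back to simple defaults when the
/// moarstats-derived fields are absent. All thresholds are assumptions.
pub fn pick_agg(s: &ColStats) -> Agg {
    if s.is_string {
        return match s.normalized_entropy {
            Some(entropy) => {
                // Assumed cutoffs: >90% unique values, or entropy below 0.1.
                let high_cardinality = s.row_count > 0
                    && (s.cardinality as f64 / s.row_count as f64) > 0.9;
                if high_cardinality || entropy < 0.1 {
                    Agg::First
                } else {
                    Agg::Len
                }
            }
            // Without entropy information, count the values.
            None => Agg::Len,
        };
    }
    // Numeric branch (assumption): heavy tails favor a robust statistic.
    match s.kurtosis {
        Some(k) if k > 3.0 => Agg::Median,
        _ => Agg::Mean,
    }
}
```

Keeping the moarstats fields as `Option` values means the same function works with and without the enriched cache, which mirrors the "when available" wording in the change list.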
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/test_pivotp.rs | Adds integration tests for smart aggregation with/without moarstats-generated stats. |
| src/cmd/stats.rs | Extends StatsData and STATSDATA_TYPES_MAP to include moarstats columns for JSONL cache deserialization. |
| src/cmd/pivotp.rs | Enhances --agg smart logic using moarstats metrics and updates CLI help text accordingly. |
| src/cmd/moarstats.rs | Regenerates stats JSONL cache after writing the augmented stats CSV. |
…advanced to tests
- moarstats: use if-let instead of early return to avoid skipping temp file cleanup
- pivotp USAGE: clarify which stats need --advanced vs base moarstats
- tests: add --advanced flag to kurtosis/bimodal tests, fix misleading comment

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
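
The if-let change in the commit above can be pictured with a minimal sketch. The function `finish_run` and its `temp_files` parameter are hypothetical stand-ins, not names from `src/cmd/moarstats.rs`; the only real API used is `Path::to_str`, which returns `None` for paths that are not valid UTF-8. The point is that wrapping the optional step in `if let` lets execution fall through to the cleanup, whereas an early `return` in the `None` case would skip it.

```rust
use std::path::Path;

/// Hypothetical end-of-run step: regenerate the cache if the output path
/// is valid UTF-8, then always clean up temp files.
/// Returns whether the cache step ran.
pub fn finish_run(output: &Path, temp_files: &mut Vec<String>) -> bool {
    let mut regenerated = false;

    // `to_str` yields None for non-UTF-8 paths; in that case the cache
    // step is skipped instead of running with an empty string.
    if let Some(path_str) = output.to_str() {
        // Placeholder for the real cache regeneration (assumption).
        regenerated = !path_str.is_empty();
    }

    // Cleanup runs on every path through the function because nothing
    // above returns early.
    temp_files.clear();

    regenerated
}
```

With a valid path such as `Path::new("data.stats.csv")`, the function reports that the cache step ran and leaves `temp_files` empty; with a non-UTF-8 path it would report `false` but still clear the list.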