
feat(parcacol): use distinct for label discovery to reduce rows returned #5921

Merged
brancz merged 2 commits into parca-dev:main from secfree:250825-reduce-labels-rows
Sep 17, 2025

Contributor

@secfree secfree commented Aug 26, 2025

Replace Project with Distinct in Querier.Labels, so the query emits each unique combination of label column values only once. This significantly reduces the number of rows materialized and returned.

In my cluster, the Labels query returned more than 100 million rows from the underlying database. This PR reduces that to fewer than 50,000.

@secfree secfree requested a review from a team as a code owner August 26, 2025 04:39
Contributor Author

secfree commented Sep 17, 2025

@brancz could you help review this PR? Thank you.

Member

@brancz brancz left a comment


nice

Comment thread pkg/parcacol/querier.go
-		Project(logicalplan.DynCol(profile.ColumnLabels)).
+		Distinct(logicalplan.DynCol(profile.ColumnLabels)).
 		Execute(ctx, func(ctx context.Context, r arrow.Record) error {
 			r.Retain()
Member


good catch

Member

brancz commented Sep 17, 2025

thanks for the ping, I was on PTO and this pushed it to the top of my inbox

@brancz brancz merged commit b8bd784 into parca-dev:main Sep 17, 2025
32 of 34 checks passed
2 participants