pkg/query: Modify source report to return matched filenames#6157
Conversation
Updated the record filtering logic in sourceReportBuilder to treat an empty buildID as matching all records. So the new behaviour is: When buildId is empty → it skips buildId filtering, aggregates metrics from all builds matching the filename When buildId is provided → filters to that specific build
Introduces a matched_filenames field to the Source proto and related code, allowing clients to see all filenames that matched a query
|
✅ Meticulous spotted 0 visual differences across 328 screens tested: view results. Meticulous evaluated ~4 hours of user flows against your PR. Expected differences? Click here. Last updated for commit b517813. This comment will update as new commits are pushed. |
…ames The Source report Arrow record now includes a dictionary-encoded filename column, along with line_number, cumulative, and flat columns.
c2f535c to
91cb343
Compare
742e10c to
acb9398
Compare
acb9398 to
adc39ab
Compare
| lineData map[int64]*lineMetrics | ||
| cumulative int64 | ||
| lineData map[lineKey]*lineMetrics | ||
| filenameIntern map[string]string |
There was a problem hiding this comment.
I don't think we need this anymore, do we?
There was a problem hiding this comment.
Initially added this to deduplicate filename strings during processing i.e. if the same file appears 100 times in the profile, we'd create 100 separate string allocations as map keys.
There was a problem hiding this comment.
we already do this via the lineKey, and I'm not even sure if this would have done anything in go, because strings are copied in most cases by the go runtime
There was a problem hiding this comment.
or alteratively we make it map[string][]lineMetrics and add line number to the linemetrics struct and then we only need to sort the file names, and line metrics individually by file
There was a problem hiding this comment.
i like that. i'll make the change
| // It returns true if: | ||
| // - They are exactly equal, OR | ||
| // - profileFilename ends with queryFilename AND is preceded by a '/' (path boundary). | ||
| func filenameMatches(profileFilename, queryFilename []byte) bool { |
There was a problem hiding this comment.
Hmm but this is what we use for actually implementing suffix matching for local and build paths.
So for example, if the client passes pkg/sources/server.go locally, but the profiling data has /home/ci/build/pkg/sources/server.go from the build, filenameMatches will help with suffix matching to get the actual file name.
Or you're thinking we just return the data for every file found in the profiling data and then in the client, do a filter with the source file we care about?
There was a problem hiding this comment.
If we do the latter that might mean a larger response size right?
There was a problem hiding this comment.
apologies, you're right, in the future we should have this be part of the query against the database
This PR modifies the source report to return matching filenames for when the request is called without a buildId.
Changes