Add various object store cache Prometheus metrics#519
Merged
Conversation
mildbyte
commented
Apr 23, 2024
| self.inner.abort_multipart(location, multipart_id).await | ||
| } | ||
|
|
||
| async fn get( |
Contributor
Author
There was a problem hiding this comment.
(get is implemented in terms of get_opts nowadays, so just doing get_opts is sufficient)
mildbyte
commented
Apr 23, 2024
| ) -> BoxStream<'_, object_store::Result<ObjectMeta>> { | ||
| self.inner.list(prefix) | ||
| let start = Instant::now(); | ||
| let result = self.inner.list(prefix); |
Contributor
Author
There was a problem hiding this comment.
e.g. the p50 latency here is 3.31 microseconds, surely it's not connecting to Minio this quickly (although who knows)
gruuya
approved these changes
Apr 23, 2024
src/object_store/cache.rs
Outdated
| "Re-downloading cache value for {key:?}: {err}" | ||
| ); | ||
|
|
||
| self.metrics.redownload_errors.increment(1); |
Contributor
There was a problem hiding this comment.
"redownload_errors" doesn't seem appropriate here, more like "cache_file_missing", "file_read_error" or "eviction_race".
Comment on lines
573
to
577
| warn!("Invalidating cache entry for {key:?}; failed writing to a file: {err}"); | ||
| cache.invalidate(&key).await; |
Contributor
There was a problem hiding this comment.
We could also use a counter for these.
Contributor
There was a problem hiding this comment.
By these I mean counter for cache file write errors.
Contributor
Author
There was a problem hiding this comment.
Oh yeah, good idea, I missed it somehow
213dbae to
5445458
Compare
20264b4 to
aa8d012
Compare
[skip ci]
get and put now just call get/put_opts (so does get_range, but we actually override it) and the S3 object store only implements _opts, so delete the basic get/put methods to simplify things.
Use Prometheus conventions for metrics and merge some of them to make them easier to manage, also add basic latencies.
aa8d012 to
28dd821
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add a bunch of various timing / sizing metrics to the object store cache:
Limitations:
This is based on top of #518 which has some DataFusion MemoryManager metrics
Sample metrics scrape from a TPC-H/DS run: