Conversation
There was a problem hiding this comment.
FYI we get Sentry for MSP but I don't think we have the same Sentry enabled for dotcom, as it looks very quiet: https://sourcegraph.sentry.io/projects/dot-com-backend/?project=5596888
If you want to be notified on this, you have to monitor logs directly, or we can look into re-enabling Sentry for dotcom. If we re-enable Sentry I'd like to be explicit about allowlisting only errors from certain logger scopes, as IIRC it was disabled because it was super noisy and nobody ever bothered to look - in which case, maybe you can use samsAuthenticator.logger which has a scope so that we can leave this door open?
If you care at all 😃 I wrote initial version of this helper hahaha https://github.com/sourcegraph/sourcegraph/pull/39840, now come in full circle.
Ideally, we do want a single-flight guarantee for refreshing OAuth tokens.
It does... 😂 |
There was a problem hiding this comment.
Should we be using one of the preconfigured clients that we have, e.g. httpcli.UncachedExternalDoer?
There was a problem hiding this comment.
Thanks for the review and comment, @bobheadxi!
I need a bit of clarification, as Chris is out till Wednesday, and I don't understand the severity of the comment. Is it
a.) a functional issue (probably not as otherwise Joe probably wouldn't have approved the PR),
b.) a performance issue (then it might be worth it to fix it before merging), or
c.) more like a tech debt / DRY / elegance thing?
Based on the answer, what do you reckon is the best call here?
a.) We (likely Chris) do this as a follow-up. I can make sure we have a ticket so it's not forgotten.
b.) If you have time, you or Joe pair with me tomorrow to update this or at least give me some pointers, and then I test it again. It would probably take me an unreasonable amount of time to figure this out on my own.
c.) Something else?
Thanks :)
There was a problem hiding this comment.
If you look up the symbol I mentioned in my comment, you'll see it's just a shared internal library for standardized HTTP clients with some preconfigured o11y, timeouts, retry handling, etc. The suggestion is to just do a variable replacement to replace the http.DefaultClient here with one of those, so that this code can benefit from all that stuff as well :)
There was a problem hiding this comment.
Ah, thanks, I can definitely do that :)
Co-authored-by: Robert Lin <robert@bobheadxi.dev>
32e0230 to
ae01e8e
Compare
The
ssc.APIProxyHandleris an HTTP handler which proxies HTTP requests from Sourcegraph.com to the SSC backend, but replacing "Sourcegraph.com" credentials from the request with the user's "SAMS access token".However, it's possible (extremely likely, actually) that the SAMS access token associated with the Sourcegraph user account has expired. Because a user's browser session will be valid for XX days, while the SAMS access token generated on login is only valid for YY minutes.
This PR updates the
ssc.APIProxyHandlerso that if it sees that a user's SAMS access token has expired, it will try to generate a new OAuth access token for the user via their SAMS refresh token.This PR just uses the existing support we have for refreshing OAuth access tokens (
oauthtoken.GetAccountRefreshAndStoreOAuthTokenFunc), that is used for how the Sourcegraph instance authenticates with 3rd party identity providers (e.g.internal/authz/providers/github). So 🤞 this is everything we need?Results
Testing locally it seemed to work as expected. (As if there were any doubt!) But looking at the logs, there definitely is (an unavoidable?) data race in refreshing the token.
The frontend send 3x HTTP requests in rapid success, all 3 of which went through the SAMS OAuth refresh process. (So while this is "a little slow", it doesn't appear to be a problem.)
However, this would be a problem if SAMS were to implement more aggressive logic around the reuse of OAuth refresh tokens. e.g. it could choose to only honor an OAuth refresh token once, and return a 400 if a user tries to use the same refresh token a second time.
The implementation has some logic to account for this, but there may be some hypothetical scenario where it would surface as a 500 error. (But given that this isn't the case for SAMS, I'm inclined to not worry about it.)
Test plan
Tested manually.
I didn't add unit tests for the new functionality, since it would amount to a whole lot of mocking for not much benefit.