Skip to content
This repository was archived by the owner on Sep 30, 2024. It is now read-only.

Refresh SAMS access tokens as needed#62869

Merged
vdavid merged 5 commits intomainfrom
chrsmith/refresh-sams-token
May 24, 2024
Merged

Refresh SAMS access tokens as needed#62869
vdavid merged 5 commits intomainfrom
chrsmith/refresh-sams-token

Conversation

@chrsmith
Copy link
Contributor

The ssc.APIProxyHandler is an HTTP handler which proxies HTTP requests from Sourcegraph.com to the SSC backend, but replacing "Sourcegraph.com" credentials from the request with the user's "SAMS access token".

However, it's possible (extremely likely, actually) that the SAMS access token associated with the Sourcegraph user account has expired. Because a user's browser session will be valid for XX days, while the SAMS access token generated on login is only valid for YY minutes.

This PR updates the ssc.APIProxyHandler so that if it sees that a user's SAMS access token has expired, it will try to generate a new OAuth access token for the user via their SAMS refresh token.

ℹ️ The SAMS refresh token may also expire, but we don't actually know when. So it's possible that refreshing the user's access token will fail, and the SSC API Proxy will still return a 401 status code.

Meaning that even with this PR, the SSC UI will need to have some sort of "your credentials have expired, please logout and login again" error experience. Although with these changes, that should be far less frequent.

This PR just uses the existing support we have for refreshing OAuth access tokens (oauthtoken.GetAccountRefreshAndStoreOAuthTokenFunc), that is used for how the Sourcegraph instance authenticates with 3rd party identity providers (e.g. internal/authz/providers/github). So 🤞 this is everything we need?

Results

Testing locally it seemed to work as expected. (As if there were any doubt!) But looking at the logs, there definitely is (an unavoidable?) data race in refreshing the token.

The frontend send 3x HTTP requests in rapid success, all 3 of which went through the SAMS OAuth refresh process. (So while this is "a little slow", it doesn't appear to be a problem.)

However, this would be a problem if SAMS were to implement more aggressive logic around the reuse of OAuth refresh tokens. e.g. it could choose to only honor an OAuth refresh token once, and return a 400 if a user tries to use the same refresh token a second time.

The implementation has some logic to account for this, but there may be some hypothetical scenario where it would surface as a 500 error. (But given that this isn't the case for SAMS, I'm inclined to not worry about it.)

[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:235 proxying SSC API request {"url": "/.api/ssc/proxy/team/current/subscription"}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:235 proxying SSC API request {"url": "/.api/ssc/proxy/team/current/subscription/summary"}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:235 proxying SSC API request {"url": "/.api/ssc/proxy/team/current/members"}
[       frontend] WARN Handler.SSC Proxy ssc/ssc_proxy.go:265 the user's SAMS token has expired {"expiry": 1715989719041120000}
[       frontend] WARN Handler.SSC Proxy ssc/ssc_proxy.go:265 the user's SAMS token has expired {"expiry": 1715989719041120000}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:235 proxying SSC API request {"url": "/.api/ssc/proxy/team/current/invites"}
[       frontend] WARN Handler.SSC Proxy ssc/ssc_proxy.go:265 the user's SAMS token has expired {"expiry": 1715989719041120000}
[       frontend] WARN Handler.SSC Proxy ssc/ssc_proxy.go:265 the user's SAMS token has expired {"expiry": 1715989719041120000}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:229 refresh user's SAMS token {"new expiration": 1716413928649510000}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:101 Building proxy request {"method": "GET", "url": "http://localhost:19992/cody/api/v1/team/current/subscription"}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:229 refresh user's SAMS token {"new expiration": 1716413928649711000}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:101 Building proxy request {"method": "GET", "url": "http://localhost:19992/cody/api/v1/team/current/invites"}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:229 refresh user's SAMS token {"new expiration": 1716413928651633000}
[       frontend] INFO Handler.SSC Proxy ssc/ssc_proxy.go:101 Building proxy request {"method": "GET", "url": "http://localhost:19992/cody/api/v1/team/current/members"}

Test plan

Tested manually.

I didn't add unit tests for the new functionality, since it would amount to a whole lot of mocking for not much benefit.

@chrsmith chrsmith requested review from a team, bobheadxi and unknwon May 22, 2024 21:50
@cla-bot cla-bot bot added the cla-signed label May 22, 2024
Comment on lines 232 to 238
Copy link
Member

@bobheadxi bobheadxi May 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we get Sentry for MSP but I don't think we have the same Sentry enabled for dotcom, as it looks very quiet: https://sourcegraph.sentry.io/projects/dot-com-backend/?project=5596888

If you want to be notified on this, you have to monitor logs directly, or we can look into re-enabling Sentry for dotcom. If we re-enable Sentry I'd like to be explicit about allowlisting only errors from certain logger scopes, as IIRC it was disabled because it was super noisy and nobody ever bothered to look - in which case, maybe you can use samsAuthenticator.logger which has a scope so that we can leave this door open?

@unknwon
Copy link
Contributor

unknwon commented May 22, 2024

This PR just uses the existing support we have for refreshing OAuth access tokens (oauthtoken.GetAccountRefreshAndStoreOAuthTokenFunc

If you care at all 😃 I wrote initial version of this helper hahaha https://github.com/sourcegraph/sourcegraph/pull/39840, now come in full circle.

The frontend send 3x HTTP requests in rapid success, all 3 of which went through the SAMS OAuth refresh process. (So while this is "a little slow", it doesn't appear to be a problem.)

Ideally, we do want a single-flight guarantee for refreshing OAuth tokens.

However, this would be a problem if SAMS were to implement more aggressive logic around the reuse of OAuth refresh tokens. e.g. it could choose to only honor an OAuth refresh token once, and return a 400 if a user tries to use the same refresh token a second time.

It does... 😂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we be using one of the preconfigured clients that we have, e.g. httpcli.UncachedExternalDoer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review and comment, @bobheadxi!
I need a bit of clarification, as Chris is out till Wednesday, and I don't understand the severity of the comment. Is it
a.) a functional issue (probably not as otherwise Joe probably wouldn't have approved the PR),
b.) a performance issue (then it might be worth it to fix it before merging), or
c.) more like a tech debt / DRY / elegance thing?

Based on the answer, what do you reckon is the best call here?
a.) We (likely Chris) do this as a follow-up. I can make sure we have a ticket so it's not forgotten.
b.) If you have time, you or Joe pair with me tomorrow to update this or at least give me some pointers, and then I test it again. It would probably take me an unreasonable amount of time to figure this out on my own.
c.) Something else?

Thanks :)

Copy link
Member

@bobheadxi bobheadxi May 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look up the symbol I mentioned in my comment, you'll see it's just a shared internal library for standardized HTTP clients with some preconfigured o11y, timeouts, retry handling, etc. The suggestion is to just do a variable replacement to replace the http.DefaultClient here with one of those, so that this code can benefit from all that stuff as well :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks, I can definitely do that :)

@vdavid vdavid force-pushed the chrsmith/refresh-sams-token branch from 32e0230 to ae01e8e Compare May 24, 2024 07:21
@vdavid vdavid enabled auto-merge (squash) May 24, 2024 07:23
@vdavid vdavid disabled auto-merge May 24, 2024 07:23
@vdavid vdavid enabled auto-merge (squash) May 24, 2024 07:24
@vdavid vdavid merged commit 38e8499 into main May 24, 2024
@vdavid vdavid deleted the chrsmith/refresh-sams-token branch May 24, 2024 07:33
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants