Skip to content

Conversation

@etraut-openai
Copy link
Collaborator

@etraut-openai etraut-openai commented Nov 4, 2025

Currently, when the access token expires, we attempt to use the refresh token to acquire a new access token. This works most of the time. However, there are situations where the refresh token is expired, exhausted (already used to perform a refresh), or revoked. In those cases, the current logic treats the error as transient and attempts to retry it repeatedly.

This PR changes the token refresh logic to differentiate between permanent and transient errors. It also changes callers to treat the permanent errors as fatal rather than retrying them. And it provides better error messages to users so they understand how to address the problem. These error messages should also help us further understand why we're seeing examples of refresh token exhaustion.

Here is the error message in the CLI. The same text appears within the extension.

image

I also correct the spelling of "Re-connecting", which shouldn't have a hyphen in it.

Testing: I manually tested these code paths by adding temporary code to programmatically cause my refresh token to be exhausted (by calling the token refresh endpoint in a tight loop more than 50 times). I then simulated an access token expiration, which caused the token refresh logic to be invoked. I confirmed that the updated logic properly handled the error condition.

Note: We earlier discussed the idea of forcefully logging out the user at the point where token refresh failed. I made several attempts to do this, and all of them resulted in a bad UX. It's important to surface this error to users in a way that explains the problem and tells them that they need to log in again. We also previously discussed deleting the auth.json file when this condition is detected. That also creates problems because it effectively changes the auth status from logged in to logged out, and this causes odd failures and inconsistent UX. I think it's therefore better not to delete auth.json in this case. If the user closes the CLI or VSCE and starts it again, we properly detect that the access token is expired and the refresh token is "dead", and we force the user to go through the login flow at that time.

This should address aspects of #6191, #5679, and #5505

@etraut-openai
Copy link
Collaborator Author

@codex review

@chatgpt-codex-connector
Copy link
Contributor

Codex Review: Didn't find any major issues. Nice work!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@etraut-openai etraut-openai marked this pull request as ready for review November 4, 2025 23:41
RefreshTokenError::Permanent(failed) => {
StreamAttemptError::Fatal(CodexErr::RefreshTokenFailed(failed))
}
RefreshTokenError::Transient(other) => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we retry these in manager.refresh_token()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we want to implement retry logic in multiple places. To do it properly, you need to apply backoff, have a hard-coded or configurable limit, etc. All of this logic is already in place in stream_responses. Unless there's a good reason to replicate it, let's leverage the existing retry logic.

}

fn classify_refresh_token_failure(body: &str) -> RefreshTokenFailedError {
let code = extract_refresh_token_error_code(body);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we extend try_parse_error_message to also return code? seem to be parsing the same style of json object.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function covers shapes that try_parse_error_message doesn't handle.

Copy link
Collaborator

@pakrym-oai pakrym-oai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add tests for any of this?

@etraut-openai etraut-openai merged commit c4ebe4b into main Nov 5, 2025
25 checks passed
@etraut-openai etraut-openai deleted the etraut/refresh_logout branch November 5, 2025 18:52
@github-actions github-actions bot locked and limited conversation to collaborators Nov 5, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants