-
Notifications
You must be signed in to change notification settings - Fork 6.8k
[app-server] feat: v2 Turn APIs #6216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fdbb4a0 to
01ba255
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
ae02487 to
90b3e74
Compare
90b3e74 to
a53e53c
Compare
a6d52b2 to
c04b619
Compare
| }, | ||
| #[serde(rename = "turn/start")] | ||
| #[ts(rename = "turn/start")] | ||
| TurnStart { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while we are here, how do we want to implement best of N parameters, here?
should TurnStartResponse support list of turns as response?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My gut says to let bon live a level above to avoid making a single API too complex.
bon in the most basic sense is a for loop around TurnStart. However, it may get more involved over time where you have a second layer that analyzes the turns and then kicks off another, etc and not coupling that to this API gives us more options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, I do think there's something really nice about keeping TurnStart as simple as possible and have it be a primitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apanasenko-oai do you all need first-class BoN support in app-server for codex cloud?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One issue with having BoN live outside the API is that we still need to link threads/turns. That means either calling a special API on the app-server to handle the linking, or managing the storage externally. But if we manage it externally, we end up having to handle cloud and local separately, which adds complexity.
I agree we should keep the API simple, but BoN feels like a core part of the product for both local and cloud environments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think the API should make BoN easy, but needs some design work 🤔
maybe Threads need to start looking like a Tree instead of a linear list of Turns, for example (depending on the level of visibility / control we want to give to the user).
| use tracing::warn; | ||
| use uuid::Uuid; | ||
|
|
||
| type PendingInterruptQueue = Vec<(RequestId, ApiVersion)>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since turn/interrupt (and the old API) sends a response upon consume an EventMsg::TurnAborted from codex core, we need to store the API version along with the pending interrupt so we know whether to return a V1 or a V2 response
| #[serde(rename_all = "camelCase")] | ||
| #[ts(export_to = "v2/")] | ||
| pub struct TurnInterruptParams { | ||
| pub thread_id: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a thread_id globally unique, btw?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be, it's a UUID7: https://github.com/openai/codex/blob/owen/turn_apis/codex-rs/protocol/src/conversation_id.rs#L20
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
turn IDs are not actually and I think we want them to be, otherwise we will always need (thread_id, turn_id) to uniquely identify a Turn
filed: https://linear.app/openai/issue/CLI-1432/make-turn-ids-unique
| message: format!("failed to list conversations: {err}"), | ||
| data: None, | ||
| }; | ||
| Ok(result) => result, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would handle the send_response() in here and then you can omit the return;.
| r#" | ||
| model = "mock-model" | ||
| approval_policy = "never" | ||
| sandbox_mode = "workspace-write" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read only?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, read-only works!
|
|
||
| #[tokio::test] | ||
| async fn turn_start_exec_approval_toggle_v2() -> Result<()> { | ||
| if env::var(CODEX_SANDBOX_NETWORK_DISABLED_ENV_VAR).is_ok() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skip_if_no_network!(); ?
| r#" | ||
| model = "mock-model" | ||
| approval_policy = "{approval_policy}" | ||
| sandbox_mode = "workspace-write" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it write? or is it important to test the override?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests still pass with read-only 👍
Implements:
along with their integration tests. These are relatively light wrappers around the existing core logic, and changes to core logic are minimal.
However, an improvement made for developer ergonomics:
turn/startreplaces bothSendUserMessage(no turn overrides) andSendUserTurn(can override model, approval policy, etc.)