llama-server: fix duplicate HTTP headers in multiple models mode #17698
Note: I first tried an is_proxied flag approach, but it required more code, with the logic split between modules. Filtering at the source is simpler.
ngxson
left a comment
looks good overall! would appreciate it if you could address some small comments
I just want to confirm that this PR solves my issue, nginx errors are gone and |
- restrict scope of header after std::move - simplify header check (remove unordered_set)
nice, thanks! (merging once the CI passes)
…17698) * llama-server: fix duplicate HTTP headers in multiple models mode (ggml-org#17693) * llama-server: address review feedback from ngxson - restrict scope of header after std::move - simplify header check (remove unordered_set)
Approach: Filter at source
This patch filters headers before forwarding them to avoid duplication.
Why headers get duplicated:
When the router proxies child process responses, both the router (via
set_default_headers) and the child send the same headers (Server,
Transfer-Encoding, Keep-Alive, CORS). The proxy was forwarding everything,
resulting in duplicates.
Solution:
Skip headers that will be added by the router or httplib:
Handle Content-Type separately via msg_t.content_type to avoid duplication
when httplib calls set_chunked_content_provider() or set_content().
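The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the names `should_skip_header`, `filter_child_headers`, and `filtered_response` are hypothetical, and the skip list (including treating every `Access-Control-*` header as CORS) is an assumption based only on the headers named in this description.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Header list as (name, value) pairs; duplicates are allowed, as in HTTP.
using header_list = std::vector<std::pair<std::string, std::string>>;

// Lowercase an ASCII header name so comparisons are case-insensitive.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: true for headers the proxy should not forward,
// because the router or the HTTP library adds them again.
// The exact list is an assumption for illustration.
static bool should_skip_header(const std::string & name) {
    const std::string key = to_lower(name);
    return key == "server"
        || key == "transfer-encoding"
        || key == "keep-alive"
        || key == "content-type"                  // carried separately, see below
        || key.rfind("access-control-", 0) == 0;  // CORS headers (prefix match)
}

// Hypothetical result type: forwarded headers plus the extracted Content-Type.
struct filtered_response {
    header_list headers;
    std::string content_type;
};

// Drop headers that would be duplicated and pull Content-Type out so the
// caller can pass it on through a dedicated field instead of the header list.
static filtered_response filter_child_headers(const header_list & child_headers) {
    filtered_response out;
    for (const auto & h : child_headers) {
        if (to_lower(h.first) == "content-type") {
            out.content_type = h.second;
            continue;
        }
        if (!should_skip_header(h.first)) {
            out.headers.emplace_back(h.first, h.second);
        }
    }
    return out;
}
```

A plain chain of string comparisons is used here rather than a set, in line with the review feedback to simplify the header check; for a handful of names the difference in cost is negligible.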
Tested with:
Before: duplicate Server, Transfer-Encoding, Keep-Alive, Access-Control-Allow-Origin, Content-Type
After: all headers appear exactly once
Fixes #17693