Closed
Description
When running an embedding model via Docker Model Runner, the runtime configuration (--ubatch-size, context_size, etc.) behaves inconsistently depending on how the request is sent.
Using Docker Desktop GUI:
The model starts with the custom configuration (e.g., ubatch-size=2048).
Using curl from inside a container:
The model falls back to default values (ubatch-size=512, n_ctx=4096).
This makes it impossible to reliably control the physical batch size.
Steps to Reproduce
- docker-compose.yaml
models:
  embedding:
    model: ai/embeddinggemma:300M-Q8_0
    context_size: 2048
    runtime_flags:
      - "--ubatch-size"
      - "2048"
services:
  curl-tester:
    image: curlimages/curl:8.11.1
    command: ["sh", "-lc", "sleep 1000000"]
    models:
      embedding:
        endpoint_var: EMBEDDING_ENDPOINT
        model_var: EMBEDDING_MODEL
- Trigger a request via the Docker Desktop GUI
Observe logs:
[2025-09-23T17:19:35.420131000Z] llama_context: constructing llama_context
[2025-09-23T17:19:35.420164000Z] llama_context: n_seq_max = 1
[2025-09-23T17:19:35.420178000Z] llama_context: n_ctx = 2048
[2025-09-23T17:19:35.420190000Z] llama_context: n_ctx_per_seq = 2048
[2025-09-23T17:19:35.420200000Z] llama_context: n_batch = 2048
[2025-09-23T17:19:35.420210000Z] llama_context: n_ubatch = 2048
- Trigger a request via curl inside the container
docker exec -it tailscale-curl-tester-1 sh -lc '
  echo "EMBED:" $EMBEDDING_ENDPOINT $EMBEDDING_MODEL
  curl -sS "$EMBEDDING_ENDPOINT/embeddings" \
    -H "Authorization: Bearer dummy" \
    -H "Content-Type: application/json" \
    -d "{\"model\":\"$EMBEDDING_MODEL\",\"input\":\"hello world\"}" \
  | head -c 300; echo
'
Observe logs:
[2025-09-23T17:21:57.905751000Z] llama_context: constructing llama_context
[2025-09-23T17:21:57.905780000Z] llama_context: n_ctx = 4096
[2025-09-23T17:21:57.905798000Z] llama_context: n_ctx_per_seq = 4096
[2025-09-23T17:21:57.905807000Z] llama_context: n_batch = 2048
[2025-09-23T17:21:57.905815000Z] llama_context: n_ubatch = 512
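To compare the two runs side by side, the relevant values can be pulled out of the log lines with a small filter. This is only a sketch: the function name `extract_llama_params` is hypothetical, and it assumes the runner logs from each run have already been saved to plain text files in the `llama_context: name = value` format shown above.

```shell
# Hypothetical helper: read runner log lines on stdin and print the
# llama_context settings n_ctx, n_batch and n_ubatch as name=value pairs.
# Lines such as n_ctx_per_seq are skipped because names are matched exactly.
extract_llama_params() {
  awk '/llama_context:/ {
    for (i = 2; i < NF; i++) {
      if ($i == "=" && ($(i-1) == "n_ctx" || $(i-1) == "n_batch" || $(i-1) == "n_ubatch")) {
        print $(i-1) "=" $(i+1)
      }
    }
  }'
}
```

For example, with the GUI-triggered log saved as `gui.log` and the curl-triggered log saved as `curl.log` (both file names are assumptions), running `extract_llama_params < gui.log` and `extract_llama_params < curl.log` makes the differing `n_ctx` and `n_ubatch` values easy to see.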