feat: add local/ routing prefix for model IDs with embedded slashes #3165
petterreinholdtsen wants to merge 1 commit into
Some backends (e.g., gpt.uio.no running Ollama) have model IDs containing slashes, like Qwen/Qwen3.6-27B-FP8. The existing routing prefix logic would strip 'Qwen/' as a provider prefix, sending only the remainder on the wire and causing a 403 Forbidden from the server.
The new local/ prefix is an escape hatch: it strips just 'local/' and sends everything after it verbatim, preserving embedded slashes in model IDs.
Usage: --model local/Qwen/Qwen3.6-27B-FP8
This patch was developed with assistance from OpenCode and a local Qwen 3.6 API server.
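The prefix handling described above can be sketched roughly as follows. This is a minimal illustration, not the patch itself: the function name `strip_routing_prefix_sketch` is hypothetical, and the fallback branch assumes the pre-existing logic treats everything before the first `/` as a provider prefix, which may be simpler than what the real code does.

```rust
/// Hypothetical sketch of the routing-prefix handling described in this PR.
/// It is not the actual implementation.
fn strip_routing_prefix_sketch(model: &str) -> &str {
    // Escape hatch: strip only the leading "local/" and keep the rest
    // verbatim, including any further slashes in the model ID.
    if let Some(rest) = model.strip_prefix("local/") {
        return rest;
    }
    // Assumed pre-existing behaviour: the text before the first '/' is
    // taken to be a provider prefix and dropped.
    match model.split_once('/') {
        Some((_provider, rest)) => rest,
        None => model,
    }
}
```

Under these assumptions, `local/Qwen/Qwen3.6-27B-FP8` yields `Qwen/Qwen3.6-27B-FP8`, while the unprefixed `Qwen/Qwen3.6-27B-FP8` loses its first segment and yields `Qwen3.6-27B-FP8`, which is the ID the server rejects.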
I asked OpenCode to keep notes on what it was solving, and here is what it ended up with for this patch. I've removed secrets and private DNS names.
Crash: "Access denied to model" on corporate API server (routing prefix stripping) [SOLVED]
What was done
Test command
echo "Say hello." | OPENAI_BASE_URL=https://corporate.example.com/api/v1/chat/completions OPENAI_API_KEY=secret ~/src/claw-code-upstream/rust/target/debug/claw --model local/Qwen/Qwen3.6-27B-FP8
The local/ escape hatch is a clean solution for model IDs with embedded slashes. Much better than trying to guess which / is a provider separator vs part of the model name. This should also help with the broader Ollama compatibility issues in #3123.
One more test case, then it should be good for the maintainer to approve.
@code-yeongyu
    "Qwen/Qwen3.6-27B-FP8"
);
assert_eq!(super::strip_routing_prefix("local/mistralai/Mistral-Large-3"), "mistralai/Mistral-Large-3");
}
I think there should be one more case, where local/some_random_model:7b should assert to local/some_random_model:7b.
Otherwise, that looks good to me.
Nice approach. The local/ prefix convention is clean and doesn't conflict with existing provider routing. One thought: since the model ID after local/ is used as-is, servers with query-string authentication (e.g. http://localhost:1234/v1?key=xxx) could have their key leaked into model_id. Consider either stripping query params, or documenting that local/ expects just the host:port and paths, not query strings.
Summary
Some backends (e.g., gpt.uio.no running Ollama) have model IDs containing slashes like Qwen/Qwen3.6-27B-FP8. The existing routing prefix logic would strip 'Qwen/' as a provider prefix, sending only the remainder on the wire and causing 403 Forbidden from the server.
The new local/ prefix is an escape hatch: it strips just 'local/' and sends everything after it verbatim, preserving embedded slashes in model IDs.
Usage: --model local/Qwen/Qwen3.6-27B-FP8
This patch was developed with assistance from OpenCode and a local Qwen 3.6 API server.
Anti-slop triage
I do not understand these fields:
Verification
I do not understand the first two of these checklist points.
git diff --check passes.
Resolution gate
I do not understand these checklist points:
I believe this is related to #3036, but I am unsure whether it fixes it.