feat: add local/ routing prefix for model IDs with embedded slashes #3165

Open
petterreinholdtsen wants to merge 1 commit into ultraworkers:main from petterreinholdtsen:local-api-routing

Conversation

@petterreinholdtsen

Summary

Some backends (e.g., gpt.uio.no running Ollama) have model IDs containing slashes like Qwen/Qwen3.6-27B-FP8. The existing routing prefix logic would strip 'Qwen/' as a provider prefix, sending only the remainder on the wire and causing 403 Forbidden from the server.

The new local/ prefix is an escape hatch: it strips just 'local/' and sends everything after it verbatim, preserving embedded slashes in model IDs.

Usage: --model local/Qwen/Qwen3.6-27B-FP8

This patch was developed with assistance from OpenCode and a local Qwen 3.6 API server.

Anti-slop triage

I do not understand these fields:

  • Classification:
  • Evidence:
  • Non-destructive review result:

Verification

I do not understand the first two of these checklist points.

  • Targeted tests/docs checks ran, or the gap is explicitly recorded.
  • git diff --check passes.
  • No live secrets, tokens, private logs, or unrelated generated churn are included.

Resolution gate

I do not understand these checklist points:

  • If this PR resolves an issue, the issue number and fix evidence are linked.
  • If this PR should not merge, the rejection/defer rationale is evidence-backed and does not rely on vibes.
  • I did not merge/close remote PRs or issues from an automation lane without owner approval.

I believe this is related to #3036, but I am unsure whether it fixes it.

@petterreinholdtsen
Author

I asked opencode to keep notes on what it was solving, and here is what it ended up with for this patch. I've removed secrets and private DNS names.

Crash: "Access denied to model" on corporate API server (routing prefix stripping) [SOLVED]

  • Error: api returned 403 Forbidden: Access denied to model: Qwen3.6-27B-FP8
  • Root cause: Corporate API server has model IDs containing slashes like Qwen/Qwen3.6-27B-FP8. The existing routing prefix logic in wire_model_for_base_url() stripped Qwen/ as a known provider prefix, sending only Qwen3.6-27B-FP8 on the wire — which the server doesn't recognize (403). Additionally, validate_model_syntax() rejected 3+ parts when split by /, so even if you tried to work around it, the CLI would refuse the model name.
  • Key files:
    • rust/crates/api/src/providers/openai_compat.rs:920-933: strip_routing_prefix() now accepts "local" as a routing prefix
    • rust/crates/api/src/providers/openai_compat.rs:964-975: wire_model_for_base_url() handles the "local/" escape hatch
    • rust/crates/rusty-claude-cli/src/main.rs:1654-1658: validate_model_syntax() now allows embedded slashes for the local/ prefix

What was done

  • Added "local" to the routing prefixes in both strip_routing_prefix() and wire_model_for_base_url(). The local/ prefix is an escape hatch: it strips just local/ and sends everything after it verbatim on the wire, preserving embedded slashes in model IDs.
  • Updated validate_model_syntax() to allow more than 2 parts when split by / for models starting with local/, so local/Qwen/Qwen3.6-27B-FP8 passes validation.
  • Use: --model local/Qwen/Qwen3.6-27B-FP8 sends the full ID on the wire.
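
To illustrate the behavior described above, here is a minimal standalone sketch. It is not the code from the patch: the function bodies, the `ROUTING_PREFIXES` list and its contents, the case-insensitive prefix match, and the error messages are all assumptions made for illustration. Only the function names and the intended behavior (strip just `local/`, allow more than two slash-separated parts after `local/`) come from the notes.

```rust
/// Hypothetical list of known provider routing prefixes (assumption; the
/// real list lives in openai_compat.rs and is not reproduced here).
const ROUTING_PREFIXES: &[&str] = &["openai", "qwen"];

/// Sketch: return the model ID that would be sent on the wire.
fn strip_routing_prefix(model: &str) -> &str {
    // Escape hatch: remove only the leading "local/" and keep everything
    // after it verbatim, including any embedded slashes.
    if let Some(rest) = model.strip_prefix("local/") {
        return rest;
    }
    // Otherwise strip a single known provider prefix, if present.
    // The case-insensitive comparison is an assumption of this sketch.
    match model.split_once('/') {
        Some((prefix, rest))
            if ROUTING_PREFIXES.iter().any(|p| p.eq_ignore_ascii_case(prefix)) =>
        {
            rest
        }
        _ => model,
    }
}

/// Sketch: reject model names with more than two slash-separated parts,
/// unless they start with the "local/" escape hatch.
fn validate_model_syntax(model: &str) -> Result<(), String> {
    if model.trim().is_empty() {
        return Err("model name must not be empty".to_string());
    }
    let parts = model.split('/').count();
    if parts > 2 && !model.starts_with("local/") {
        return Err(format!("invalid model syntax: {model}"));
    }
    Ok(())
}
```

Under these assumptions, `strip_routing_prefix("local/Qwen/Qwen3.6-27B-FP8")` yields `Qwen/Qwen3.6-27B-FP8`, while the unprefixed `Qwen/Qwen3.6-27B-FP8` loses its `Qwen/` part, which is the failure mode the notes describe.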

Test command

echo "Say hello." | OPENAI_BASE_URL=https://corporate.example.com/api/v1/chat/completions OPENAI_API_KEY=secret ~/src/claw-code-upstream/rust/target/debug/claw --model local/Qwen/Qwen3.6-27B-FP8

@1716775457damn

The local/ escape hatch is a clean solution for model IDs with embedded slashes. Much better than trying to guess which / is a provider separator vs part of the model name. This should also help with the broader Ollama compatibility issues in #3123.


@Noobzik Noobzik left a comment


One more test case, then it should be good for the maintainer to approve.
@code-yeongyu

"Qwen/Qwen3.6-27B-FP8"
);
assert_eq!(super::strip_routing_prefix("local/mistralai/Mistral-Large-3"), "mistralai/Mistral-Large-3");
}

I think there should be one more case: local/some_random_model:7b, which should assert to local/some_random_model:7b.

Otherwise, that looks good to me.

@1716775457damn

Nice approach. The local/ prefix convention is clean and doesn't conflict with existing provider routing. One thought: since the model ID after local/ is used as-is, servers with query-string authentication (e.g. http://localhost:1234/v1?key=xxx) could have their key leaked into model_id. Consider either stripping query params, or documenting that local/ expects just the host:port and paths, not query strings.

