Feat/llm verify endpoint#3410
Conversation
Sends a minimal one-token completion to confirm credentials and connectivity work end-to-end. Reuses _prepare_completion_params so the verify call path exercises the same provider-aware option normalization (Azure max_tokens rename, OpenRouter headers, extended-thinking budgets, etc.) as a real completion, but deliberately bypasses both the retry decorator and the fallback strategy so the first real provider error is surfaced as a typed SDK exception (LLMAuthenticationError, LLMRateLimitError, LLMTimeoutError, LLMServiceUnavailableError, LLMBadRequestError) via map_provider_exception. Why bypass retry/fallback for verify: - Retry: verify should fail fast. Retrying a 401 wastes the user's budget and delays the inevitable auth error. - Fallback: if the user is verifying model X, they want to know if X works. Silently substituting model Y would defeat the entire point of 'verify before save'. The companion agent-server endpoint POST /llm/verify is added in a separate commit. Co-authored-by: openhands <openhands@all-hands.dev>
Exposes LLM.averify over HTTP so clients (notably the agent-canvas frontend) can delegate credential verification to the agent server rather than reimplementing provider-specific routing in the browser. Wire shape: VerifyLLMRequest accepts the LLM fields the settings UI exposes (model, api_key, base_url, api_version, plus the AWS fields for Bedrock). VerifyLLMResponse carries a discriminated status and an optional human-readable message + inferred provider name. Status values: - success: probe succeeded; credentials work. - auth_error: credentials rejected; client should block save. - rate_limited: provider returned 429; credentials are valid. - timeout: request timed out; warn but allow save. - unreachable: network/DNS/5xx; warn but allow save. - bad_request: credentials valid but config (e.g. model name) is wrong. - unknown_error: catch-all so the endpoint never raises. The endpoint always returns HTTP 200 — clients branch on the status field rather than wrapping the request in HTTP error handling. This keeps the frontend state machine to a single switch statement. Why move verify server-side: - All LiteLLM-supported providers become verifiable, including Bedrock with SigV4/IAM and Azure with api_version (both impossible from a browser today). - The verify call path is identical to the conversation-time call path, so 'verified' really means 'the agent will be able to use this'. - No CORS proxy / browser-side provider routing to maintain as LiteLLM adds providers. Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED |
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new LLM verify API and SDK probe end-to-end with a live agent-server and a local OpenAI-compatible provider.
Does this PR achieve its stated goal?
Yes. The PR adds the missing LLM verification capability: on the base branch, POST /api/llm/verify returned 404 and LLM had no verify() method; on the PR commit, the same real HTTP path returned status=success, invalid credentials returned structured status=auth_error with HTTP 200, and both LLM.verify() and LLM.averify() completed successfully. The local provider logs confirm the agent server made real LiteLLM chat-completion probes using the unmasked API key and max_completion_tokens: 1.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed and installed the repo packages. |
| CI Status | ⏳ Observed 13 passing, 0 failing, 13 pending checks at QA time. |
| Functional Verification | ✅ Base-vs-PR behavior verified through actual HTTP and SDK calls. |
Functional Verification
Test 1: New /api/llm/verify endpoint exists and performs a real provider probe
Step 1 — Establish baseline without the PR:
Checked out origin/main, synced dependencies, started the real agent server, then ran:
curl -sS -i -X POST http://127.0.0.1:18080/api/llm/verify -H 'Content-Type: application/json' -d '{"model":"openai/qa-model","api_key":"sk-valid","base_url":"http://127.0.0.1:18081/v1"}'Observed:
HTTP/1.1 404 Not Found
content-type: application/json
{"detail":"Not Found"}
This confirms the endpoint did not exist on the base branch.
Step 2 — Apply the PR changes:
Checked out a326ca2bb5ae9b4ce2e3a3a79a7e4ae46bb22b01, synced dependencies, started the agent server, and started a local OpenAI-compatible stub at http://127.0.0.1:18081/v1/chat/completions.
Step 3 — Re-run with the PR in place:
Ran the same request with valid credentials:
HTTP/1.1 200 OK
content-type: application/json
{"status":"success","message":null,"provider":"openai"}
Then ran the same request with an invalid key:
HTTP/1.1 200 OK
content-type: application/json
{"status":"auth_error","message":"litellm.BadRequestError: OpenAIException - invalid api key","provider":"openai"}
The mock provider observed the actual requests made by the server:
{"path":"/v1/chat/completions","authorization":"Bearer sk-valid","payload":{"messages":[{"content":[{"type":"text","text":"hi"}],"role":"user"}],"model":"qa-model","max_completion_tokens":1}}
{"path":"/v1/chat/completions","authorization":"Bearer sk-invalid","payload":{"messages":[{"content":[{"type":"text","text":"hi"}],"role":"user"}],"model":"qa-model","max_completion_tokens":1}}This shows the endpoint works end-to-end through the real agent-server → SDK → LiteLLM → provider HTTP path, including success and structured auth failure behavior.
Test 2: SDK LLM.verify() / LLM.averify() are available and exercise the provider path
Step 1 — Establish baseline without the PR:
On origin/main, ran:
from pydantic import SecretStr
from openhands.sdk.llm import LLM
llm = LLM(model='openai/qa-model', api_key=SecretStr('sk-valid'), base_url='http://127.0.0.1:18081/v1')
print('has_verify=', hasattr(llm, 'verify'))
try:
llm.verify()
except Exception as exc:
print(type(exc).__name__ + ':', str(exc)[:160])Observed:
has_verify= False
AttributeError: 'LLM' object has no attribute 'verify'
This confirms the SDK method did not exist on the base branch.
Step 2 — Apply the PR changes:
Checked out a326ca2bb5ae9b4ce2e3a3a79a7e4ae46bb22b01 with the same local provider stub running.
Step 3 — Re-run with the PR in place:
Ran:
import asyncio
from pydantic import SecretStr
from openhands.sdk.llm import LLM
llm = LLM(model='openai/qa-model', api_key=SecretStr('sk-valid'), base_url='http://127.0.0.1:18081/v1')
print('has_verify=', hasattr(llm, 'verify'))
print('sync_verify_result=', llm.verify())
print('async_verify_result=', asyncio.run(llm.averify()))Observed:
has_verify= True
sync_verify_result= None
async_verify_result= None
This confirms both SDK entry points are usable and complete successfully against a real OpenAI-compatible HTTP provider.
Test 3: Request validation remains a normal API contract error
With the PR server running, submitted an empty request body:
curl -sS -i -X POST http://127.0.0.1:18080/api/llm/verify -H 'Content-Type: application/json' -d '{}'Observed:
HTTP/1.1 422 Unprocessable Content
content-type: application/json
{"detail":[{"type":"missing","loc":["body","model"],"msg":"Field required","input":{}}]}
This confirms malformed client input still fails as an API contract error instead of being confused with provider verification status.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Code Review: POST /llm/verify endpoint
Solid implementation overall. The core design decisions are correct: always-HTTP-200 with a discriminated status field makes for a clean client-side decision tree, the bypass of retry and fallback strategy is intentional and well-tested, and SecretStr fields with extra="ignore" give a forward-compatible, safe API surface. Test coverage is thorough.
Two items worth addressing before merging, plus a couple of minor suggestions:
- Verify timeout: the probe inherits the LLM's 300 s default — too long for a user waiting on a credential-check button. Either hardcode a short timeout on the
LLMconstructed here, or expose it viaVerifyLLMRequest. See inline comment. - Provider error messages:
str(exc)is returned verbatim to the client; some providers embed raw credential fragments in error bodies. Worth a quick audit of what LiteLLM surfaces. See inline comment.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
Coverage Report •
|
||||||||||||||||||||||||||||||
enyst
left a comment
There was a problem hiding this comment.
Sorry to post on a draft, just would like to point out a detail quick:
"Mirrors the LLM fields the settings UI exposes."
Maybe we can do differently here, like, use the full object, the full profile, or an existing object, instead of a bucket of special case settings?
I think what the agent did at this moment would be pretty bad design, sorry. The UI can change; and, it should not be the only UI we write a REST API for. (?!)
I believe we should not want it to be hardcoded - other people can create UIs and use this API. I would prefer to not limit UIs and customizability. I know technically this isn't a "limit", but it is: for agents it's a clear signal to dump partial settings everywhere.
Can we verify the LLM or profile that the user has?
@enyst , great call out. Will be fixed in a sec. |
Drop the hand-maintained VerifyLLMRequest subset and take the SDK's LLM model as the request body. Removes duplication with LLM, removes the field-by-field rebind (and the SecretStr-masking pitfall it papered over), and lets the endpoint grow new credential fields automatically whenever LLM does. Also fix a pre-existing bug in _sanitize_validation_errors that this refactor surfaces: Pydantic's value_error entries from @model_validator hooks include the original ValueError under ctx['error'], which is not JSON-serializable. Without jsonable_encoder the 422 handler itself raises a TypeError. The new test exercises that path directly. Tests: - Add parametrized AWS SecretStr passthrough test (review feedback) - Inline the SecretStr capture in the api_key passthrough test now that LLM always normalises to SecretStr via _validate_secrets - Add regression test for non-JSON-serializable ctx in the sanitizer Co-authored-by: openhands <openhands@all-hands.dev>
Addresses three open review threads on POST /llm/verify: 1. **Timeout cap.** Wrap llm.averify() in asyncio.wait_for with a hard 30 s ceiling (taken as min with any smaller caller-supplied timeout). The SDK's 300 s LLM.timeout default would otherwise park an interactive UI on a five-minute hang when a provider is unreachable. Maps asyncio.TimeoutError to VerifyLLMStatus.TIMEOUT. 2. **Error message sanitization.** New _sanitize_error_message scrubs any SecretStr values present on the constructed LLM (api_key plus the three AWS SecretStr fields) from the exception's str() representation, then truncates to 512 chars. Defends against providers that echo Authorization headers / query-string keys in error bodies, and bounds the response size against pathological HTML error pages. 3. **Telemetry comment in llm.py.** Explain that _telemetry_ctx is intentionally discarded in _prepare_verify_call so a future reader doesn't 'fix' it back to wiring telemetry callbacks for what is not an agent step. Tests cover all three behaviors: timeout enforcement, caller-supplied timeout honored when smaller, api_key scrubbing from error messages, and 512-char truncation with the ellipsis sentinel. Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
❌ QA Report: FAIL
The new /api/llm/verify endpoint is present and handles some error paths, but it falsely rejects a real valid LLM configuration that can complete normally.
Does this PR achieve its stated goal?
Partially / No for the core success path. The PR sets out to add an LLM verify endpoint and SDK verify probe; I confirmed the endpoint did not exist on the base commit and exists on this PR. However, with the real configured LLM_MODEL / LLM_API_KEY / LLM_BASE_URL, the verify endpoint returned status=bad_request while a normal SDK completion using the exact same credentials succeeded and returned OK, so the feature can tell users valid credentials are invalid.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and installed the workspace packages |
| CI Status | 🟡 All listed checks were successful except qa-changes, which was still in progress; cleanup-on-approval was skipped |
| Functional Verification | ❌ Real API + SDK execution found a false-negative verify result for a valid LLM config |
Functional Verification
Test 1: Baseline before this PR
Step 1 — Establish baseline without the feature:
Ran git checkout c6347949c4dacbdf9db364fc902e2be216599747, started the agent server with uv run python -m openhands.agent_server --port 8765, then called:
curl -X POST http://127.0.0.1:8765/api/llm/verify \
-H 'Content-Type: application/json' \
-d '{"model":"openai/gpt-4o","api_key":"sk-invalid"}'Observed:
HTTP_STATUS:404
{"detail":"Not Found"}
Also ran a short SDK import check and observed:
has_verify False
has_averify False
This establishes the baseline: the endpoint and SDK methods were not available before the PR.
Step 2 — Apply the PR changes:
Checked out feat/llm-verify-endpoint at f0af2c3b1e8c922af1a935d9f5ee617d57388aa5.
Test 2: PR endpoint behavior with validation and bad credentials
Started the PR agent server with uv run python -m openhands.agent_server --port 8765 and called GET /server_info; it returned HTTP 200 with the OpenHands Agent Server metadata.
Then called the new verify endpoint with a missing model:
curl -X POST http://127.0.0.1:8765/api/llm/verify \
-H 'Content-Type: application/json' \
-d '{"api_key":"sk-qa-secret-should-redact"}'Observed:
HTTP_STATUS:422
{"detail":[{"type":"value_error","loc":["body"],"msg":"Value error, model must be specified in LLM","input":{"api_key":"<redacted>"},"ctx":{"error":{}}}]}
This shows malformed input stays a JSON 422 and redacts the submitted API key instead of crashing or echoing the secret.
Then called the endpoint with the configured model/base URL but an intentionally invalid key:
curl -X POST http://127.0.0.1:8765/api/llm/verify \
-H 'Content-Type: application/json' \
--data-binary @/tmp/oh_invalid_payload.jsonObserved:
HTTP_STATUS:200
{"status":"auth_error","message":"litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed...","provider":"litellm_proxy"}
This shows the endpoint performs a real provider/proxy request and maps invalid credentials to structured HTTP 200 auth_error.
Test 3: PR false-negative on valid credentials
With the real configured LLM_MODEL, LLM_API_KEY, and LLM_BASE_URL, called the new endpoint as a real user would when verifying saved credentials:
curl -X POST http://127.0.0.1:8765/api/llm/verify \
-H 'Content-Type: application/json' \
--data-binary @/tmp/oh_valid_payload.jsonObserved:
HTTP_STATUS:200
{"status":"bad_request","message":"litellm.BadRequestError: Error code: 400 - {'error': {'message': \"litellm.BadRequestError: OpenAIException - Could not finish the message because max_tokens or model output limit was reached. Please try again with higher max_tokens...","provider":"litellm_proxy"}
To confirm the credentials/model/base URL were actually usable, I ran a normal SDK completion with the exact same environment-provided configuration and a realistic output budget:
uv run python -c "... llm.completion(..., max_completion_tokens=256) ..."Observed:
completion_succeeded True
assistant_text 'OK'
completion_tokens 16
I also exercised the new SDK method directly with the same valid config:
uv run python -c "... llm.verify() ..."Observed:
has_verify True has_averify True
verify_result exception
LLMBadRequestError
litellm.BadRequestError: Error code: 400 - {'error': {'message': "litellm.BadRequestError: OpenAIException - Could not finish the message because max_tokens or model output limit was reached...
Interpretation: the PR adds the endpoint/method, but the verify probe itself is too constrained for this valid reasoning-model configuration. The same credentials successfully complete normally, so returning bad_request from verify is a false negative.
Issues Found
- 🟠 Issue: Valid configured credentials/model/base URL were rejected by
/api/llm/verifyandLLM.verify()because the one-token probe hits a provider output-limit error, even though normal SDK completion with the same config succeeds.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Solid implementation overall. The POST /llm/verify design is well-considered: HTTP-200-always with a discriminated status enum is exactly the right contract for an interactive UI, and the security controls (timeout cap, credential scrubbing, error truncation) are all present. The bypass of retry and fallback is correctly motivated and tested. The jsonable_encoder fix for 422 responses is clean and targeted. Two issues worth addressing before merge:
_sanitize_error_messageduplicatesLLM_SECRET_FIELDS— the constant already declared as a single source of truth inllm.py(line 148) should be used here, not re-typed.infer_litellm_providersits outside thetryblock — if it raises on an unusual model string, the endpoint returns an unhandled 500 instead of a structured response, silently violating the documented HTTP-200 contract.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
|
@OpenHands fix failing sdk tests. |
|
I'm on it! chuckbutkus can track my progress at all-hands.dev |
…tion_tokens The verify probe was updated to use max_completion_tokens=1024 (instead of 1) for better compatibility with models that require a minimum token budget. Align the test assertion, test name, and docstring to match. Co-authored-by: openhands <openhands@all-hands.dev>
|
Fixed in c534f76. The failing test was This comment was created by an AI agent (OpenHands) on behalf of the user. |
|
OpenHands encountered an error: Request timeout after 30 seconds to https://kqxtrzjmjhxkkuaz.prod-runtime.all-hands.dev/api/conversations/c7177cce-6a99-4009-9625-735fa15157d5/ask_agent See the conversation for more information. |
|
@OpenHands address the unresolved all-hands-bot comments. |
|
I'm on it! chuckbutkus can track my progress at all-hands.dev |
- Use LLM_SECRET_FIELDS instead of hardcoded tuple in _sanitize_error_message so any future secret field added to LLM is automatically scrubbed - Move infer_litellm_provider inside the try block so a failure there returns a structured VerifyLLMResponse (HTTP 200) instead of a 500 - Add parametrized test for AWS credential scrubbing in error messages, covering aws_access_key_id, aws_secret_access_key, and aws_session_token Co-authored-by: openhands <openhands@all-hands.dev>
Summary of ChangesAll three unresolved Checklist
|
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable — ready with two fixes before merge
This PR introduces a POST /llm/verify endpoint that accepts a full LLM config, fires a minimal one-token probe through averify(), and returns a discriminated-status HTTP 200 response. The overall architecture is sound: accepting LLM directly as the request body exercises the same code path as real conversation creation, the timeout cap and secret-scrubbing layers are well thought-out, and the test coverage is comprehensive. Previous review rounds caught and addressed the major issues (timeout cap, secret sanitization, hardcoded field list, infer_litellm_provider inside the try block). Two issues remain before this is merge-ready.
[CRITICAL ISSUES]
timeout=0 is silently promoted to _VERIFY_TIMEOUT_S
llm.timeout or _VERIFY_TIMEOUT_S uses Python's falsy check, so a caller passing "timeout": 0 gets a 30-second window instead of the intended immediate timeout. Fix with an explicit None guard:
timeout = min(
llm.timeout if llm.timeout is not None else _VERIFY_TIMEOUT_S,
_VERIFY_TIMEOUT_S,
)logger.exception emits the raw exception for unmapped errors
In the UNKNOWN_ERROR branch, logger.exception(...) has implicit exc_info=True, so the raw exception string — which may include provider-echoed content — lands in server logs unsanitized. The client gets the scrubbed message; the log does not. Risk is low since unmapped exceptions are rare, but for consistency with the rest of the hardening, use the already-scrubbed value:
logger.error("llm.verify failed with an unmapped exception: %s", message)[TESTING GAPS]
The PR description has no evidence section. Before removing the draft flag, add:
- The exact command used to exercise the verify endpoint against a real provider (e.g. local Ollama or a live OpenAI key)
- The resulting response body showing
"status": "success"(or"rate_limited", which counts as valid credentials)
Unit tests confirm the wiring is correct; they cannot substitute for proof that the real LiteLLM call round-trips to a live provider. This matters especially given that the previous review found a false-negative with the 1-token cap on certain proxies — confirming that max_completion_tokens=1024 works end-to-end with at least one real provider is the minimum bar.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
New public endpoint on the agent server that accepts credentials and makes outbound network calls. The always-200 contract, timeout cap, and secret-sanitization layers are all in place. The two issues above are low-probability in practice (callers won't send timeout=0; unmapped exceptions are rare), but should be fixed before merging given the security sensitivity of the endpoint's purpose.
VERDICT:
❌ Needs rework: Fix the timeout=0 falsy check and the unsanitized server-side log call before removing the draft flag.
KEY INSIGHT:
The or-based falsy check in the timeout calculation is the classic Python pitfall for optional numeric fields — an explicit is not None guard is correct whenever 0 is a meaningful value.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing. See the customization docs for the required frontmatter format.- Re-request a review — the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
Was this review helpful? React with 👍 or 👎 to give feedback.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new LLM verification flow by running the agent server and making real HTTP requests, plus calling the SDK LLM.verify() entrypoint with real credentials.
Does this PR achieve its stated goal?
Yes. The PR appears intended to add an LLM credential/connectivity verification endpoint and SDK probe; on main, POST /api/llm/verify returned 404, while on this PR the same real agent server accepted the endpoint and returned status: success using the configured real LLM credentials. I also verified structured failure behavior over HTTP (auth_error for an invalid token, 422 with redacted input for a malformed body) and confirmed direct LLM.verify() succeeds in the project uv environment.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no tests/linters were run. |
| CI Status | 🟡 At time checked: 31 passing, 0 failing, 1 pending (qa-changes), 3 skipped. |
| Functional Verification | ✅ Real server/API and SDK execution matched the intended behavior. |
Functional Verification
Test 1: New /api/llm/verify endpoint exists and verifies real credentials
Step 1 — Establish baseline without the PR:
Checked out main, started the real agent server, and posted to the verify endpoint:
git switch main
TMUX_TMPDIR=/tmp/oh-qa-tmux-main OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --host 127.0.0.1 --port 8765
curl -sS -o /tmp/oh-main-verify-body.txt -w '%{http_code}' -X POST http://127.0.0.1:8765/api/llm/verify -H 'Content-Type: application/json' --data '{"model":"openai/gpt-4o","api_key":"sk-fake"}'Observed:
MAIN_SERVER_READY_HTTP_STATUS=200
MAIN_VERIFY_HTTP_STATUS=404
MAIN_VERIFY_BODY:
{"detail":"Not Found"}
This confirms the endpoint was not available on main.
Step 2 — Apply the PR's changes:
Checked out feat/llm-verify-endpoint at f3fd93af048b169a4b1298564ccb97bb4fb1a90c and started the real agent server.
Step 3 — Re-run with the PR:
Posted a real verification request using $LLM_MODEL, $LLM_BASE_URL, and $LLM_API_KEY from the QA environment:
git switch feat/llm-verify-endpoint
TMUX_TMPDIR=/tmp/oh-qa-tmux-pr OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --host 127.0.0.1 --port 8766
curl -sS -o /tmp/oh-pr-verify-real-body.txt -w '%{http_code}' -X POST http://127.0.0.1:8766/api/llm/verify -H 'Content-Type: application/json' --data @/tmp/verify-real-payload.jsonObserved:
PR_SERVER_READY_HTTP_STATUS=200
PR_VERIFY_REAL_HTTP_STATUS=200
{"message": null, "provider": "litellm_proxy", "status": "success"}
This confirms the new endpoint is reachable and successfully exercises the real LLM credential path end-to-end.
Test 2: Verify endpoint returns structured failures instead of transport errors
Step 1 — Baseline:
On main, the endpoint is absent as shown in Test 1, so structured verify outcomes are not available.
Step 2 — Apply the PR's changes:
Used the PR branch server.
Step 3 — Exercise failure cases:
Sent an invalid API key with the configured model/base URL:
curl -sS -o /tmp/oh-pr-verify-invalid-body.txt -w '%{http_code}' -X POST http://127.0.0.1:8767/api/llm/verify -H 'Content-Type: application/json' --data @/tmp/verify-invalid-payload.jsonObserved:
PR_VERIFY_INVALID_HTTP_STATUS=200
{"message_contains_invalid_key": false, "message_excerpt": "litellm.AuthenticationError: AuthenticationError: Litellm_proxyException - Authentication Error, Invalid proxy server token passed. Received API Key = sk-...-ke", "provider": "litellm_proxy", "status": "auth_error"}
This confirms provider auth failure is represented as a structured status with HTTP 200, and the invalid key was not echoed verbatim.
Sent a malformed body omitting model:
curl -sS -o /tmp/oh-pr-verify-missing-model-body.txt -w '%{http_code}' -X POST http://127.0.0.1:8766/api/llm/verify -H 'Content-Type: application/json' --data '{"api_key":"sk-secret-that-should-be-redacted"}'Observed:
PR_VERIFY_MISSING_MODEL_HTTP_STATUS=422
{"detail":[{"type":"value_error","loc":["body"],"msg":"Value error, model must be specified in LLM","input":{"api_key":"<redacted>"},"ctx":{"error":{}}}]}
This confirms malformed requests still use FastAPI validation semantics while avoiding the validation-error serialization/redaction failure mode.
Test 3: SDK LLM.verify() works directly
Step 1 — Baseline:
The feature adds a new SDK probe method used by the server endpoint; the main-branch endpoint baseline above shows the user-facing verify flow did not exist before this PR.
Step 2 — Apply the PR's changes:
Used the PR branch in the repository uv environment.
Step 3 — Call the SDK directly with real credentials:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -c 'import os; from openhands.sdk.llm import LLM; kwargs={"model": os.environ["LLM_MODEL"], "api_key": os.environ["LLM_API_KEY"]}; base=os.environ.get("LLM_BASE_URL"); kwargs.update({"base_url": base} if base else {}); LLM(**kwargs).verify(); print("SDK_VERIFY_RESULT=success")'Observed:
SDK_VERIFY_RESULT=success
This confirms the SDK entrypoint performs a real provider probe successfully.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
|
@OpenHands address the unresolved all-hands-bot comments. |
|
I'm on it! chuckbutkus can track my progress at all-hands.dev |
Fix two issues flagged in the second review pass: 1. Timeout None check: replace 'llm.timeout or _VERIFY_TIMEOUT_S' with an explicit None check so timeout=0 is not falsily promoted to the 30 s cap. 2. Logger scrubbing: replace logger.exception with logger.error so the already-scrubbed message string is what gets recorded, rather than raw exception content via implicit exc_info=True. Co-authored-by: openhands <openhands@all-hands.dev>
|
Addressed the two remaining all-hands-bot review comments in 2967d88:
All 31 tests still pass. This comment was created by an AI agent (OpenHands) on behalf of the user. |
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new LLM verification flow by running the agent server, making real HTTP requests to /api/llm/verify, and directly calling the new SDK verify APIs.
Does this PR achieve its stated goal?
Yes. The PR appears intended to add a server-side LLM credential/connectivity verification endpoint backed by SDK LLM.verify/LLM.averify. On main, the same request returned 404 Not Found; on this PR, the endpoint returned structured verification results, successfully reached an OpenAI-compatible provider over HTTP, mapped an invalid real OpenAI credential to auth_error, and returned a sanitized 422 for malformed input instead of leaking the submitted key.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no test/lint suite was run. |
| CI Status | 🟡 Most checks are green; qa-changes and Docker image build/push jobs were still in progress when checked. |
| Functional Verification | ✅ Endpoint and SDK verify behavior worked with real server/API execution. |
Functional Verification
Test 1: Before/after server endpoint behavior
Step 1 — Establish baseline on origin/main:
Started the base agent server at f305353d4c369b4047e3cfd84e0d5f1dc48a6cab and ran:
curl -sS -w '\nHTTP_STATUS:%{http_code}\n' \
-H 'content-type: application/json' \
-d '{"model":"openai/gpt-4o","api_key":"good-key","base_url":"http://127.0.0.1:18111/v1","timeout":5}' \
http://127.0.0.1:18001/api/llm/verifyOutput:
{"detail":"Not Found"}
HTTP_STATUS:404This confirms the verify endpoint did not exist before the PR.
Step 2 — Apply the PR's changes:
Checked out feat/llm-verify-endpoint at 2967d88fa396bd8a9e59c2349d22fefb016c9b8b, started the agent server, and ran the same endpoint request against a local OpenAI-compatible provider.
Step 3 — Re-run with the PR in place:
curl -sS -w '\nHTTP_STATUS:%{http_code}\n' \
-H 'content-type: application/json' \
-d '{"model":"openai/gpt-4o","api_key":"good-key","base_url":"http://127.0.0.1:18111/v1","timeout":5}' \
http://127.0.0.1:18002/api/llm/verifyOutput:
{"status":"success","message":null,"provider":"openai"}
HTTP_STATUS:200The local provider log also showed the server made a real completion request:
127.0.0.1 - "POST /v1/chat/completions HTTP/1.1" 200 -
This shows the new endpoint is reachable and performs an actual provider call rather than only parsing input.
Test 2: Real invalid credential handling
Ran a real OpenAI verification attempt with an intentionally invalid key:
curl -sS -m 20 -w '\nHTTP_STATUS:%{http_code}\n' \
-H 'content-type: application/json' \
-d '{"model":"openai/gpt-4o","api_key":"sk-invalid-openhands-qa","timeout":10}' \
http://127.0.0.1:18002/api/llm/verifyOutput:
{"status":"auth_error","message":"litellm.AuthenticationError: AuthenticationError: OpenAIException - Incorrect API key provided: sk-inval***********s-qa. You can find your API key at https://platform.openai.com/account/api-keys.","provider":"openai"}
HTTP_STATUS:200This confirms provider authentication failures are surfaced as a structured auth_error response with HTTP 200, as a client/UI can branch on.
Test 3: Malformed body returns sanitized validation error
Ran:
curl -sS -w '\nHTTP_STATUS:%{http_code}\n' \
-H 'content-type: application/json' \
-d '{"api_key":"sk-real-secret-key-12345"}' \
http://127.0.0.1:18002/api/llm/verifyOutput:
{"detail":[{"type":"value_error","loc":["body"],"msg":"Value error, model must be specified in LLM","input":{"api_key":"<redacted>"},"ctx":{"error":{}}}]}
HTTP_STATUS:422This confirms malformed requests still fail as validation errors, but the submitted secret is redacted and the response remains JSON-serializable.
Test 4: SDK verify APIs execute directly
Ran a short user-style script importing the SDK and calling the new APIs:
uv run python - <<'PY'
import asyncio
from openhands.sdk.llm import LLM
from openhands.sdk.llm.exceptions import LLMAuthenticationError
llm = LLM(model='openai/gpt-4o', api_key='good-key', base_url='http://127.0.0.1:18111/v1', timeout=5)
print('sync verify:', llm.verify())
print('async verify:', asyncio.run(llm.averify()))
bad = LLM(model='openai/gpt-4o', api_key='sk-invalid-openhands-qa', timeout=10)
try:
bad.verify()
except LLMAuthenticationError as exc:
print('invalid OpenAI maps to:', type(exc).__name__)
PYOutput:
sync verify: None
async verify: None
invalid OpenAI maps to: LLMAuthenticationError
This confirms both sync and async SDK entry points work end-to-end and preserve typed authentication error behavior.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Review: feat/llm-verify-endpoint
This is a clean, security-conscious addition to the agent server. The decision to accept LLM directly as the request body (instead of a bespoke request model) keeps the code DRY and sidesteps the SecretStr-masking footgun that earlier revisions had with manual field rebinding. All critical issues from previous rounds — 30 s timeout cap, LLM_SECRET_FIELDS-based scrubbing, infer_litellm_provider inside the try block, timeout=0 handling, logger.error instead of logger.exception — are addressed.
Strengths:
- HTTP-200-always contract with discriminated
statusvalues is the right UX for a verify button. - Defence-in-depth credential scrubbing (
_sanitize_error_message+ truncation) is solid. - Bypassing retry and fallback strategy by design means the first real error surfaces immediately — exactly what "verify before save" needs.
_prepare_verify_callreuses the same normalisation path as a real completion (Azure rename, OpenRouter headers, etc.) so "verified" means the same code path will work at conversation time.- Test coverage is comprehensive: timeout, credential leakage, large error bodies, AWS credentials, keyless local servers,
unknown_errorpath.
Two small items below.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
Add a NOTE explaining that infer_litellm_provider sits inside the try block intentionally, so its failures collapse to UNKNOWN_ERROR alongside genuine provider-side unknown errors. Helps future readers reason about the two distinct failure modes hidden behind a single status. Co-authored-by: openhands <openhands@all-hands.dev>
|
✅ Review complete. This review was performed through OpenHands Cloud Automation. You can log in and view the conversation here. |
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new LLM verification capability through the SDK, the agent-server HTTP API, a local OpenAI-compatible provider, and the configured real LiteLLM proxy.
Does this PR achieve its stated goal?
Yes. The PR title indicates a new LLM verify endpoint, and the PR also adds the SDK probe it depends on. On origin/main, LLM.verify is absent and POST /api/llm/verify returns 404; on PR commit f19626cfd8f3654b27b45b044863cd3456a1be06, the SDK probe sends a real chat-completions request, the agent-server endpoint returns structured success, validation errors return 422 without leaking secrets, and a real configured LiteLLM proxy verification returned status: success.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and installed the editable packages. |
| CI Status | 🟡 Latest snapshot: 20 passed, 0 failed, 7 pending, 3 skipped. Pending jobs were QA and Docker image builds. |
| Functional Verification | ✅ SDK and API behavior worked end-to-end with local and real provider probes. |
Functional Verification
Test 1: Baseline confirms the feature did not exist before this PR
Step 1 — Establish baseline without the PR:
Ran:
git checkout --detach origin/main
uv run python /tmp/qa_baseline_probe.pyKey output:
BASE_SDK_HAS_VERIFY False
BASE_SDK_VERIFY_CALL AttributeError 'LLM' object has no attribute 'verify'
BASE_ENDPOINT_VERIFY_STATUS 404
BASE_ENDPOINT_VERIFY_BODY {"detail":"Not Found"}
This shows the SDK method and /api/llm/verify endpoint are new behavior in this PR.
Step 2 — Apply the PR changes:
Checked out:
git checkout --detach f19626cfd8f3654b27b45b044863cd3456a1be06Test 2: SDK LLM.verify() sends a real provider request and succeeds
Step 3 — Re-run with the PR:
Ran a QA driver that starts a local OpenAI-compatible HTTP server and calls LLM.verify():
uv run python /tmp/qa_verify_behavior.pyKey output:
PR_SDK_VERIFY_RESULT None
PR_SDK_FAKE_PROVIDER_REQUEST {"authorization_present": true, "first_role": "user", "messages_count": 1, "path": "/v1/chat/completions", "token_cap": 1024}
This confirms the SDK method executes a real HTTP completion probe, includes auth, sends one user message, caps output, and returns None on success.
Test 3: Agent-server /api/llm/verify works over HTTP
The same QA driver started agent-server --host 127.0.0.1 --port <free-port> and posted to the new endpoint with a local provider base_url.
Key output:
PR_ENDPOINT_LOCAL_STATUS 200
PR_ENDPOINT_LOCAL_BODY {"message": null, "provider": "openai", "status": "success"}
PR_ENDPOINT_FAKE_PROVIDER_REQUEST {"authorization_present": true, "first_role": "user", "messages_count": 1, "path": "/v1/chat/completions"}
This confirms a real HTTP client can call the new endpoint and the endpoint performs the verify probe through the provider-compatible HTTP path.
Test 4: Bad request path returns 422 and redacts secrets
Ran the same running agent server with a malformed verify payload containing an API key but no model.
Key output:
PR_ENDPOINT_MISSING_MODEL_STATUS 422
PR_ENDPOINT_MISSING_MODEL_REDACTED True
PR_ENDPOINT_MISSING_MODEL_BODY {"detail": [{"ctx": {"error": {}}, "input": {"api_key": "<redacted>"}, "loc": ["body"], "msg": "Value error, model must be specified in LLM", "type": "value_error"}]}
This confirms the validation path returns a client error instead of crashing and does not echo the submitted secret.
Test 5: Real configured provider credentials verify successfully
Using the existing environment (LLM_MODEL=litellm_proxy/openai/gpt-5.5, LLM_BASE_URL=https://llm-proxy.app.all-hands.dev, and injected LLM_API_KEY), the running agent server verified against the real LiteLLM proxy.
Key output:
PR_ENDPOINT_REAL_PROVIDER_STATUS 200
PR_ENDPOINT_REAL_PROVIDER_BODY {"message": null, "provider": "litellm_proxy", "status": "success"}
This is the strongest end-to-end evidence: the new endpoint verified real credentials through the same server path a client would use.
Test 6: Auth failures are structured and redacted
Ran a local rejecting OpenAI-compatible provider and called both SDK LLM.verify() and POST /api/llm/verify.
Key output:
PR_SDK_AUTH_FAILURE_EXCEPTION LLMAuthenticationError
PR_SDK_AUTH_FAILURE_IS_TYPED True
PR_ENDPOINT_AUTH_FAILURE_STATUS_CODE 200
PR_ENDPOINT_AUTH_FAILURE_BODY {"status":"auth_error","message":"litellm.BadRequestError: OpenAIException - invalid api key ***","provider":"openai"}
PR_ENDPOINT_AUTH_FAILURE_SECRET_REDACTED True
This confirms invalid credentials are mapped into the structured auth_error outcome and echoed provider text is scrubbed before reaching the client.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Review: feat/llm-verify-endpoint
The implementation is now in strong shape. The iterative review process has addressed all the critical and important issues from previous rounds — explicit None check for the timeout cap, LLM_SECRET_FIELDS-based secret scrubbing (imported rather than duplicated), infer_litellm_provider inside the try block, logger.error with the pre-sanitized message rather than logger.exception, and the correct model_copy guard that skips the allocation when provider is None. Accepting the full LLM object directly as the request body cleanly resolved the API design question raised by @enyst — any LiteLLM-supported configuration works without hard-coded field lists, and the SecretStr masking footgun is gone. Test coverage is thorough across all failure modes, timeout enforcement, retry/fallback bypass, and credential field scrubbing for both api_key and AWS fields.
One item worth tracking before broader exposure:
No request-level rate limit on POST /llm/verify: The endpoint fires a real paid LLM completion on every call. If the agent server becomes reachable by unauthenticated or lightly-authenticated callers (e.g. behind a shared API gateway), a tight loop of verify requests could exhaust provider quota or accumulate unexpected cost. Today the agent server is typically accessed only by authenticated sessions, so this is acceptable, but worth adding a per-session or per-IP throttle as a hardening follow-up before this pattern propagates to a broader UI surface. Inline comment on the handler below.
Everything else is clean and correct. The jsonable_encoder fix in _sanitize_validation_errors is a targeted, well-tested regression patch. The _prepare_verify_call approach of reusing _prepare_completion_params is smart — it means provider-specific option normalization (Azure max_tokens rename, OpenRouter headers, extended-thinking budgets) is exercised during verify, so "verified" genuinely means the agent will be able to use the credentials at conversation time.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation. View conversation
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new LLM verification capability by running the agent server and SDK against a live local OpenAI-compatible provider plus real error paths.
Does this PR achieve its stated goal?
Yes. The PR adds functional LLM verification behavior: the endpoint was absent on main but on this PR POST /api/llm/verify returned status=success after making a real /v1/chat/completions probe to a local OpenAI-compatible backend. The SDK-facing LLM.verify() and LLM.averify() methods also worked end-to-end, and provider/validation errors came back structured and redacted rather than leaking secrets or crashing.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no tests/linters were run. |
| CI Status | ⏳ At check time: 20 passing, 6 pending Build & Push jobs, 4 skipped. |
| Functional Verification | ✅ Endpoint, SDK methods, auth-error handling, and malformed-body validation were exercised with real requests. |
Functional Verification
Test 1: API endpoint exists and performs a real verify probe
Step 1 — Establish baseline without the fix:
Checked out origin/main, started the agent server, and posted to the new endpoint:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -m openhands.agent_server --host 127.0.0.1 --port 8910
curl -H 'Content-Type: application/json' \
-d '{"model":"openai/gpt-4o","api_key":"sk-baseline-placeholder"}' \
http://127.0.0.1:8910/api/llm/verify
baseline /api/llm/verify HTTP 404
{"detail":"Not Found"}
This confirms the feature did not exist on the base branch.
Step 2 — Apply the PR changes:
Checked out feat/llm-verify-endpoint at f19626c, started a local OpenAI-compatible /v1/chat/completions stub, then started the agent server.
Step 3 — Re-run with the fix in place:
Posted a realistic verify payload to the PR server:
curl -H 'Content-Type: application/json' \
-d '{"model":"openai/qa-local","base_url":"http://127.0.0.1:9912/v1","api_key":"sk-local-ok","timeout":5}' \
http://127.0.0.1:8911/api/llm/verify
HTTP codes:
success_local 200
success_local status= success provider= openai message_len= 0 secret_present= False redaction_present= False
stub request log:
{"path": "/v1/chat/completions", "model": "qa-local", "messages": [{"content": [{"type": "text", "text": "hi"}], "role": "user"}], "max_completion_tokens": 1024, "max_tokens": null, "auth_present": true}
This shows the endpoint is present, returns the documented structured success response, and actually sends the one-message completion probe to the provider backend.
Test 2: SDK LLM.verify() and LLM.averify() work for SDK users
Step 1 — Establish baseline without the fix:
On origin/main, imported LLM and checked method availability:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -c 'from openhands.sdk.llm import LLM; print("baseline_has_verify", hasattr(LLM, "verify")); print("baseline_has_averify", hasattr(LLM, "averify"))'
baseline_has_verify False
baseline_has_averify False
This confirms the SDK verification methods are new behavior.
Step 2 — Apply the PR changes:
Checked out the PR branch and reused the same local OpenAI-compatible provider.
Step 3 — Re-run with the fix in place:
Called both methods as an SDK user would:
OPENHANDS_SUPPRESS_BANNER=1 uv run python -c 'import asyncio; from openhands.sdk.llm import LLM; llm=LLM(model="openai/qa-local", base_url="http://127.0.0.1:9912/v1", api_key="sk-sdk-ok", timeout=5); print("verify_return", llm.verify()); print("averify_return", asyncio.run(llm.averify()))'
verify_return None
averify_return None
stub request count 2
This confirms both sync and async SDK entry points complete successfully and perform real provider calls.
Test 3: Error responses are structured and secrets are redacted
With the PR server running, I exercised auth and validation failures:
# Fake official OpenAI key path
auth_openai 200
auth_openai status= auth_error provider= openai message_len= 189 secret_present= False redaction_present= True
auth_openai_message= litellm.AuthenticationError: AuthenticationError: OpenAIException - Incorrect API key provided: sk-inval*****-key. You can find your API key at https://platform.openai.com/account/api-keys.
# OpenAI-like local 401 body
http 200 status auth_error provider openai secret_present False redacted True message litellm.AuthenticationError: AuthenticationError: OpenAIException - Incorrect API key provided: Bearer ***
# Malformed body missing model
missing_model 422
detail_type list detail_count 1
first_type value_error loc ['body'] input {'api_key': '<redacted>'} ctx_keys ['error']
This shows verify-time failures stay on HTTP 200 with a structured status, while malformed request bodies remain 422 and redact secret inputs.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
| # ``timeout`` to a smaller value in the request body; we take the | ||
| # ``min`` so they can never extend it past what is a reasonable | ||
| # interactive wait. | ||
| _VERIFY_TIMEOUT_S = 30.0 |
| # client. Caps any accidental large payload (truncated HTML error pages, | ||
| # verbose stack traces, etc.) and bounds the worst case if a provider were | ||
| # ever to echo request content back in an error body. | ||
| _MAX_ERROR_MESSAGE_CHARS = 512 |
| # ───────────────────────────────────────────────────────────────────────────── | ||
|
|
||
|
|
||
| class VerifyLLMStatus(str, Enum): |
| ``UNKNOWN_ERROR`` so the endpoint never raises and the frontend always | ||
| has a structured result to branch on. | ||
| """ | ||
| message = _sanitize_error_message(exc, llm) |
There was a problem hiding this comment.
Can we just not add a class method to VerifyLLMStatus?
Something like VerifyLLMStatus.from_exception?
| the Azure branch of :func:`select_chat_options` renames it to | ||
| ``max_tokens=1024`` automatically. | ||
| """ | ||
| probe = [Message(role="user", content=[TextContent(text="hi")])] |
There was a problem hiding this comment.
Do we want to do that?
Almost every provider has an auth-check endpoint that costs nothing:
┌────────────────────────────────────┬───────────────────────────────────────────────────────────────────┐
│ Provider │ Free probe │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ OpenAI │ GET /v1/models │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ Anthropic │ GET /v1/models (added ~2024) │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ Google Gemini │ GET /v1/models │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ Azure OpenAI │ GET /openai/deployments?api-version=… │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ Bedrock │ sts:GetCallerIdentity or bedrock:ListFoundationModels │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ OpenRouter │ GET /api/v1/auth/key (also returns credit balance — bonus UX win) │
├────────────────────────────────────┼───────────────────────────────────────────────────────────────────┤
│ Mistral / Cohere / Together / Groq │ all have list-models │
└────────────────────────────────────┴───────────────────────────────────────────────────────────────────┘
There was a problem hiding this comment.
but actually I don't know if it checks that you have credits.
This would be just to check that you have a valide API KEY.
Anyway, just an idea. I don't want to block the PR
Why
The agent-server needs a way to let a client (e.g. a "Verify credentials"
button in the GUI) confirm an
LLMconfiguration is valid before it issaved or used to start a conversation. Verifying client-side is brittle:
api_versionrequire provider-specific requestshapes the client would have to re-implement.
conversation time, so "verified in the browser" wouldn't reliably mean
"the agent will be able to use this".
This PR adds a small, structured probe — at both the SDK and HTTP layer —
that fixes all three problems.
Summary
LLM.verify()/LLM.averify()send a single 1-token probethrough the existing completion code path and surface the first provider
error as a typed SDK exception via
map_provider_exception. Retry andfallback_strategyare deliberately bypassed so verify fails fast on themodel the caller actually intends to save.
POST /api/llm/verifyaccepts anLLMbody andreturns a discriminated
VerifyLLMResponse(success,auth_error,rate_limited,timeout,unreachable,bad_request,unknown_error).Always returns HTTP 200 — clients branch on
status, not transporterrors — so the UI has a single decision tree. The probe is hard-capped
at 30 s (vs the SDK's 300 s default) and the inferred LiteLLM provider
name is attributed even on failure so the UI can render "Anthropic auth
failed" rather than an unattributed error.
SecretStrvaluespresent on the
LLM(api_key,aws_access_key_id,aws_secret_access_key,aws_session_token) and truncated to 512 charsto defend against providers that echo credentials back or return large
HTML error pages. A small fix in
api.pyruns the 422 sanitizer outputthrough
jsonable_encoderso Pydantic's non-JSON-serializablectx['error'](raised byLLM's@model_validator(mode='before'))doesn't itself turn a 422 into a 500.
Issue Number
None.
How to Test
status:uv run pytest tests/sdk/llm/test_llm_verify.py \ tests/agent_server/test_llm_router.py \ tests/agent_server/test_validation_error_sanitization.pyVideo/Screenshots
N/A — backend-only endpoint; behavior is exercised by the curl examples
and tests above.
Type
Notes
POST /api/llm/verifyis mounted under the/api/*group and inherits the existing
X-Session-API-Keyheader dependency,so it is not anonymously reachable in deployments that configure session
keys. Credentials are caller-supplied per request — the agent server
never carries a shared upstream key on this path.
on_request/on_responsecallbacks; they are not agent steps andshould not contribute to spend tracking, prompt logs, or user-facing
metrics.
auth failure; falling back to a different LLM would silently report
"success" for a model that isn't the one being saved. Both defeat the
point of "verify before save".
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:0310711-pythonRun
All tags pushed for this build
About Multi-Architecture Support
0310711-python) is a multi-arch manifest supporting both amd64 and arm640310711-python-amd64) are also available if needed