docs: update architecture with Telegram document images, model alias mapping, Vision tier fix, LLM fallback
- Telegram bot now handles document images preserving EXIF/C2PA metadata
- api.dos.ai Worker maps model aliases (FP8, GPTQ, qwen3.5-35b) to dos-ai
- Google Vision reverse search prioritizes exact/partial over visually similar
- LLM fallback chain via LLM_FALLBACK_PROVIDERS env var with cost logging
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DOSafe-Architecture.md: 9 additions & 2 deletions
@@ -105,6 +105,8 @@ Written to by: DOS-Me Trust API (`/trust/flags/:id/attest`). Read by: DOSafe (`d
Both models are natively multimodal (text + image). Auth: `INTERNAL_API_KEY` via `api.dos.ai` gateway (bypasses billing). Fallback: Alibaba Cloud `qwen3.5-flash` when vLLM is unavailable.
+**Model alias mapping:** The `api.dos.ai` Worker maps common model aliases to the served model name. `Qwen/Qwen3.5-35B-A3B-FP8`, `Qwen/Qwen3.5-35B-A3B-GPTQ-Int4`, and `qwen3.5-35b` all resolve to `dos-ai` (the vLLM `--served-model-name`). Clients don't need to know the exact deployed model variant.
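The alias resolution described above could be sketched roughly as follows. This is a minimal illustration, not the Worker's actual code: the three aliases and the served name `dos-ai` come from the text, while `resolveModel`, the `MODEL_ALIASES` set, the case-insensitive matching, and the pass-through of unknown names are assumptions.

```typescript
// Served model name, i.e. the value passed to vLLM's --served-model-name.
const SERVED_MODEL_NAME = "dos-ai";

// Aliases listed in the text, stored lowercased (case-insensitive matching
// is an assumption of this sketch).
const MODEL_ALIASES: ReadonlySet<string> = new Set([
  "qwen/qwen3.5-35b-a3b-fp8",
  "qwen/qwen3.5-35b-a3b-gptq-int4",
  "qwen3.5-35b",
]);

// Hypothetical helper: returns the served model name for a known alias.
// Unknown names are passed through unchanged (also an assumption).
function resolveModel(requested: string): string {
  const key = requested.trim().toLowerCase();
  if (key === SERVED_MODEL_NAME || MODEL_ALIASES.has(key)) {
    return SERVED_MODEL_NAME;
  }
  return requested;
}
```

A gateway would typically rewrite the `model` field of the incoming request body with `resolveModel(body.model)` before forwarding it upstream.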
Welcome: "DOSafe — Vệ Sĩ AI Của Bạn" / "Your AI Bodyguard"
+**Document image handling:** The bot accepts images sent as documents (original-quality files), not just compressed photos. This preserves EXIF and C2PA metadata, which Telegram's photo compression strips, enabling accurate image provenance detection.
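One way to express this selection, as a sketch only: the `photo`, `document`, `file_id`, and `mime_type` fields follow the Telegram Bot API message shape, while `pickImage`, `ImageRef`, and the `metadataPreserved` flag are hypothetical names introduced for illustration.

```typescript
// Minimal subset of the Telegram Bot API message fields used here.
interface PhotoSize { file_id: string; width: number; height: number; }
interface TgDocument { file_id: string; mime_type?: string; }
interface TgMessage { photo?: PhotoSize[]; document?: TgDocument; }

// Hypothetical result type: which file to download and whether the
// original file (with EXIF/C2PA metadata) is expected to be intact.
interface ImageRef { fileId: string; metadataPreserved: boolean; }

function pickImage(msg: TgMessage): ImageRef | null {
  const doc = msg.document;
  // Images sent as documents keep the original file.
  if (doc && doc.mime_type !== undefined && doc.mime_type.startsWith("image/")) {
    return { fileId: doc.file_id, metadataPreserved: true };
  }
  const photos = msg.photo ?? [];
  if (photos.length === 0) return null;
  // Compressed photos arrive in several sizes; take the largest one.
  const largest = photos.reduce((a, b) =>
    b.width * b.height > a.width * a.height ? b : a,
  );
  return { fileId: largest.file_id, metadataPreserved: false };
}
```

Downstream checks could use `metadataPreserved` to decide whether a missing EXIF or C2PA record is meaningful or simply a side effect of compression.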
@@ -710,6 +715,8 @@ The V1 URL pipeline used simple keyword matching (`hasScamTerms()`) on web searc
**Performance:** Web search runs in parallel with Phase 1 (DB/on-chain/DOS.Me). Only the LLM analysis waits for Phase 1 results. Total added latency: ~3–5s for the full path (skipped on the extension fast path).
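The concurrency pattern described here can be illustrated with a small sketch. The function and parameter names (`analyzeUrl`, `runPhase1`, `runWebSearch`, `runLlm`) are hypothetical placeholders, and the extension fast path is left out.

```typescript
// Hypothetical orchestration: Phase 1 lookups and web search start together,
// and the LLM step runs only once both have finished.
async function analyzeUrl<P, W, R>(
  runPhase1: () => Promise<P>,
  runWebSearch: () => Promise<W>,
  runLlm: (phase1: P, web: W) => Promise<R>,
): Promise<R> {
  // Both calls are started before either is awaited, so they overlap.
  const [phase1, web] = await Promise.all([runPhase1(), runWebSearch()]);
  return runLlm(phase1, web);
}
```

With this shape, the added latency is bounded by the slower of the two parallel steps plus the LLM call, rather than by their sum.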
+**LLM Fallback:** When the self-hosted vLLM is unavailable, the pipeline falls back to Alibaba Cloud `qwen3.5-flash` (or other providers). The multi-provider fallback chain is configured via the `LLM_FALLBACK_PROVIDERS` env var (a JSON array of `{name, baseUrl, apiKey, model}` entries). Fallback usage is logged as structured JSON (`event: llm_fallback_used`) with token count and cost estimate for internal cost monitoring.
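A fallback chain of this kind might look roughly like the sketch below. Only the env var name, the `{name, baseUrl, apiKey, model}` entry shape, and the `llm_fallback_used` event name come from the text. The helper names, the remaining log fields (`provider`, `model`, `total_tokens`, `est_cost_usd`), the flat per-million-token price, and the injected `call` function are assumptions made for illustration.

```typescript
interface FallbackProvider { name: string; baseUrl: string; apiKey: string; model: string; }
interface CallResult { text: string; totalTokens: number; }
// The actual HTTP call is injected so the sketch does not assume any provider API.
type CallProvider = (p: FallbackProvider, prompt: string) => Promise<CallResult>;

// Parses the LLM_FALLBACK_PROVIDERS value (a JSON array of provider entries).
function parseFallbackProviders(raw: string | undefined): FallbackProvider[] {
  if (!raw) return [];
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) throw new Error("LLM_FALLBACK_PROVIDERS must be a JSON array");
  return parsed.map((entry, i) => {
    const e = (entry ?? {}) as Partial<FallbackProvider>;
    for (const key of ["name", "baseUrl", "apiKey", "model"] as const) {
      if (typeof e[key] !== "string") throw new Error(`entry ${i}: missing "${key}"`);
    }
    return e as FallbackProvider;
  });
}

// Builds the structured log line; field names other than `event` are hypothetical.
function fallbackLogLine(p: FallbackProvider, totalTokens: number, usdPerMillionTokens: number): string {
  return JSON.stringify({
    event: "llm_fallback_used",
    provider: p.name,
    model: p.model,
    total_tokens: totalTokens,
    est_cost_usd: (totalTokens / 1e6) * usdPerMillionTokens,
  });
}

// Tries providers in configured order and returns the first successful answer.
async function completeWithFallback(
  providers: FallbackProvider[],
  prompt: string,
  call: CallProvider,
  usdPerMillionTokens: number,
): Promise<string> {
  let lastError: unknown = new Error("no fallback providers configured");
  for (const p of providers) {
    try {
      const res = await call(p, prompt);
      console.log(fallbackLogLine(p, res.totalTokens, usdPerMillionTokens));
      return res.text;
    } catch (err) {
      lastError = err; // remember the failure and try the next provider
    }
  }
  throw lastError;
}
```

Keeping the provider call injectable makes the chain easy to test and leaves request formatting to whichever client the deployment already uses.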
---
## Database Layout
@@ -823,7 +830,7 @@ DOS_ME_TRUST_API_KEY=...
- **C2PA:** Cryptographic content credentials (~40% of AI images have C2PA in 2026)
- **EXIF:** Camera metadata + AI tool detection in Software field
- **DCT:** JPEG quantization table analysis (camera-specific vs AI generic)
-- **Reverse search:** Google Cloud Vision WEB_DETECTION → Serper Lens fallback
+- **Reverse search:** Google Cloud Vision WEB_DETECTION → Serper Lens fallback. Vision returns 3 tiers: `fullMatchingImages` (exact), `partialMatchingImages` (cropped/resized), `pagesWithMatchingImages` (visually similar). The pipeline prioritizes exact and partial matches and falls back to visually similar results only if none are found. The bot displays only exact/partial matches as "sources found" to avoid misleading citations.
- **LLM visual:** Multimodal rubric analysis with Qwen3.5-35B (natively multimodal, no separate VL model needed)
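The tier prioritization in the reverse search item above could be sketched as follows. The types and the `selectMatches` helper are hypothetical. Mapping the Vision response fields onto the `exact`, `partial`, and `similar` lists is assumed to follow the tiers named in the text and is left to the caller.

```typescript
interface WebImage { url: string; }

// Hypothetical normalized view of a reverse search response, one list per tier.
interface VisionTiers { exact: WebImage[]; partial: WebImage[]; similar: WebImage[]; }

interface ReverseSearchResult {
  tier: "strong" | "similar" | "none";
  matches: WebImage[];      // used internally for analysis
  sourcesFound: WebImage[]; // shown to the user as "sources found"
}

function selectMatches(t: VisionTiers): ReverseSearchResult {
  // Exact and partial matches take priority and are safe to cite.
  const strong = [...t.exact, ...t.partial];
  if (strong.length > 0) {
    return { tier: "strong", matches: strong, sourcesFound: strong };
  }
  // Visually similar results are a fallback signal only and are never cited.
  if (t.similar.length > 0) {
    return { tier: "similar", matches: t.similar, sourcesFound: [] };
  }
  return { tier: "none", matches: [], sourcesFound: [] };
}
```

Separating `matches` from `sourcesFound` keeps the weaker signal available to the analysis while ensuring the bot never presents a look-alike image as a source.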