Commit e7e2916

JOY and claude committed
docs: major update - models, DOSafe detection, changelog, smart routing
- Fix changelog.md case (CHANGELOG.md -> changelog.md) for GitBook Linux
- Add 2026-04-09 to 2026-04-12 changelog entries (DOSRouter, cache-aware routing, logout fix)
- Update model catalog: add Llama 4 Maverick/Scout, dos-auto smart routing, embeddings
- Update pricing to include all current models
- Update DOSafe overview: add video/audio detection, face/voice verification endpoints
- Update DOSafe partner-api: add detect-video, detect-audio endpoints with schemas, update data source stats (1.2M -> 3.93M, 11 -> 19 scrapers)
- Rewrite README: add DOSClaw agents, DOSafe section, smart routing, full model table

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 3139326 commit e7e2916

6 files changed

Lines changed: 194 additions & 31 deletions

README.md

Lines changed: 42 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -1,12 +1,13 @@
11
# DOS AI
22

3-
**Fast, affordable AI inference for open-source models.**
3+
**Fast, affordable AI inference and agent platform for open-source models.**
44

5-
DOS AI is an inference platform that lets you run leading open-source language models through a simple, OpenAI-compatible API. No GPU management, no infrastructure headaches -- just an API key and a few lines of code.
5+
DOS AI is an inference platform that lets you run leading open-source language models through a simple, OpenAI-compatible API. Deploy AI agents with DOSClaw, protect your users with DOSafe, and route intelligently with smart model selection -- all from a single platform.
66

77
## Why DOS AI?
88

99
- **OpenAI-compatible** -- Swap your base URL and you're done. Works with the OpenAI Python SDK, Node.js SDK, LangChain, LlamaIndex, and any HTTP client.
10+
- **Smart routing** -- Use `dos-auto` to let our 15-dimension classifier pick the best model for each request automatically.
1011
- **Low latency** -- Models served on dedicated GPUs with optimized inference (vLLM). No cold starts, no queues.
1112
- **Pay-as-you-go** -- Only pay for the tokens you use. Every new account gets **$5 in free credits** to get started.
1213
- **Open-source models** -- Access the best open-source models without managing your own infrastructure.
@@ -24,7 +25,7 @@ client = OpenAI(
2425
)
2526

2627
response = client.chat.completions.create(
27-
model="dos-ai",
28+
model="dos-auto", # Smart routing picks the best model
2829
messages=[
2930
{"role": "user", "content": "Explain quantum computing in one paragraph."}
3031
],
@@ -33,24 +34,58 @@ response = client.chat.completions.create(
3334
print(response.choices[0].message.content)
3435
```
3536

36-
## Documentation
37+
## Platform
38+
39+
### LLM Inference API
40+
41+
OpenAI-compatible API with smart routing, streaming, function calling, and structured outputs.
3742

3843
| Section | Description |
3944
| --- | --- |
4045
| [Quickstart](getting-started/quickstart.md) | Create an account, get an API key, and make your first request |
4146
| [Authentication](getting-started/authentication.md) | API key management, rate limits, and security best practices |
47+
| [Available Models](models/available-models.md) | Full model catalog with pricing |
4248
| [OpenAI Compatibility](getting-started/openai-compatibility.md) | Migration guide and compatibility details |
4349

50+
### DOSClaw Agents
51+
52+
Deploy AI agents powered by [OpenClaw](https://github.com/nicejoy/openclaw) with Telegram, Discord, and WhatsApp integration. Each agent runs in its own container with web search, memory, video/music generation, and 5,000+ installable skills.
53+
54+
- Create agents from the [dashboard](https://app.dos.ai/agents)
55+
- Choose from templates: Personal Assistant, Sales, Customer Support, Content Creator
56+
- Credit-based pricing with a free trial
57+
58+
### DOSafe
59+
60+
Safety and threat intelligence engine with AI detection capabilities.
61+
62+
| Feature | Description |
63+
| --- | --- |
64+
| [Entity/URL Check](dosafe/overview.md) | Risk assessment against 3.93M+ threat intelligence entries |
65+
| [AI Text Detection](dosafe/partner-api.md) | Detect AI-generated text |
66+
| [AI Image Detection](dosafe/partner-api.md) | Detect AI-generated or manipulated images |
67+
| [AI Video Detection](dosafe/partner-api.md) | 7-layer pipeline for AI video detection |
68+
| [AI Audio Detection](dosafe/partner-api.md) | Detect AI-generated speech and voice clones |
69+
| [Face/Voice Verification](dosafe/partner-api.md) | Liveness detection and biometric matching |
70+
4471
## Available models
4572

46-
| Model ID | Base model | Context length | Pricing |
73+
| Model ID | Base model | Context | Pricing |
4774
| --- | --- | --- | --- |
48-
| `dos-ai` | Qwen3.5-35B-A3B | 32,768 tokens | See [dashboard](https://app.dos.ai) |
75+
| `dos-auto` | Smart routing (auto-select) | varies | varies |
76+
| `dos-ai` | Qwen3.5-35B-A3B | 128K | $0.15 / 1M tokens |
77+
| `llama-4-maverick` | Llama 4 Maverick 17B-128E | 1M | $0.17 / 1M input |
78+
| `llama-4-scout` | Llama 4 Scout 17B-16E | 640K | $0.11 / 1M input |
79+
| `deepseek-v3` | DeepSeek V3 | 128K | $0.25 / 1M tokens |
80+
| `llama-3.3-70b` | Llama 3.3 70B | 128K | $0.20 / 1M tokens |
81+
| `llama-3.1-8b` | Llama 3.1 8B | 128K | $0.05 / 1M tokens |
4982

50-
More models are added regularly. Check the [models endpoint](https://api.dos.ai/v1/models) or your dashboard for the latest list.
83+
More models are added regularly. Check the [catalog endpoint](https://api.dos.ai/v1/catalog) or the [dashboard](https://app.dos.ai/models) for the latest list.
5184

5285
## Links
5386

5487
- **Dashboard**: [app.dos.ai](https://app.dos.ai)
5588
- **API base URL**: `https://api.dos.ai/v1`
89+
- **DOSafe**: [dosafe.io](https://dosafe.io)
5690
- **Status**: [status.dos.ai](https://status.dos.ai)
91+
- **Community**: [Telegram](https://t.me/dosai_community) | [Discord](https://discord.gg/dosai)

CHANGELOG.md renamed to changelog.md

Lines changed: 12 additions & 0 deletions
@@ -9,6 +9,18 @@ Products: `dosclaw`, `dashboard`, `gateway`, `dosafe`, `inference`
99

1010
---
1111

12+
## 2026-04-12
13+
14+
- **feature** [gateway] Cache-Aware Sticky Routing -- DOSRouter pins model to session when context exceeds 3K tokens (single message) or 5K tokens (cumulative) to maximize provider-side prefix cache hits; sticky TTL is per-provider (5min for API providers, 10min for self-hosted vLLM)
15+
- **feature** [gateway] Per-Provider Cache TTL -- Sticky routing TTL matches each provider's prefix cache lifetime: Anthropic/OpenAI/DeepSeek (5 min), vLLM/self-hosted (10 min); configurable via `providerCacheTTLMs` map
16+
- **fix** [dashboard] Cross-Account Logout Loop -- Logout now passes `prompt=login` to id.dos.me to force login form display instead of auto-SSO, preventing cross-account session loops
17+
18+
## 2026-04-11
19+
20+
- **feature** [gateway] DOSRouter Upstream Sync to v0.12.146 -- 17/19 ClawRouter releases ported; includes usage cost breakdown, eco/premium tier fallback, session pinning, agentic 3-state, model roster updates
21+
- **feature** [gateway] DOSRouter Full Port Expansion -- Wallet module (EVM + Solana), payment module (x402 protocol), image generation endpoint, full CLI (serve, classify, models, stats, logs, cache, report, wallet, chain, doctor)
22+
- **feature** [gateway] DOSRouter Open-Sourced -- Standalone Go LLM router at github.com/DOS/DOSRouter with 15-dimension scoring, tier-based routing, structured fallback chains
23+
1224
## 2026-04-08
1325

1426
- **feature** [dosclaw] OpenClaw v2026.4.5 — Major engine upgrade with video/music generation, enhanced memory, and improved channel experience
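
Note on the 2026-04-12 cache-aware sticky routing entries: the pinning rule they describe can be sketched roughly as below. This is an illustration only, not DOSRouter's actual code (which the changelog says is written in Go). The thresholds and TTLs come from the entries; reading "3K" and "5K" as 3,000 and 5,000, the `Session` class, the `route` function, the default TTL, and the choice not to refresh the pin on each hit are all assumptions.

```python
import time
from dataclasses import dataclass
from typing import Optional

# Values quoted in the changelog entries (3K / 5K read as 3,000 / 5,000: an assumption).
SINGLE_MESSAGE_TOKENS = 3_000
CUMULATIVE_TOKENS = 5_000
# Per-provider sticky TTL in seconds: 5 min for API providers, 10 min for vLLM.
PROVIDER_CACHE_TTL_S = {"anthropic": 300, "openai": 300, "deepseek": 300, "vllm": 600}
DEFAULT_TTL_S = 300  # assumption: unknown providers get the API-provider TTL


@dataclass
class Session:
    """Hypothetical per-session routing state."""
    cumulative_tokens: int = 0
    pinned_model: Optional[str] = None
    pinned_until: float = 0.0


def route(session, message_tokens, candidate_model, provider, now=None):
    """Return the model to use, pinning it once the context is large enough."""
    now = time.monotonic() if now is None else now
    session.cumulative_tokens += message_tokens

    # An unexpired pin wins, so the provider-side prefix cache keeps being reused.
    if session.pinned_model is not None and now < session.pinned_until:
        return session.pinned_model

    large_context = (
        message_tokens > SINGLE_MESSAGE_TOKENS
        or session.cumulative_tokens > CUMULATIVE_TOKENS
    )
    if large_context:
        # Pin the freshly chosen model for this provider's cache lifetime.
        session.pinned_model = candidate_model
        session.pinned_until = now + PROVIDER_CACHE_TTL_S.get(provider, DEFAULT_TTL_S)
    else:
        session.pinned_model = None
    return candidate_model
```

In this sketch a small follow-up message that arrives inside the TTL is served by the pinned model even if the classifier would have picked a different one, which is the trade-off the entry describes: routing freedom is given up in exchange for prefix cache hits.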

dosafe/overview.md

Lines changed: 9 additions & 0 deletions
@@ -9,6 +9,9 @@ DOSafe is the safety and threat intelligence engine for the DOS ecosystem. It ag
99
- **URL check** -- Analyze a URL for phishing, scam, and malware indicators.
1010
- **AI text detection** -- Determine whether a piece of text was generated by AI.
1111
- **AI image detection** -- Determine whether an image was generated or manipulated by AI.
12+
- **AI video detection** -- Analyze video for AI-generated content using a 7-layer pipeline (frame analysis, temporal consistency, audio-visual sync, LLM visual reasoning).
13+
- **AI audio detection** -- Detect AI-generated speech and voice clones using BEATs + mHuBERT ensemble (AUROC 0.88).
14+
- **Face verification** -- Liveness detection and face matching for identity verification.
1215

1316
## Supported Entity Types
1417

@@ -33,6 +36,12 @@ All DOSafe endpoints use the base URL `https://api.dos.ai/v1/dosafe`.
3336
| POST | `/v1/dosafe/url-check` | URL/domain safety check |
3437
| POST | `/v1/dosafe/detect` | AI text detection |
3538
| POST | `/v1/dosafe/detect-image` | AI image detection |
39+
| POST | `/v1/dosafe/detect-video` | AI video detection |
40+
| POST | `/v1/dosafe/detect-audio` | AI audio/voice detection |
41+
| POST | `/v1/dosafe/face/enroll` | Face enrollment for verification |
42+
| POST | `/v1/dosafe/face/verify` | Face liveness + match verification |
43+
| POST | `/v1/dosafe/voice/enroll` | Voice enrollment for speaker ID |
44+
| POST | `/v1/dosafe/voice/verify` | Voice speaker verification |
3645

3746
## Authentication
3847

dosafe/partner-api.md

Lines changed: 59 additions & 3 deletions
@@ -8,15 +8,16 @@
88

99
## Overview
1010

11-
The DOSafe API is the unified safety gateway for the DOS ecosystem. A single API key grants access to all DOSafe services — entity/URL safety checks, AI text/image detection, and community reporting — with scopes controlling which capabilities are available.
11+
The DOSafe API is the unified safety gateway for the DOS ecosystem. A single API key grants access to all DOSafe services — entity/URL safety checks, AI text/image/video/audio detection, face and voice verification, and community reporting — with scopes controlling which capabilities are available.
1212

1313
### Data Sources (Safety Check)
1414

1515
| Source | Weight | Description |
1616
|--------|--------|-------------|
17-
| DOSafe DB | Highest | 1.2M+ entries from 11 scrapers (phishing, scam, malware, wallets) |
17+
| DOSafe DB | Highest | 3.93M+ entries from 19 scrapers (phishing, scam, malware, wallets) |
1818
| DOS Chain | High | Immutable on-chain attestations via EAS |
1919
| DOS.Me Identity | Moderate | Member trust score, verified providers, flagged status |
20+
| Web Analysis | Moderate | Real-time web search + LLM-powered risk analysis |
2021

2122
**Architecture:** DOSafe is the safety engine and public gateway. DOS.Me is an identity data provider — external services call DOSafe, not DOS.Me.
2223

@@ -49,9 +50,11 @@ Keys are stored as SHA-256 hashes in `dosafe.api_keys`. Plaintext is never persi
4950
| `check` | `POST /check` |
5051
| `bulk` | `POST /check/bulk` |
5152
| `report` | `POST /report` |
52-
| `detect` | `POST /detect`, `POST /detect-image` |
53+
| `detect` | `POST /detect`, `POST /detect-image`, `POST /detect-video`, `POST /detect-audio` |
5354
| `url-check` | `POST /url-check` |
5455
| `entity-check` | `POST /entity-check` |
56+
| `face` | `POST /face/enroll`, `POST /face/verify` |
57+
| `voice` | `POST /voice/enroll`, `POST /voice/verify` |
5558

5659
A key can have multiple scopes. Contact the DOSafe team to provision a key with required scopes.
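
Note on the scope table in this hunk: one way to read it is as a lookup from scope to permitted endpoints. The sketch below encodes only the rows visible in this hunk; `SCOPE_ENDPOINTS` and `is_allowed` are hypothetical names for illustration, and the real gateway may enforce scopes differently.

```python
# Scope -> endpoints, copied from the rows visible in this hunk.
SCOPE_ENDPOINTS = {
    "check": {"POST /check"},
    "bulk": {"POST /check/bulk"},
    "report": {"POST /report"},
    "detect": {
        "POST /detect",
        "POST /detect-image",
        "POST /detect-video",
        "POST /detect-audio",
    },
    "url-check": {"POST /url-check"},
    "entity-check": {"POST /entity-check"},
    "face": {"POST /face/enroll", "POST /face/verify"},
    "voice": {"POST /voice/enroll", "POST /voice/verify"},
}


def is_allowed(key_scopes, method, path):
    """Return True if any scope on the key covers the given endpoint."""
    endpoint = f"{method.upper()} {path}"
    return any(endpoint in SCOPE_ENDPOINTS.get(scope, set()) for scope in key_scopes)
```

With this mapping, a key holding only `detect` can call `/detect-video` and `/detect-audio` but not `/face/verify`, which matches the new `face` and `voice` rows being separate scopes.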
5760

@@ -273,6 +276,59 @@ AI image detection. Combines C2PA, EXIF/DCT metadata, reverse image search, and
273276

274277
---
275278

279+
### `POST /detect-video`
280+
281+
**Scope:** `detect`
282+
283+
AI video detection. Uses a 7-layer pipeline: frame-level AI detection, temporal consistency analysis, audio-visual synchronization, and LLM visual reasoning.
284+
285+
**Request:** `multipart/form-data` with `video` field (MP4/MOV/WEBM, max 100MB), or JSON `{ "url": "..." }`.
286+
287+
**Response:**
288+
```json
289+
{
290+
"aiProbability": 78,
291+
"verdict": "AI",
292+
"confidence": "medium",
293+
"signals": {
294+
"frameAnalysis": 0.82,
295+
"temporalConsistency": 0.71,
296+
"audioSync": 0.65,
297+
"llmVisual": 0.85
298+
},
299+
"framesAnalyzed": 24,
300+
"duration": 15.2
301+
}
302+
```
303+
304+
---
305+
306+
### `POST /detect-audio`
307+
308+
**Scope:** `detect`
309+
310+
AI audio/voice detection. BEATs + mHuBERT ensemble for detecting AI-generated speech and voice clones.
311+
312+
**Request:** `multipart/form-data` with `audio` field (WAV/MP3/OGG/FLAC, max 50MB), or JSON `{ "url": "..." }`.
313+
314+
**Response:**
315+
```json
316+
{
317+
"aiProbability": 91,
318+
"verdict": "AI",
319+
"confidence": "high",
320+
"signals": {
321+
"beats": 0.93,
322+
"mhubert": 0.89,
323+
"ensemble": 0.91
324+
},
325+
"hasSpeech": true,
326+
"duration": 8.5
327+
}
328+
```
329+
330+
---
331+
276332
### `POST /url-check`
277333

278334
**Scope:** `url-check`
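
Note on the new `/detect-video` and `/detect-audio` sections above: a client for the JSON `{ "url": "..." }` request variant might look roughly like the sketch below. The base URL comes from dosafe/overview.md and the response fields from the sample payloads in this diff. The Bearer authorization header is an assumption borrowed from the `/v1/models` curl example, and `build_detect_request` and `summarize` are hypothetical helper names.

```python
import json
import urllib.request

DOSAFE_BASE_URL = "https://api.dos.ai/v1/dosafe"  # from dosafe/overview.md


def build_detect_request(kind, media_url, api_key):
    """Build (but do not send) a JSON request for /detect-video or /detect-audio."""
    if kind not in ("video", "audio"):
        raise ValueError("kind must be 'video' or 'audio'")
    body = json.dumps({"url": media_url}).encode("utf-8")
    return urllib.request.Request(
        f"{DOSAFE_BASE_URL}/detect-{kind}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer auth, as in the curl examples for /v1/models.
            "Authorization": f"Bearer {api_key}",
        },
    )


def summarize(result):
    """Condense a detection response into one line, naming the strongest signal."""
    signals = result.get("signals", {})
    strongest = max(signals, key=signals.get) if signals else None
    return (
        f"{result['verdict']} (aiProbability={result['aiProbability']}, "
        f"confidence={result['confidence']}); strongest signal: {strongest}"
    )
```

The request object can be passed to `urllib.request.urlopen` to actually send it. File uploads would use the `multipart/form-data` variant instead, which this sketch does not cover.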

models/available-models.md

Lines changed: 66 additions & 19 deletions
@@ -1,35 +1,69 @@
11
# Available Models
22

3-
DOS AI serves high-quality open-source LLMs via an OpenAI-compatible API. All models run on dedicated RTX Pro 6000 GPUs with 96 GB VRAM, ensuring fast inference and low latency from our Asia-Southeast 1 region.
3+
DOS AI serves high-quality open-source LLMs via an OpenAI-compatible API. Self-hosted models run on dedicated RTX Pro 6000 GPUs with 96 GB VRAM in Asia-Southeast 1. Cloud models are served via partner providers for maximum coverage.
4+
5+
## Smart Routing
6+
7+
Use `dos-auto` as the model ID to let DOS AI automatically select the best model for each request. Smart routing uses a 15-dimension classifier to analyze your prompt and route to the optimal model based on task complexity, cost, and latency.
8+
9+
```python
10+
response = client.chat.completions.create(
11+
model="dos-auto", # Smart routing picks the best model
12+
messages=[{"role": "user", "content": "..."}],
13+
)
14+
```
415

516
## Model Catalog
617

7-
| Model | Provider | Type | Context Window | Input Price | Output Price |
8-
| ----- | -------- | ---- | -------------- | ----------- | ------------ |
9-
| **Qwen3.5-35B-A3B** | Alibaba | Chat | 128K tokens | $0.15 / 1M tokens | $0.15 / 1M tokens |
10-
| **Llama 3.3 70B** | Meta | Chat | 128K tokens | $0.20 / 1M tokens | $0.20 / 1M tokens |
11-
| **DeepSeek V3** | DeepSeek | Chat | 128K tokens | $0.25 / 1M tokens | $0.25 / 1M tokens |
12-
| **Llama 3.1 8B** | Meta | Chat | 128K tokens | $0.05 / 1M tokens | $0.05 / 1M tokens |
18+
### Self-Hosted (Lowest Latency)
19+
20+
| Model | Provider | Context | Input | Output | Model ID |
21+
| ----- | -------- | ------- | ----- | ------ | -------- |
22+
| **Qwen3.5-35B-A3B** | Alibaba | 128K | $0.15 / 1M | $0.15 / 1M | `dos-ai` |
23+
24+
### Cloud Models
25+
26+
| Model | Provider | Context | Input | Output | Model ID |
27+
| ----- | -------- | ------- | ----- | ------ | -------- |
28+
| **Llama 4 Maverick 17B-128E** | Meta / DeepInfra | 1M | $0.17 / 1M | $0.66 / 1M | `llama-4-maverick` |
29+
| **Llama 4 Scout 17B-16E** | Meta / DeepInfra | 640K | $0.11 / 1M | $0.38 / 1M | `llama-4-scout` |
30+
| **DeepSeek V3** | DeepSeek | 128K | $0.25 / 1M | $0.25 / 1M | `deepseek-v3` |
31+
| **Llama 3.3 70B** | Meta | 128K | $0.20 / 1M | $0.20 / 1M | `llama-3.3-70b` |
32+
| **Llama 3.1 8B** | Meta | 128K | $0.05 / 1M | $0.05 / 1M | `llama-3.1-8b` |
1333

14-
> All prices are in USD. See [Pricing](pricing.md) for details on billing, free tier, and volume discounts.
34+
> All prices are in USD. The catalog is DB-driven -- new models are added regularly. Check `GET /v1/catalog` or the [dashboard](https://app.dos.ai/models) for the latest list. See [Pricing](pricing.md) for billing details.
35+
36+
### Embedding Models
37+
38+
| Model | Provider | Dimensions | Model ID |
39+
| ----- | -------- | ---------- | -------- |
40+
| **Qwen3-Embedding-4B AWQ** | Alibaba / Self-hosted | 2560 | `qwen3-embedding-4b` |
1541

1642
## Model Details
1743

18-
### Qwen3.5-35B-A3B
44+
### Qwen3.5-35B-A3B (default)
1945

2046
Alibaba's Mixture-of-Experts model with 35 billion total parameters and 3 billion active parameters per forward pass. This architecture delivers excellent quality at remarkably low cost and latency, making it our **recommended default model** for most use cases.
2147

2248
- **Best for**: General-purpose chat, code generation, reasoning, multilingual tasks
2349
- **Strengths**: Outstanding cost-efficiency, fast response times, strong multilingual support (especially CJK languages)
2450
- **Model ID**: `dos-ai`
2551

26-
### Llama 3.3 70B
52+
### Llama 4 Maverick 17B-128E
2753

28-
Meta's flagship 70-billion-parameter dense model. Offers top-tier reasoning and instruction-following capabilities.
54+
Meta's latest Mixture-of-Experts model with 17 billion active parameters and 128 experts. Strong reasoning and multilingual capabilities with an industry-leading 1 million token context window.
2955

30-
- **Best for**: Complex reasoning, long-form content, detailed analysis
31-
- **Strengths**: Strong English performance, excellent instruction following, robust safety tuning
32-
- **Model ID**: `llama-3.3-70b`
56+
- **Best for**: Complex reasoning, long-context analysis, multilingual tasks
57+
- **Strengths**: Massive context window, strong benchmark scores, efficient MoE architecture
58+
- **Model ID**: `llama-4-maverick`
59+
60+
### Llama 4 Scout 17B-16E
61+
62+
Meta's efficient MoE model with 17 billion active parameters and 16 experts. Fast and cost-effective for everyday tasks with a 640K context window.
63+
64+
- **Best for**: Everyday tasks, fast responses, cost-sensitive workloads
65+
- **Strengths**: Good balance of speed and quality, large context window
66+
- **Model ID**: `llama-4-scout`
3367

3468
### DeepSeek V3
3569

@@ -39,6 +73,14 @@ DeepSeek's latest Mixture-of-Experts model, known for strong performance across
3973
- **Strengths**: Competitive benchmark scores, good at structured/JSON output, strong code capabilities
4074
- **Model ID**: `deepseek-v3`
4175

76+
### Llama 3.3 70B
77+
78+
Meta's 70-billion-parameter dense model. Offers top-tier reasoning and instruction-following capabilities.
79+
80+
- **Best for**: Complex reasoning, long-form content, detailed analysis
81+
- **Strengths**: Strong English performance, excellent instruction following, robust safety tuning
82+
- **Model ID**: `llama-3.3-70b`
83+
4284
### Llama 3.1 8B
4385

4486
Meta's efficient 8-billion-parameter model. An excellent choice when you need fast, affordable responses and the task does not require the full capability of a larger model.
@@ -51,11 +93,13 @@ Meta's efficient 8-billion-parameter model. An excellent choice when you need fa
5193

5294
| Use Case | Recommended Model | Why |
5395
| -------- | ----------------- | --- |
96+
| Let DOS AI decide | `dos-auto` | Smart routing picks the best model per request |
5497
| General assistant / chatbot | Qwen3.5-35B-A3B | Best balance of quality, speed, and cost |
55-
| Complex analysis / long documents | Llama 3.3 70B | Strongest reasoning for demanding tasks |
98+
| Long-context analysis (100K+ tokens) | Llama 4 Maverick | 1M context window, strong reasoning |
99+
| Complex reasoning / analysis | Llama 3.3 70B | Dense model, top reasoning capability |
56100
| Code generation / math | DeepSeek V3 | Top coding and math benchmark scores |
57101
| High-volume / low-cost tasks | Llama 3.1 8B | Fastest and cheapest option |
58-
| Multilingual (especially Asian languages) | Qwen3.5-35B-A3B | Superior CJK language performance |
102+
| Multilingual (CJK languages) | Qwen3.5-35B-A3B | Superior CJK language performance |
59103

60104
## Listing Models via API
61105

@@ -66,8 +110,11 @@ curl https://api.dos.ai/v1/models \
66110
-H "Authorization: Bearer YOUR_API_KEY"
67111
```
68112

69-
See the [Models API reference](../api-reference/models.md) for the full response schema.
113+
For the full retail catalog with pricing and metadata:
70114

71-
## Coming Soon
115+
```bash
116+
curl https://api.dos.ai/v1/catalog \
117+
-H "Authorization: Bearer YOUR_API_KEY"
118+
```
72119

73-
We are continuously evaluating and adding new models. Upcoming additions may include vision models, embedding models, and larger reasoning models. Check back regularly or follow our announcements for updates.
120+
See the [Models API reference](../api-reference/models.md) for the full response schema.
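
Note on the updated catalog: the Llama 4 models are the first entries with different input and output prices, so a per-request cost estimate has to count the two token types separately. The sketch below hardcodes the prices from the tables in this diff. The docs describe prices as DB-driven, so real code would presumably read them from `GET /v1/catalog`, whose response schema is not shown here. `PRICES_PER_1M` and `estimate_cost_usd` are hypothetical names.

```python
# (input, output) price in USD per 1M tokens, copied from the catalog tables in this diff.
PRICES_PER_1M = {
    "dos-ai": (0.15, 0.15),
    "llama-4-maverick": (0.17, 0.66),
    "llama-4-scout": (0.11, 0.38),
    "deepseek-v3": (0.25, 0.25),
    "llama-3.3-70b": (0.20, 0.20),
    "llama-3.1-8b": (0.05, 0.05),
}


def estimate_cost_usd(model_id, input_tokens, output_tokens):
    """Estimate the cost of one request from its input and output token counts."""
    input_price, output_price = PRICES_PER_1M[model_id]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For example, a 200,000-token prompt with a 10,000-token reply on `llama-4-maverick` comes to (200,000 × 0.17 + 10,000 × 0.66) / 1,000,000 = $0.0406.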

models/pricing.md

Lines changed: 6 additions & 2 deletions
@@ -21,11 +21,15 @@ Pricing is calculated per **1 million tokens** (both input and output).
2121

2222
| Model | Input Price (per 1M tokens) | Output Price (per 1M tokens) |
2323
| ----- | --------------------------- | ---------------------------- |
24-
| **Qwen3.5-35B-A3B** | $0.15 | $0.15 |
25-
| **Llama 3.3 70B** | $0.20 | $0.20 |
24+
| **Qwen3.5-35B-A3B** (default) | $0.15 | $0.15 |
25+
| **Llama 4 Maverick 17B-128E** | $0.17 | $0.66 |
26+
| **Llama 4 Scout 17B-16E** | $0.11 | $0.38 |
2627
| **DeepSeek V3** | $0.25 | $0.25 |
28+
| **Llama 3.3 70B** | $0.20 | $0.20 |
2729
| **Llama 3.1 8B** | $0.05 | $0.05 |
2830

31+
> Prices are DB-driven and may be updated. Check the [dashboard](https://app.dos.ai/models) or `GET /v1/catalog` for the latest pricing.
32+
2933
### What is a Token?
3034

3135
A token is roughly 3-4 characters of English text, or about 0.75 words. For example:
