Commit 0d5944c

JOY and claude committed
docs: add DOSafe Partner API reference (DOS Shield gateway)
- Partner-API.md: full API reference for /v1/check, /v1/check/bulk, /v1/report
- Signal weights table, migration guide from DOS.Me Trust API
- DOSafe-Extension-Architecture.md, DOSafe-Mobile-Architecture.md: add existing architecture docs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 0a742ce commit 0d5944c

3 files changed

Lines changed: 932 additions & 0 deletions

DOSafe-Extension-Architecture.md

Lines changed: 378 additions & 0 deletions
@@ -0,0 +1,378 @@
1+
# DOSafe Extension Architecture
2+
3+
Last updated: 2026-03-07
4+
5+
## 1) Scope
6+
7+
This document defines the architecture for the DOSafe Chrome Extension located at:
8+
9+
- `apps/extension`
10+
11+
Current goals:
12+
13+
- Read content from the active page (selected text or full visible page text).
14+
- Run DOSafe text detection via `/api/detect`.
15+
- Crawl structured page data and send it to a configurable ingestion server.
16+
17+
## 2) Current Implementation (MVP)
18+
19+
Extension type:
20+
21+
- Chrome Extension Manifest V3 popup-based extension.
22+
23+
Main files:
24+
25+
- `apps/extension/manifest.json`
26+
- `apps/extension/popup.html`
27+
- `apps/extension/popup.css`
28+
- `apps/extension/popup.js`
29+
30+
Key permissions:
31+
32+
- `activeTab`
33+
- `scripting`
34+
- `storage`
35+
- `host_permissions`: `https://*/*`, `http://*/*` (plus explicit entries for DOSafe and localhost)
36+
37+
Current UI actions:
38+
39+
- `Scan`: Send crawled text to DOSafe detector endpoint.
40+
- `Crawl to server`: Collect page data and POST JSON payload to user-defined crawl endpoint.
41+
- `Save`: Persist settings in `chrome.storage.local`.
42+
43+
## 3) Data Flows
44+
45+
### 3.1 Scan flow (AI text detection)
46+
47+
1. User opens popup and clicks `Scan`.
48+
2. Extension executes script in active tab:
49+
- `selectedText = window.getSelection().toString()`
50+
- fallback: `document.body.innerText`
51+
3. Truncate text to `maxChars` characters (allowed range: 500-5000).
52+
4. POST to detector endpoint (default: `https://dosafe.io/api/detect`) with:
53+
- `{ text, lang }`
54+
5. Render response fields:
55+
- `ai_probability`, `human_probability`, `verdict`, `confidence`, `signals`, `sentence_scores`
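
Steps 3 and 4 of the scan flow can be sketched as a small pure helper. This is a minimal illustration, not the code in `popup.js`: the names `clampMaxChars` and `buildDetectRequest` are hypothetical, and falling back to the upper bound when the setting is missing is an assumption.

```javascript
// Bounds taken from the "500-5000" range mentioned in step 3.
const MIN_CHARS = 500;
const MAX_CHARS = 5000;

// Clamp a user-supplied maxChars setting into the supported range.
function clampMaxChars(value) {
  const n = Number.parseInt(value, 10);
  if (Number.isNaN(n)) return MAX_CHARS; // assumed default when the setting is missing
  return Math.min(MAX_CHARS, Math.max(MIN_CHARS, n));
}

// Build the fetch() options for the detector call (steps 3 and 4).
function buildDetectRequest(text, lang, maxChars) {
  const truncated = text.trim().slice(0, clampMaxChars(maxChars));
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: truncated, lang }),
  };
}
```

In the popup, the result could be passed straight to `fetch`, for example `fetch("https://dosafe.io/api/detect", buildDetectRequest(pageText, "vi", 2000))`.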
56+
57+
### 3.2 Crawl flow (page ingestion)
58+
59+
1. User sets `crawlEndpoint` and clicks `Crawl to server`.
60+
2. Extension executes script in active tab and extracts:
61+
- URL metadata: `url`, `title`, `lang`, `capturedAt`
62+
- Content: `selectedText`, `text`, `fullTextLength`
63+
- SEO/content hints: `metaDescription`, top `h1` headings
64+
- Commerce hints: `priceCandidates` (regex-based candidate extraction)
65+
3. POST to crawl endpoint with payload:
66+
67+
```json
68+
{
69+
"source": "dosafe-extension",
70+
"page": {
71+
"url": "https://example.com/...",
72+
"title": "...",
73+
"lang": "vi",
74+
"selectedText": "...",
75+
"text": "...",
76+
"fullTextLength": 12345,
77+
"metaDescription": "...",
78+
"headings": ["..."],
79+
"priceCandidates": ["153.513₫"],
80+
"capturedAt": "2026-03-03T..."
81+
},
82+
"detector_hint": {
83+
"lang": "vi"
84+
}
85+
}
86+
```
87+
88+
4. Popup shows server response JSON.
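
The payload assembly in step 3 could look roughly like the sketch below. The function name `buildCrawlPayload`, the shape of its `extracted` argument, and the choice to report `fullTextLength` as the length before truncation are assumptions made for illustration; only the output field names come from the payload above.

```javascript
// Assemble the crawl payload from fields extracted in the active tab.
// `extracted` is a hypothetical intermediate object produced by the page script.
function buildCrawlPayload(extracted, { lang, maxChars, now = new Date() }) {
  const fullText = extracted.fullText || "";
  return {
    source: "dosafe-extension",
    page: {
      url: extracted.url,
      title: extracted.title || "",
      lang,
      selectedText: extracted.selectedText || "",
      text: fullText.slice(0, maxChars), // truncated copy sent to the server
      fullTextLength: fullText.length, // assumed: length before truncation
      metaDescription: extracted.metaDescription || "",
      headings: extracted.headings || [],
      priceCandidates: extracted.priceCandidates || [],
      capturedAt: now.toISOString(),
    },
    detector_hint: { lang },
  };
}
```

Passing `now` explicitly keeps the helper deterministic, which makes payload integrity checks easier to script.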
89+
90+
## 4) Capability Mapping vs Browser-Agent Tooling
91+
92+
Some AI agents expose tools named `find`, `read_page`, `get_page_text`, `computer`, `javascript_tool`, etc.
93+
The extension can implement equivalent capabilities as follows:
94+
95+
- `get_page_text` equivalent:
96+
- `document.body.innerText` (already implemented)
97+
- `find` equivalent:
98+
- DOM scanning and keyword/regex matching (`querySelectorAll`, `textContent`, price regex)
99+
- `read_page` equivalent:
100+
- Full DOM traversal (can be added with `TreeWalker`) and ARIA/role extraction
101+
- `javascript_tool` equivalent:
102+
- Executing page-context JS via `chrome.scripting.executeScript`
103+
- `computer` (screenshot) equivalent:
104+
- `chrome.tabs.captureVisibleTab` (not yet implemented)
105+
- `read_network_requests` equivalent:
106+
- Requires debugger/webRequest strategy (not yet implemented)
107+
108+
Important distinction:
109+
110+
- Browser agents can run broad automation sessions.
111+
- The product extension should keep its behavior deterministic, user-triggered, and auditable.
112+
113+
## 5) Security and Privacy Model
114+
115+
User-trigger model:
116+
117+
- Page reading happens when user explicitly clicks `Scan` or `Crawl to server`.
118+
119+
Stored data:
120+
121+
- Stored locally in `chrome.storage.local`:
122+
- detector endpoint, crawl endpoint, language, max chars
123+
124+
Sensitive data handling:
125+
126+
- The extension currently sends extracted page data only when the user triggers a scan or crawl.
127+
- No background continuous crawling or passive page exfiltration.
128+
129+
Risk notes:
130+
131+
- `https://*/*` and `http://*/*` host permissions are broad for ingestion flexibility.
132+
- For production hardening, consider allowlist-based domains and signed request auth.
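
One way the allowlist idea could work is a hostname check before any page data leaves the extension. The sketch below is an assumption about how such a check might be written; `isAllowedHost` and the allowlist format (bare domains that also cover their subdomains) are hypothetical.

```javascript
// Return true when the URL's hostname equals an allowlisted domain
// or is a subdomain of one. Unparseable URLs are rejected.
function isAllowedHost(rawUrl, allowlist) {
  let host;
  try {
    host = new URL(rawUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  return allowlist.some((entry) => {
    const domain = entry.toLowerCase();
    // The leading dot prevents "evilexample.com" from matching "example.com".
    return host === domain || host.endsWith("." + domain);
  });
}
```

For example, `isAllowedHost(tab.url, ["example.com"])` would accept `https://shop.example.com/item` but reject `https://evilexample.com/`.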
133+
134+
## 6) Known Constraints
135+
136+
- Cannot access restricted Chrome pages (`chrome://` URLs, the Chrome Web Store, etc.).
137+
- Cross-origin iframes may not be fully readable unless frame execution is explicitly handled.
138+
- Dynamic/canvas-heavy pages may expose limited text through DOM extraction.
139+
- Regex price extraction is heuristic, not schema-guaranteed.
140+
141+
## 7) Recommended Server Contract (Crawl Endpoint)
142+
143+
Endpoint:
144+
145+
- `POST /api/crawl` (example)
146+
147+
Required behaviors:
148+
149+
- Validate payload size and schema.
150+
- Attach server-side timestamps and request metadata.
151+
- Deduplicate by `(url, capturedAt window, content hash)`.
152+
- Return JSON `{ ok: true, id: "...", receivedAt: "..." }`.
153+
154+
Recommended auth:
155+
156+
- `Authorization: Bearer <extension-token>` or HMAC signature.
157+
- Rotate tokens and enforce rate limits per installation/device.
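
The deduplication rule `(url, capturedAt window, content hash)` could be sketched on the server side as follows. Everything here is illustrative: the helper names are hypothetical, the hash is a simple FNV-1a-style function over UTF-16 code units standing in for a real digest such as SHA-256, and the in-memory `Set` stands in for whatever store the ingestion server uses.

```javascript
// FNV-1a-style 32-bit hash over UTF-16 code units (stand-in for a real digest).
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Key = url + fixed time bucket + hash of whitespace/case-normalized text.
function dedupKey(page, windowMs = 60000) {
  const bucket = Math.floor(Date.parse(page.capturedAt) / windowMs);
  const normalized = (page.text || "").replace(/\s+/g, " ").trim().toLowerCase();
  return `${page.url}|${bucket}|${fnv1a(normalized)}`;
}

// Returns a function that accepts a page once per key and rejects repeats.
function createDeduper(windowMs = 60000) {
  const seen = new Set();
  return function accept(page) {
    const key = dedupKey(page, windowMs);
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  };
}
```

Fixed buckets are the simplest form of a time window; two captures that straddle a bucket boundary are not treated as duplicates, so a sliding window may be preferable in practice.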
158+
159+
## 8) Available Backend API Endpoints
160+
161+
The DOSafe backend (`https://dosafe.io`) exposes these APIs that the extension should integrate:
162+
163+
### 8.1 `/api/detect` (AI + Scam Text Detection) — ALREADY INTEGRATED
164+
165+
```
166+
POST /api/detect
167+
Body: { "text": "...", "lang": "vi", "task": "ai_detection" | "scam_detection" }
168+
Response: { ai_probability, human_probability, verdict, confidence, signals, sentence_scores, source_matches }
169+
```
170+
171+
### 8.2 `/api/url-check` (URL/Domain Risk Assessment) — NOT YET INTEGRATED
172+
173+
Check any URL for phishing/malware/scam risks. Combines DB lookup + Google Safe Browsing + WHOIS + on-chain flags.
174+
175+
```
176+
POST /api/url-check
177+
Body: { "url": "https://suspicious-site.com" }
178+
Response: {
179+
riskLevel: "safe" | "low" | "medium" | "high" | "critical",
180+
riskSignals: ["Domain registered 3 days ago", "Found in MetaMask blacklist"],
181+
checks: {
182+
trustedDomain: { isTrusted, domain },
183+
safeBrowsing: { isSafe, threats },
184+
whois: { domainAge, createdDate, registrar },
185+
onChain: { flags },
186+
threatIntel: { entries, maxRiskScore, sources, categories, cluster }
187+
}
188+
}
189+
```
190+
191+
**Extension integration idea:** Auto-check the current tab's URL in background when user navigates. Show badge icon (green/yellow/red) based on riskLevel.
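
A possible mapping from `riskLevel` to a badge is sketched below. The grouping of the five levels into three colors, the gray fallback for unknown values, the hex codes, and the name `badgeForRisk` are all assumptions; the source only states green/yellow/red.

```javascript
// Illustrative hex values for the green/yellow/red scheme, plus an assumed gray fallback.
const BADGE_COLORS = { green: "#2e7d32", yellow: "#f9a825", red: "#c62828", gray: "#757575" };

// Assumed grouping of the five riskLevel values into three colors.
const COLOR_BY_RISK = new Map([
  ["safe", "green"],
  ["low", "green"],
  ["medium", "yellow"],
  ["high", "red"],
  ["critical", "red"],
]);

// Map an /api/url-check riskLevel to badge color and text.
function badgeForRisk(riskLevel) {
  const name = COLOR_BY_RISK.get(riskLevel) || "gray";
  const text = name === "yellow" || name === "red" ? "!" : "";
  return { color: BADGE_COLORS[name], text };
}
```

In a Manifest V3 service worker, the result could be applied per tab with `chrome.action.setBadgeBackgroundColor({ color, tabId })` and `chrome.action.setBadgeText({ text, tabId })`.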
192+
193+
### 8.3 `/api/entity-check` (Phone/Email/Wallet/Domain/Bank Account Check) — NOT YET INTEGRATED
194+
195+
Check individual entities across threat DB + on-chain.
196+
197+
```
198+
POST /api/entity-check
199+
Body: { "entityType": "phone" | "email" | "wallet" | "domain" | "bank_account" | ..., "entityId": "0912345678" }
200+
Response: {
201+
entityType, entityId,
202+
riskLevel: "safe" | "low" | "medium" | "high" | "critical",
203+
riskSignals: [...],
204+
threatIntel: { entries, maxRiskScore, sources, categories, cluster },
205+
onChain: { flags }
206+
}
207+
```
208+
209+
**Supported entity types:** phone, email, wallet, url, domain, bank_account, national_id, facebook, telegram, organization
210+
211+
**Extension integration idea:** Detect phone numbers, bank accounts, wallet addresses in page content. Highlight or annotate them with risk indicators.
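
Entity detection in page text could start from a few regular expressions, as in the sketch below. The patterns are assumptions chosen for illustration: a 10-digit number starting with `0` (matching the `0912345678` example above), a simple email shape, and an Ethereum-style `0x` address. Real phone, bank account, and wallet formats vary and would need broader rules.

```javascript
// Hypothetical patterns; each maps to an entityType accepted by /api/entity-check.
const ENTITY_PATTERNS = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
  wallet: /\b0x[a-fA-F0-9]{40}\b/g,
  // Not preceded or followed by letters/digits, so hex strings are not misread as phones.
  phone: /(?<![0-9A-Za-z])0\d{9}(?![0-9A-Za-z])/g,
};

// Return unique { entityType, entityId } pairs found in the text.
function detectEntities(text) {
  const seen = new Set();
  const found = [];
  for (const [entityType, pattern] of Object.entries(ENTITY_PATTERNS)) {
    for (const match of text.matchAll(pattern)) {
      const key = `${entityType}:${match[0]}`;
      if (seen.has(key)) continue; // skip repeats of the same entity
      seen.add(key);
      found.push({ entityType, entityId: match[0] });
    }
  }
  return found;
}
```

Each returned pair has the same shape as the `/api/entity-check` request body, so results can be sent one by one or batched by the caller.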
212+
213+
### 8.4 `/api/detect-image` (Image AI Detection) — NOT YET INTEGRATED
214+
215+
```
216+
POST /api/detect-image
217+
Body: FormData with image file
218+
Response: { ai_probability, verdict, confidence, signals }
219+
```
220+
221+
## 9) Roadmap
222+
223+
Phase 1 (current — COMPLETE):
224+
225+
- Text extraction + detector scan + server crawl POST from popup.
226+
- Facebook content script: capture post text + author profile.
227+
- Side panel UI with AI mode / Scam mode tabs.
228+
229+
Phase 2 (NEXT — real-time protection):
230+
231+
- **URL auto-check:** Call `/api/url-check` on navigation. Show risk badge on extension icon (green/yellow/red).
232+
- **Entity detection:** Scan page content for phone numbers, bank accounts, wallet addresses. Check via `/api/entity-check`.
233+
- **Inline annotations:** Highlight detected entities with risk indicators (tooltip with source count, risk level).
234+
- Add `contextMenus` action: "Check this with DOSafe" for selected text/links.
235+
236+
Phase 3:
237+
238+
- Add structured field extraction profiles (e-commerce/article/forum).
239+
- Add optional screenshot capture for evidence snapshots.
240+
- Add network-aware mode (capture API responses used by page).
241+
- Add frame-aware crawling and pagination helpers.
242+
- Add retry queue/offline sync (background service worker + local queue).
243+
244+
## 9) Testing Checklist
245+
246+
Manual:
247+
248+
- Load unpacked extension from `apps/extension`.
249+
- Scan selected text (`>= 50 chars`) on `https://dosafe.io` and a third-party page.
250+
- Crawl to test ingestion endpoint and verify payload integrity.
251+
252+
Technical:
253+
254+
- `manifest.json` parses correctly.
255+
- `popup.js` passes syntax check.
256+
- CORS and endpoint auth validated on ingestion server.
257+
258+
## 10) Claude-Style Tactics and Tools (Captured Notes)
259+
260+
The following reflects the workflow style described by Claude-like browser agents and how to apply it in DOSafe extension development:
261+
262+
- `find` tactic:
263+
- First locate target elements semantically (price, title, seller, rating) instead of fixed selectors.
264+
- In extension implementation, emulate with keyword scoring over DOM text blocks and selector fallbacks.
265+
- `read_page` tactic:
266+
- Read broad page structure first (headings, regions, forms, interactive elements), then extract targeted fields.
267+
- In extension implementation, add structured DOM traversal mode (TreeWalker + role/aria map).
268+
- `get_page_text` tactic:
269+
- Extract full text body for broad understanding; then do targeted extraction.
270+
- Already partially implemented via `document.body.innerText`.
271+
- `javascript_tool` tactic:
272+
- Execute in-page JS for high-fidelity extraction from dynamic apps (React/Vue state, globals, inline JSON).
273+
- In extension implementation, use `chrome.scripting.executeScript`.
274+
- `computer` interaction tactic:
275+
- Use click/scroll/type only when DOM extraction is insufficient.
276+
- For the extension product, keep interaction optional and user-triggered for predictability.
277+
- `read_network_requests` tactic:
278+
- Observe XHR/fetch responses to get clean structured data directly from site APIs.
279+
- Recommended for Phase 3 with debugger/webRequest-backed mode.
280+
- `tabs` tactic:
281+
- Run extraction on multiple tabs in parallel for throughput.
282+
- Extension roadmap: queue jobs per tab and bounded concurrency.
283+
- `screenshot` tactic:
284+
- Capture evidence when text/DOM is incomplete or disputed.
285+
- Extension roadmap: `chrome.tabs.captureVisibleTab`.
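
The keyword-scoring approach mentioned under the `find` tactic could be sketched as below. The block shape (`{ selector, text }`), the weights, and the names `scoreBlock` and `findBestBlock` are hypothetical; in the extension the blocks would be collected from the DOM, for example with `querySelectorAll`.

```javascript
// Sum the weights of all keywords that occur in the block text (case-insensitive).
function scoreBlock(text, keywords) {
  const lower = text.toLowerCase();
  let score = 0;
  for (const [keyword, weight] of Object.entries(keywords)) {
    if (lower.includes(keyword.toLowerCase())) score += weight;
  }
  return score;
}

// Return the highest-scoring block, or null when nothing reaches minScore.
function findBestBlock(blocks, keywords, minScore = 1) {
  let best = null;
  for (const block of blocks) {
    const score = scoreBlock(block.text, keywords);
    if (score >= minScore && (best === null || score > best.score)) {
      best = { ...block, score };
    }
  }
  return best;
}
```

A `null` result is the signal to fall back to fixed selectors or to a text-only payload.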
286+
287+
Execution heuristics from Claude-style operation:
288+
289+
- Read first, act second: snapshot structure before interacting.
290+
- Prefer stable references (DOM path/semantic ref) over pixel coordinates.
291+
- Combine DOM signals + network signals for highest accuracy.
292+
- Run tasks in parallel where independent (multiple tabs/pages).
293+
- Keep extraction deterministic; use UI automation only as fallback.
294+
295+
## 11) Additional Research-Backed Techniques for DOSafe Crawler
296+
297+
These are practical techniques researched and recommended for robust production crawling via extension:
298+
299+
- Multi-layer extraction pipeline:
300+
- Layer 1: selected text.
301+
- Layer 2: visible body text.
302+
- Layer 3: structured fields (title, meta, headings, price candidates).
303+
- Layer 4: API/network payload capture (optional advanced mode).
304+
- E-commerce specific parsing:
305+
- Normalize localized prices (`153.513₫`, `153,513 VND`, `$12.99`).
306+
- Extract currency, numeric value, and confidence per candidate.
307+
- Content deduplication:
308+
- Hash normalized text and key fields before upload.
309+
- Skip duplicate sends within a short time window per URL.
310+
- Quality scoring:
311+
- Attach `extraction_confidence` from field completeness + text length + source type.
312+
- Anti-fragile parsing:
313+
- Use multiple selectors and regex routes; avoid single brittle selectors.
314+
- Gracefully degrade to text-only payload when structure fails.
315+
- Privacy-safe default:
316+
- Redact obvious PII patterns before upload (email, phone, card-like numbers).
317+
- Add optional domain allowlist mode for enterprise deployments.
318+
- Reliability:
319+
- Add retry with exponential backoff for crawl uploads.
320+
- Add local queue when offline, flush later from background worker.
321+
- Observability:
322+
- Include `capture_id`, timing metrics, and parser version in payload.
323+
- Keep server-side audit trail for each crawl event.
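
Price normalization for the three example formats above (`153.513₫`, `153,513 VND`, `$12.99`) could be sketched as follows. The currency rules and the name `normalizePrice` are assumptions, the sketch covers only currency and numeric value (not the per-candidate confidence), and it treats VND as having no decimal part.

```javascript
// Hypothetical rules: which markers identify a currency and how many decimals it uses.
const CURRENCY_RULES = [
  { pattern: /₫|VND/i, currency: "VND", decimals: 0 },
  { pattern: /\$|USD/i, currency: "USD", decimals: 2 },
];

// Convert a raw price candidate into { currency, value, raw }, or null if unrecognized.
function normalizePrice(raw) {
  const rule = CURRENCY_RULES.find((r) => r.pattern.test(raw));
  const digits = raw.match(/\d[\d.,]*\d|\d/);
  if (!rule || !digits) return null;
  let numeric = digits[0];
  if (rule.decimals === 0) {
    // Every "." or "," is a thousands separator.
    numeric = numeric.replace(/[.,]/g, "");
  } else {
    // A separator followed by exactly two trailing digits is the decimal point.
    const m = numeric.match(/^(.*)[.,](\d{2})$/);
    numeric = m
      ? m[1].replace(/[.,]/g, "") + "." + m[2]
      : numeric.replace(/[.,]/g, "");
  }
  return { currency: rule.currency, value: Number(numeric), raw };
}
```

Keeping `raw` alongside the normalized value lets the server re-parse candidates later if the rules change.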
324+
325+
Recommended next implementation slice:
326+
327+
- Add parser profiles: `generic`, `ecommerce`, `article`, `social`.
328+
- Add payload schema versioning (`schema_version: 1`).
329+
- Add authenticated ingestion (Bearer/HMAC) and replay protection.
330+
331+
## 12) Facebook Profile Crawl Limits and Browser-Agent Direction
332+
333+
Current verified behavior in `apps/extension`:
334+
335+
- `content-facebook.js` binds actions only to visible Facebook `article` nodes on the current page.
336+
- `Profile Check` sends `authorName`, `profileUrl`, and `postText` to the background worker.
337+
- `background.js` opens the main `profileUrl` in a new tab and extracts only:
338+
- `document.title`
339+
- first `h1`
340+
- `meta[name="description"]`
341+
- current `document.body.innerText`
342+
- It does not automatically click into About, Friends, Photos, or Posts tabs.
343+
- It does not run a multi-step observe -> decide -> act loop.
344+
345+
Implication:
346+
347+
- Current profile scraping is a single-page capture, not a full browser agent.
348+
- Data that is hidden behind profile tabs, lazy-loading, or interaction gates is not collected.
349+
- This is acceptable for lightweight context capture, but not sufficient for robust scam-profile investigation.
350+
351+
Recommended browser-agent pattern for DOSafe:
352+
353+
- Use the model as planner only. In the current stack, this means Qwen3.5 emits structured actions.
354+
- Use the extension runtime as executor via:
355+
- `chrome.tabs`
356+
- `chrome.scripting.executeScript`
357+
- `chrome.storage`
358+
- `tabs.onUpdated` / wait logic
359+
- Keep a tight loop:
360+
1. Observe current page state
361+
2. Model returns next allowed action
362+
3. Runtime executes action
363+
4. Runtime returns compact observation
364+
- Restrict actions by whitelist and block destructive actions by default.
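
The observe, plan, execute loop with an action whitelist could be sketched as below. The action names, the step limit, and the callback-based shape (`observe`, `plan`, `execute` are supplied by the caller) are assumptions for illustration; in the extension, `execute` would wrap `chrome.tabs` and `chrome.scripting.executeScript`, and `plan` would call the model.

```javascript
// Hypothetical whitelist; anything else (clicks on destructive controls, form submits) is blocked.
const ALLOWED_ACTIONS = new Set(["open_tab", "extract_text", "wait", "done"]);

function isAllowedAction(action) {
  return Boolean(action) && ALLOWED_ACTIONS.has(action.type);
}

// Run the loop until the planner says "done", emits a blocked action, or maxSteps is reached.
async function runAgentLoop({ observe, plan, execute, maxSteps = 8 }) {
  const trace = [];
  let observation = await observe(); // 1. observe current page state
  for (let step = 0; step < maxSteps; step++) {
    const action = await plan(observation); // 2. model returns next action
    if (!isAllowedAction(action)) {
      return { status: "blocked", trace, action };
    }
    if (action.type === "done") {
      return { status: "done", trace };
    }
    observation = await execute(action); // 3 and 4. execute, get compact observation
    trace.push({ step, action, observation });
  }
  return { status: "step_limit", trace };
}
```

The returned `trace` doubles as an audit log of every action the runtime actually executed.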
365+
366+
Suggested next extension scope:
367+
368+
- Add Facebook profile deep crawl sequence:
369+
- main profile
370+
- About
371+
- Posts
372+
- Photos
373+
- Add bounded step execution and timeout guards.
374+
- Return structured observations with:
375+
- current URL
376+
- visible tabs
377+
- extracted text summary
378+
- entities found (phones, links, `bank_accounts`, domains)
