|
| 1 | +# DOSafe Extension Architecture |
| 2 | + |
| 3 | +Last updated: 2026-03-07 |
| 4 | + |
| 5 | +## 1) Scope |
| 6 | + |
| 7 | +This document defines the architecture for the DOSafe Chrome Extension located at: |
| 8 | + |
| 9 | +- `apps/extension` |
| 10 | + |
| 11 | +Current goals: |
| 12 | + |
| 13 | +- Read content from the active page (selected text or full visible page text). |
| 14 | +- Run DOSafe text detection via `/api/detect`. |
| 15 | +- Crawl structured page data and send it to a configurable ingestion server. |
| 16 | + |
| 17 | +## 2) Current Implementation (MVP) |
| 18 | + |
| 19 | +Extension type: |
| 20 | + |
| 21 | +- Chrome Extension Manifest V3 popup-based extension. |
| 22 | + |
| 23 | +Main files: |
| 24 | + |
| 25 | +- `apps/extension/manifest.json` |
| 26 | +- `apps/extension/popup.html` |
| 27 | +- `apps/extension/popup.css` |
| 28 | +- `apps/extension/popup.js` |
| 29 | + |
| 30 | +Key permissions: |
| 31 | + |
| 32 | +- `activeTab` |
| 33 | +- `scripting` |
| 34 | +- `storage` |
| 35 | +- `host_permissions`: `https://*/*`, `http://*/*` (plus DOSafe and localhost explicit entries) |
| 36 | + |
| 37 | +Current UI actions: |
| 38 | + |
| 39 | +- `Scan`: Send crawled text to DOSafe detector endpoint. |
| 40 | +- `Crawl to server`: Collect page data and POST JSON payload to user-defined crawl endpoint. |
| 41 | +- `Save`: Persist settings in `chrome.storage.local`. |
| 42 | + |
| 43 | +## 3) Data Flows |
| 44 | + |
| 45 | +### 3.1 Scan flow (AI text detection) |
| 46 | + |
| 47 | +1. User opens popup and clicks `Scan`. |
| 48 | +2. Extension executes script in active tab: |
| 49 | + - `selectedText = window.getSelection().toString()` |
| 50 | + - fallback: `document.body.innerText` |
| 51 | +3. Truncate the text to `maxChars` characters (configurable, 500-5000). |
| 52 | +4. POST to detector endpoint (default: `https://dosafe.io/api/detect`) with: |
| 53 | + - `{ text, lang }` |
| 54 | +5. Render response fields: |
| 55 | + - `ai_probability`, `human_probability`, `verdict`, `confidence`, `signals`, `sentence_scores` |
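
The scan steps above can be sketched as follows. This is a minimal illustration, not the code in `popup.js`: the function names `readPageText`, `clampMaxChars`, and `buildDetectBody` are hypothetical, and the fallback of 5000 characters for an unparsable setting is an assumption.

```javascript
// Intended to run inside the page when injected with
// chrome.scripting.executeScript({ target: { tabId }, func: readPageText }).
// Prefers the user's selection and falls back to the visible body text.
function readPageText() {
  const selected = window.getSelection().toString().trim();
  return selected || document.body.innerText;
}

// Keep maxChars inside the documented 500-5000 range.
function clampMaxChars(value) {
  const n = Number.parseInt(value, 10);
  if (Number.isNaN(n)) return 5000; // assumed fallback, not confirmed by the source
  return Math.min(5000, Math.max(500, n));
}

// Build the JSON body for the detector endpoint: { text, lang }.
function buildDetectBody(text, lang, maxChars) {
  return { text: text.slice(0, clampMaxChars(maxChars)), lang };
}
```

The popup would then send `JSON.stringify(buildDetectBody(...))` with `fetch` as a `POST` with a JSON content type and render the response fields listed in step 5.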
| 56 | + |
| 57 | +### 3.2 Crawl flow (page ingestion) |
| 58 | + |
| 59 | +1. User sets `crawlEndpoint` and clicks `Crawl to server`. |
| 60 | +2. Extension executes script in active tab and extracts: |
| 61 | + - URL metadata: `url`, `title`, `lang`, `capturedAt` |
| 62 | + - Content: `selectedText`, `text`, `fullTextLength` |
| 63 | + - SEO/content hints: `metaDescription`, top `h1` headings |
| 64 | + - Commerce hints: `priceCandidates` (regex-based candidate extraction) |
| 65 | +3. POST to crawl endpoint with payload: |
| 66 | + |
| 67 | +```json |
| 68 | +{ |
| 69 | + "source": "dosafe-extension", |
| 70 | + "page": { |
| 71 | + "url": "https://example.com/...", |
| 72 | + "title": "...", |
| 73 | + "lang": "vi", |
| 74 | + "selectedText": "...", |
| 75 | + "text": "...", |
| 76 | + "fullTextLength": 12345, |
| 77 | + "metaDescription": "...", |
| 78 | + "headings": ["..."], |
| 79 | + "priceCandidates": ["153.513₫"], |
| 80 | + "capturedAt": "2026-03-03T..." |
| 81 | + }, |
| 82 | + "detector_hint": { |
| 83 | + "lang": "vi" |
| 84 | + } |
| 85 | +} |
| 86 | +``` |
| 87 | + |
| 88 | +4. Popup shows server response JSON. |
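
A rough sketch of how the payload above could be assembled. The price pattern is an illustrative heuristic, not the regex used by the extension, and `findPriceCandidates` and `buildCrawlPayload` are hypothetical names.

```javascript
// Heuristic price pattern (assumption): grouped digits followed by a VND
// marker, or a dollar amount with optional cents.
const PRICE_PATTERN = /\d{1,3}(?:[.,]\d{3})+\s?(?:₫|VND|đ)|\$\s?\d+(?:\.\d{2})?/g;

// Return unique price-like strings in order of first appearance.
function findPriceCandidates(text, limit = 10) {
  const matches = text.match(PRICE_PATTERN) || [];
  return [...new Set(matches)].slice(0, limit);
}

// `page` is the object returned by the injected extraction script
// (url, title, lang, selectedText, text, fullTextLength, ...).
function buildCrawlPayload(page, lang) {
  return {
    source: "dosafe-extension",
    page: {
      ...page,
      priceCandidates: findPriceCandidates(page.text || ""),
      capturedAt: page.capturedAt || new Date().toISOString(),
    },
    detector_hint: { lang },
  };
}
```

Because the extraction is regex-based, the candidates should be treated as hints that the server re-validates.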
| 89 | + |
| 90 | +## 4) Capability Mapping vs Browser-Agent Tooling |
| 91 | + |
| 92 | +Some AI agents expose tools named `find`, `read_page`, `get_page_text`, `computer`, `javascript_tool`, etc. |
| 93 | +The extension can implement equivalent capabilities as follows: |
| 94 | + |
| 95 | +- `get_page_text` equivalent: |
| 96 | + - `document.body.innerText` (already implemented) |
| 97 | +- `find` equivalent: |
| 98 | + - DOM scanning and keyword/regex matching (`querySelectorAll`, `textContent`, price regex) |
| 99 | +- `read_page` equivalent: |
| 100 | + - Full DOM traversal (can be added with `TreeWalker`) and ARIA/role extraction |
| 101 | +- `javascript_tool` equivalent: |
| 102 | + - Executing page-context JS via `chrome.scripting.executeScript` |
| 103 | +- `computer` (screenshot) equivalent: |
| 104 | + - `chrome.tabs.captureVisibleTab` (not yet implemented) |
| 105 | +- `read_network_requests` equivalent: |
| 106 | + - Requires debugger/webRequest strategy (not yet implemented) |
| 107 | + |
| 108 | +Important distinction: |
| 109 | + |
| 110 | +- Browser agents can run broad automation sessions. |
| 111 | +- A product extension should keep its behavior deterministic, user-triggered, and auditable. |
| 112 | + |
| 113 | +## 5) Security and Privacy Model |
| 114 | + |
| 115 | +User-trigger model: |
| 116 | + |
| 117 | +- Page reading happens when user explicitly clicks `Scan` or `Crawl to server`. |
| 118 | + |
| 119 | +Stored data: |
| 120 | + |
| 121 | +- Stored locally in `chrome.storage.local`: |
| 122 | + - detector endpoint, crawl endpoint, language, max chars |
| 123 | + |
| 124 | +Sensitive data handling: |
| 125 | + |
| 126 | +- The extension currently sends extracted page data only when the user triggers a scan or crawl. |
| 127 | +- No background continuous crawling or passive page exfiltration. |
| 128 | + |
| 129 | +Risk notes: |
| 130 | + |
| 131 | +- The `https://*/*` and `http://*/*` host permissions are broad; they trade least privilege for ingestion flexibility. |
| 132 | +- For production hardening, consider allowlist-based domains and signed request auth. |
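
One possible shape for the allowlist check mentioned above. The host list and the function name `isAllowedEndpoint` are assumptions for illustration; real entries would come from configuration.

```javascript
// Hypothetical allowlist of endpoint hosts.
const ALLOWED_HOSTS = ["dosafe.io", "localhost"];

// Accept an endpoint only if it parses as a URL, uses HTTPS (plain HTTP is
// tolerated for localhost during development), and its host is on the list
// or is a subdomain of a listed host.
function isAllowedEndpoint(endpoint, allowedHosts = ALLOWED_HOSTS) {
  let url;
  try {
    url = new URL(endpoint);
  } catch {
    return false; // not a valid absolute URL
  }
  if (url.protocol !== "https:" && url.hostname !== "localhost") return false;
  return allowedHosts.some(
    (host) => url.hostname === host || url.hostname.endsWith("." + host)
  );
}
```

Running this check before every `POST` keeps a mistyped or malicious endpoint setting from receiving page data.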
| 133 | + |
| 134 | +## 6) Known Constraints |
| 135 | + |
| 136 | +- Cannot access restricted Chrome pages (`chrome://` URLs, Chrome Web Store pages, etc.). |
| 137 | +- Cross-origin iframes may not be fully readable unless frame execution is explicitly handled. |
| 138 | +- Dynamic/canvas-heavy pages may expose limited text through DOM extraction. |
| 139 | +- Regex price extraction is heuristic, not schema-guaranteed. |
| 140 | + |
| 141 | +## 7) Recommended Server Contract (Crawl Endpoint) |
| 142 | + |
| 143 | +Endpoint: |
| 144 | + |
| 145 | +- `POST /api/crawl` (example) |
| 146 | + |
| 147 | +Required behaviors: |
| 148 | + |
| 149 | +- Validate payload size and schema. |
| 150 | +- Attach server-side timestamps and request metadata. |
| 151 | +- Deduplicate by `(url, capturedAt window, content hash)`. |
| 152 | +- Return JSON `{ ok: true, id: "...", receivedAt: "..." }`. |
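
The deduplication rule could look roughly like this on the server. Everything here is a sketch under assumptions: the 10-minute window, the in-memory `Set`, and the FNV-1a hash (a small non-cryptographic stand-in; a production server would more likely use SHA-256 and a persistent store).

```javascript
// FNV-1a 32-bit hash over UTF-16 code units, returned as 8 hex characters.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Key = (url, capturedAt window, content hash). Whitespace and case are
// normalized so trivially different captures of the same page collide.
function dedupKey(page, windowMs = 10 * 60 * 1000) {
  const bucket = Math.floor(Date.parse(page.capturedAt) / windowMs);
  const normalized = (page.text || "").replace(/\s+/g, " ").trim().toLowerCase();
  return `${page.url}|${bucket}|${fnv1a(normalized)}`;
}

const seenKeys = new Set(); // assumption: replaced by a database in practice

function acceptCrawl(page) {
  const key = dedupKey(page);
  if (seenKeys.has(key)) return { ok: true, id: key, duplicate: true };
  seenKeys.add(key);
  return { ok: true, id: key, duplicate: false, receivedAt: new Date().toISOString() };
}
```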
| 153 | + |
| 154 | +Recommended auth: |
| 155 | + |
| 156 | +- `Authorization: Bearer <extension-token>` or HMAC signature. |
| 157 | +- Rotate tokens and enforce rate limits per installation/device. |
| 158 | + |
| 159 | +## 8) Available Backend API Endpoints |
| 160 | + |
| 161 | +The DOSafe backend (`https://dosafe.io`) exposes these APIs that the extension should integrate: |
| 162 | + |
| 163 | +### 8.1 `/api/detect` (AI + Scam Text Detection) — ALREADY INTEGRATED |
| 164 | + |
| 165 | +``` |
| 166 | +POST /api/detect |
| 167 | +Body: { "text": "...", "lang": "vi", "task": "ai_detection" | "scam_detection" } |
| 168 | +Response: { ai_probability, human_probability, verdict, confidence, signals, sentence_scores, source_matches } |
| 169 | +``` |
| 170 | + |
| 171 | +### 8.2 `/api/url-check` (URL/Domain Risk Assessment) — NOT YET INTEGRATED |
| 172 | + |
| 173 | +Check any URL for phishing/malware/scam risks. Combines DB lookup + Google Safe Browsing + WHOIS + on-chain flags. |
| 174 | + |
| 175 | +``` |
| 176 | +POST /api/url-check |
| 177 | +Body: { "url": "https://suspicious-site.com" } |
| 178 | +Response: { |
| 179 | + riskLevel: "safe" | "low" | "medium" | "high" | "critical", |
| 180 | + riskSignals: ["Domain registered 3 days ago", "Found in MetaMask blacklist"], |
| 181 | + checks: { |
| 182 | + trustedDomain: { isTrusted, domain }, |
| 183 | + safeBrowsing: { isSafe, threats }, |
| 184 | + whois: { domainAge, createdDate, registrar }, |
| 185 | + onChain: { flags }, |
| 186 | + threatIntel: { entries, maxRiskScore, sources, categories, cluster } |
| 187 | + } |
| 188 | +} |
| 189 | +``` |
| 190 | + |
| 191 | +**Extension integration idea:** Auto-check the current tab's URL in background when user navigates. Show badge icon (green/yellow/red) based on riskLevel. |
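
A minimal sketch of the badge mapping, assuming the five `riskLevel` values shown in the response above. The colors, badge text, and the name `badgeForRisk` are illustrative choices, not part of the API.

```javascript
// riskLevel -> badge appearance (green / yellow / red).
const BADGE_BY_RISK = new Map([
  ["safe", { text: "", color: "#2e7d32" }],
  ["low", { text: "", color: "#2e7d32" }],
  ["medium", { text: "!", color: "#f9a825" }],
  ["high", { text: "!!", color: "#c62828" }],
  ["critical", { text: "!!", color: "#c62828" }],
]);

// Unknown or missing levels fall back to a neutral grey badge.
function badgeForRisk(riskLevel) {
  return BADGE_BY_RISK.get(riskLevel) || { text: "?", color: "#757575" };
}
```

In a Manifest V3 service worker the result would be applied per tab with `chrome.action.setBadgeText({ tabId, text })` and `chrome.action.setBadgeBackgroundColor({ tabId, color })`.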
| 192 | + |
| 193 | +### 8.3 `/api/entity-check` (Phone/Email/Wallet/Domain/Bank Account Check) — NOT YET INTEGRATED |
| 194 | + |
| 195 | +Check individual entities across threat DB + on-chain. |
| 196 | + |
| 197 | +``` |
| 198 | +POST /api/entity-check |
| 199 | +Body: { "entityType": "phone" | "email" | "wallet" | "domain" | "bank_account" | ..., "entityId": "0912345678" } |
| 200 | +Response: { |
| 201 | + entityType, entityId, |
| 202 | + riskLevel: "safe" | "low" | "medium" | "high" | "critical", |
| 203 | + riskSignals: [...], |
| 204 | + threatIntel: { entries, maxRiskScore, sources, categories, cluster }, |
| 205 | + onChain: { flags } |
| 206 | +} |
| 207 | +``` |
| 208 | + |
| 209 | +**Supported entity types:** phone, email, wallet, url, domain, bank_account, national_id, facebook, telegram, organization |
| 210 | + |
| 211 | +**Extension integration idea:** Detect phone numbers, bank accounts, wallet addresses in page content. Highlight or annotate them with risk indicators. |
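
Entity detection could start from something like the sketch below. The two patterns are assumptions (10-digit numbers starting with `0`, and EVM-style `0x` addresses); bank accounts are left out because no reliable format is given in this document.

```javascript
// Illustrative patterns only; real detection would need per-country rules.
const ENTITY_PATTERNS = [
  { entityType: "phone", pattern: /(?<!\d)0\d{9}(?!\d)/g },
  { entityType: "wallet", pattern: /\b0x[a-fA-F0-9]{40}\b/g },
];

// Return unique { entityType, entityId } pairs, ready to be sent as the
// body of POST /api/entity-check (one request per entity).
function detectEntities(text) {
  const found = new Map();
  for (const { entityType, pattern } of ENTITY_PATTERNS) {
    for (const match of text.matchAll(pattern)) {
      found.set(`${entityType}:${match[0]}`, { entityType, entityId: match[0] });
    }
  }
  return [...found.values()];
}
```

Deduplicating before the network call keeps the number of `/api/entity-check` requests bounded on pages that repeat the same contact details.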
| 212 | + |
| 213 | +### 8.4 `/api/detect-image` (Image AI Detection) — NOT YET INTEGRATED |
| 214 | + |
| 215 | +``` |
| 216 | +POST /api/detect-image |
| 217 | +Body: FormData with image file |
| 218 | +Response: { ai_probability, verdict, confidence, signals } |
| 219 | +``` |
| 220 | + |
| 221 | +## 9) Roadmap |
| 222 | + |
| 223 | +Phase 1 (current — COMPLETE): |
| 224 | + |
| 225 | +- Text extraction + detector scan + server crawl POST from popup. |
| 226 | +- Facebook content script: capture post text + author profile. |
| 227 | +- Side panel UI with AI mode / Scam mode tabs. |
| 228 | + |
| 229 | +Phase 2 (NEXT — real-time protection): |
| 230 | + |
| 231 | +- **URL auto-check:** Call `/api/url-check` on navigation. Show risk badge on extension icon (green/yellow/red). |
| 232 | +- **Entity detection:** Scan page content for phone numbers, bank accounts, wallet addresses. Check via `/api/entity-check`. |
| 233 | +- **Inline annotations:** Highlight detected entities with risk indicators (tooltip with source count, risk level). |
| 234 | +- Add `contextMenus` action: "Check this with DOSafe" for selected text/links. |
| 235 | + |
| 236 | +Phase 3: |
| 237 | + |
| 238 | +- Add structured field extraction profiles (e-commerce/article/forum). |
| 239 | +- Add optional screenshot capture for evidence snapshots. |
| 240 | +- Add network-aware mode (capture API responses used by page). |
| 241 | +- Add frame-aware crawling and pagination helpers. |
| 242 | +- Add retry queue/offline sync (background service worker + local queue). |
| 243 | + |
| 244 | +## 9) Testing Checklist |
| 245 | + |
| 246 | +Manual: |
| 247 | + |
| 248 | +- Load unpacked extension from `apps/extension`. |
| 249 | +- Scan selected text (`>= 50 chars`) on `https://dosafe.io` and a third-party page. |
| 250 | +- Crawl to test ingestion endpoint and verify payload integrity. |
| 251 | + |
| 252 | +Technical: |
| 253 | + |
| 254 | +- `manifest.json` parses correctly. |
| 255 | +- `popup.js` passes syntax check. |
| 256 | +- CORS and endpoint auth validated on ingestion server. |
| 257 | + |
| 258 | +## 10) Claude-Style Tactics and Tools (Captured Notes) |
| 259 | + |
| 260 | +The following reflects the workflow style described by Claude-like browser agents and how to apply it in DOSafe extension development: |
| 261 | + |
| 262 | +- `find` tactic: |
| 263 | + - First locate target elements semantically (price, title, seller, rating) instead of fixed selectors. |
| 264 | + - In extension implementation, emulate with keyword scoring over DOM text blocks and selector fallbacks. |
| 265 | +- `read_page` tactic: |
| 266 | + - Read broad page structure first (headings, regions, forms, interactive elements), then extract targeted fields. |
| 267 | + - In extension implementation, add structured DOM traversal mode (TreeWalker + role/aria map). |
| 268 | +- `get_page_text` tactic: |
| 269 | + - Extract full text body for broad understanding; then do targeted extraction. |
| 270 | + - Already partially implemented via `document.body.innerText`. |
| 271 | +- `javascript_tool` tactic: |
| 272 | + - Execute in-page JS for high-fidelity extraction from dynamic apps (React/Vue state, globals, inline JSON). |
| 273 | + - In extension implementation, use `chrome.scripting.executeScript`. |
| 274 | +- `computer` interaction tactic: |
| 275 | + - Use click/scroll/type only when DOM extraction is insufficient. |
| 276 | + - For extension product, keep interaction optional and user-triggered for predictability. |
| 277 | +- `read_network_requests` tactic: |
| 278 | + - Observe XHR/fetch responses to get clean structured data directly from site APIs. |
| 279 | + - Recommended for Phase 3 with debugger/webRequest-backed mode. |
| 280 | +- `tabs` tactic: |
| 281 | + - Run extraction on multiple tabs in parallel for throughput. |
| 282 | + - Extension roadmap: queue jobs per tab and bounded concurrency. |
| 283 | +- `screenshot` tactic: |
| 284 | + - Capture evidence when text/DOM is incomplete or disputed. |
| 285 | + - Extension roadmap: `chrome.tabs.captureVisibleTab`. |
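
As an illustration of the `find` tactic above, keyword scoring over DOM text blocks might look like this. The hint weights and the `{ selector, text }` block shape are hypothetical; the blocks would be collected by an injected script.

```javascript
// Hypothetical hints for locating a "price" field.
const PRICE_HINTS = [
  { pattern: /price|giá/i, weight: 2 },
  { pattern: /₫|VND|\$/, weight: 3 },
  { pattern: /\d/, weight: 1 },
];

// Score every block by the hints it matches and return the best one,
// or null when nothing matches (caller falls back to fixed selectors).
function findBestBlock(blocks, hints) {
  let best = null;
  for (const block of blocks) {
    const score = hints.reduce(
      (sum, hint) => sum + (hint.pattern.test(block.text) ? hint.weight : 0),
      0
    );
    if (score > 0 && (!best || score > best.score)) best = { ...block, score };
  }
  return best;
}
```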
| 286 | + |
| 287 | +Execution heuristics from Claude-style operation: |
| 288 | + |
| 289 | +- Read first, act second: snapshot structure before interacting. |
| 290 | +- Prefer stable references (DOM path/semantic ref) over pixel coordinates. |
| 291 | +- Combine DOM signals + network signals for highest accuracy. |
| 292 | +- Run tasks in parallel where independent (multiple tabs/pages). |
| 293 | +- Keep extraction deterministic; use UI automation only as fallback. |
| 294 | + |
| 295 | +## 11) Additional Research-Backed Techniques for DOSafe Crawler |
| 296 | + |
| 297 | +These are practical techniques researched and recommended for robust production crawling via extension: |
| 298 | + |
| 299 | +- Multi-layer extraction pipeline: |
| 300 | + - Layer 1: selected text. |
| 301 | + - Layer 2: visible body text. |
| 302 | + - Layer 3: structured fields (title, meta, headings, price candidates). |
| 303 | + - Layer 4: API/network payload capture (optional advanced mode). |
| 304 | +- E-commerce specific parsing: |
| 305 | + - Normalize localized prices (`153.513₫`, `153,513 VND`, `$12.99`). |
| 306 | + - Extract currency, numeric value, and confidence per candidate. |
| 307 | +- Content deduplication: |
| 308 | + - Hash normalized text and key fields before upload. |
| 309 | + - Skip duplicate sends within a short time window per URL. |
| 310 | +- Quality scoring: |
| 311 | + - Attach `extraction_confidence` from field completeness + text length + source type. |
| 312 | +- Anti-fragile parsing: |
| 313 | + - Use multiple selectors and regex routes; avoid single brittle selectors. |
| 314 | + - Gracefully degrade to text-only payload when structure fails. |
| 315 | +- Privacy-safe default: |
| 316 | + - Redact obvious PII patterns before upload (email, phone, card-like numbers). |
| 317 | + - Add optional domain allowlist mode for enterprise deployments. |
| 318 | +- Reliability: |
| 319 | + - Add retry with exponential backoff for crawl uploads. |
| 320 | + - Add local queue when offline, flush later from background worker. |
| 321 | +- Observability: |
| 322 | + - Include `capture_id`, timing metrics, and parser version in payload. |
| 323 | + - Keep server-side audit trail for each crawl event. |
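
For the e-commerce parsing item above, price normalization could be sketched as follows. The separator rules are assumptions: VND amounts are treated as whole numbers, and for other currencies a comma is read as a thousands separator and a dot as the decimal point.

```javascript
// Turn a raw candidate such as "153.513₫" into { currency, value, raw },
// or null when no number can be recovered.
function normalizePrice(raw) {
  const s = raw.trim();
  let currency = null;
  if (/₫|VND|đ/i.test(s)) currency = "VND";
  else if (/\$|USD/i.test(s)) currency = "USD";

  const digits = s.replace(/[^\d.,]/g, "");
  if (!/\d/.test(digits)) return null;

  const value =
    currency === "VND"
      ? Number(digits.replace(/[.,]/g, "")) // both separators group thousands
      : Number(digits.replace(/,/g, "")); // comma groups, dot is decimal
  return Number.isFinite(value) ? { currency, value, raw: s } : null;
}
```

A confidence score per candidate, as suggested in the list, could be layered on top by checking whether a currency was recognized at all.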
| 324 | + |
| 325 | +Recommended next implementation slice: |
| 326 | + |
| 327 | +- Add parser profiles: `generic`, `ecommerce`, `article`, `social`. |
| 328 | +- Add payload schema versioning (`schema_version: 1`). |
| 329 | +- Add authenticated ingestion (Bearer/HMAC) and replay protection. |
| 330 | + |
| 331 | +## 12) Facebook Profile Crawl Limits and Browser-Agent Direction |
| 332 | + |
| 333 | +Current verified behavior in `apps/extension`: |
| 334 | + |
| 335 | +- `content-facebook.js` binds actions only to visible Facebook `article` nodes on the current page. |
| 336 | +- `Profile Check` sends `authorName`, `profileUrl`, and `postText` to the background worker. |
| 337 | +- `background.js` opens the main `profileUrl` in a new tab and extracts only: |
| 338 | + - `document.title` |
| 339 | + - first `h1` |
| 340 | + - `meta[name="description"]` |
| 341 | + - current `document.body.innerText` |
| 342 | +- It does not automatically click into About, Friends, Photos, or Posts tabs. |
| 343 | +- It does not run a multi-step observe -> decide -> act loop. |
| 344 | + |
| 345 | +Implication: |
| 346 | + |
| 347 | +- Current profile scraping is a single-page capture, not a full browser agent. |
| 348 | +- Data that is hidden behind profile tabs, lazy-loading, or interaction gates is not collected. |
| 349 | +- This is acceptable for lightweight context capture, but not sufficient for robust scam-profile investigation. |
| 350 | + |
| 351 | +Recommended browser-agent pattern for DOSafe: |
| 352 | + |
| 353 | +- Use the model as a planner only. In the current stack, this means Qwen3.5 emits structured actions. |
| 354 | +- Use the extension runtime as executor via: |
| 355 | + - `chrome.tabs` |
| 356 | + - `chrome.scripting.executeScript` |
| 357 | + - `chrome.storage` |
| 358 | + - `tabs.onUpdated` / wait logic |
| 359 | +- Keep a tight loop: |
| 360 | + 1. Observe current page state |
| 361 | + 2. Model returns next allowed action |
| 362 | + 3. Runtime executes action |
| 363 | + 4. Runtime returns compact observation |
| 364 | +- Restrict actions by whitelist and block destructive actions by default. |
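
The loop and the action whitelist could be sketched as below. The action names, the step limit, and the `observe` / `plan` / `execute` callbacks are all hypothetical; in the extension they would wrap `chrome.tabs`, `chrome.scripting.executeScript`, and the model call.

```javascript
// Hypothetical action vocabulary; anything else is rejected by default.
const ALLOWED_ACTIONS = new Set(["open_tab", "read_text", "click_tab", "finish"]);
const MAX_STEPS = 8; // assumed bound on loop iterations

function validateAction(action) {
  if (!action || typeof action.type !== "string") {
    return { ok: false, reason: "malformed action" };
  }
  if (!ALLOWED_ACTIONS.has(action.type)) {
    return { ok: false, reason: `blocked action: ${action.type}` };
  }
  return { ok: true };
}

// Observe -> plan -> validate -> execute, stopping on "finish", on a
// rejected action, or after maxSteps iterations.
async function runLoop({ observe, plan, execute }, maxSteps = MAX_STEPS) {
  const trace = [];
  for (let step = 0; step < maxSteps; step++) {
    const observation = await observe();
    const action = await plan(observation);
    const verdict = validateAction(action);
    if (!verdict.ok) {
      trace.push({ step, rejected: verdict.reason });
      break;
    }
    if (action.type === "finish") {
      trace.push({ step, action });
      break;
    }
    trace.push({ step, action, result: await execute(action) });
  }
  return trace;
}
```

Validating in the runtime rather than trusting the model output keeps the planner from triggering anything outside the allowed set, and the returned trace doubles as an audit log.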
| 365 | + |
| 366 | +Suggested next extension scope: |
| 367 | + |
| 368 | +- Add Facebook profile deep crawl sequence: |
| 369 | + - main profile |
| 370 | + - About |
| 371 | + - Posts |
| 372 | + - Photos |
| 373 | +- Add bounded step execution and timeout guards. |
| 374 | +- Return structured observations with: |
| 375 | + - current URL |
| 376 | + - visible tabs |
| 377 | + - extracted text summary |
| 378 | + - entities found (`phones`, `links`, `bank_accounts`, `domains`) |