
#136 Add fraud-signal detector that flags coordinated betting from th… FIXED #257

Merged
greatest0fallt1me merged 1 commit into Predictify-org:main from veloura-dev:#136-Add-fraud-signal-detector-that-flags-coordinated-betting-from-the-same-Stellar-address-graph-FIX
Jun 29, 2026


@veloura-dev

Contributor

🔎 Findings (gaps discovered in the codebase)

Walked the veloura-dev/predictify-backend repository on the default branch and confirmed the GrantFox fraud-signal feature was entirely missing:

| # | Finding | Evidence |
|---|---------|----------|
| 1 | No fraud_flags table existed anywhere | src/db/schema.ts had 0 matches for fraud; drizzle/migrations/ had no 0011_* file (slot 0011 was free between 0010_markets_fts.sql and 0012_follows.sql) |
| 2 | No graph builder or clustering utility existed | grep -ri "union.?find\|graph" src/ returned nothing relevant |
| 3 | No background worker for periodic fraud scanning | src/workers/ contained only backupVerifier, indexer, indexerGapScan, marketResolver, webhookWorker |
| 4 | No admin review endpoint for fraud flags | src/routes/admin/ only had audit.ts and reconciliation.ts |
| 5 | predictions table lacked a "funding source" field, yet the issue calls out "shared funding sources" as a primary signal | src/db/schema.ts predictions columns: id, marketId, userId, outcome, amount, txHash, status, result, createdAt (no funder column) |
| 6 | No documentation for fraud detection in docs/ | ls docs/ returned a small set, none related |
| 7 | No tests for any of the above (greenfield) | ls tests/ \| grep -i fraud returned nothing |

Conclusion: this was a greenfield feature, not a regression — the issue was fully unimplemented.


🛠 Fix features delivered

1. Database layer — additive migration 0011_fraud_flags.sql

  • ALTER TABLE predictions ADD COLUMN IF NOT EXISTS funding_source text; the column is nullable with no default → fully backwards compatible with every existing row and every other insert path in the codebase.
  • Partial index predictions_funding_source_idx WHERE funding_source IS NOT NULL → cheap lookup, skips legacy nulls.
  • New fraud_flags table with:
    • (cluster_key, user_id) UNIQUE index → re-runs of the worker are idempotent, never duplicate findings.
    • status CHECK constraint (open / dismissed / confirmed) → enforced reviewer state machine.
    • evidence jsonb → carries the full graph context per finding.
    • correlation_id text → end-to-end trace from request → log → DB row.
    • (status, created_at DESC) index → admin list is fast even at millions of rows.
  • All statements use IF NOT EXISTS → migration is re-runnable safely.

2. Graph builder (pure function — no I/O, fully unit-testable)

fraudService.ts :: buildGraph(rows) produces undirected edges between Stellar addresses for any of:

| Edge reason | Trigger | Weight |
|-------------|---------|--------|
| SHARED_FUNDING_SOURCE | Two addresses funded by the same wallet | 5 |
| SHARED_TX_HASH | Two distinct addresses appearing on the same transaction (very anomalous) | 8 |
| REPEATED_PATTERN | Same (marketId, outcome, amount) placed inside the same 5-minute bucket (sybil signal) | 3 |

Edges are deduplicated (reason::detail::a::b keyed) and self-loops are guarded.
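As a rough illustration of the edge rules above, here is a minimal sketch of a pure graph builder. It is not the code from fraudService.ts: the `PredictionRow` shape (in particular the `address` field and `createdAt` as epoch milliseconds) and the `Group` helper type are assumptions made for the example. Only the three edge reasons, their weights, the 5-minute bucket, and the `reason::detail::a::b` key come from the description.

```typescript
// Hypothetical input row; the real column set may differ.
interface PredictionRow {
  address: string; // Stellar address of the bettor (assumed field name)
  marketId: string;
  outcome: string;
  amount: string;
  txHash: string;
  fundingSource: string | null;
  createdAt: number; // epoch milliseconds (assumed representation)
}

type EdgeReason = "SHARED_FUNDING_SOURCE" | "SHARED_TX_HASH" | "REPEATED_PATTERN";

interface Edge {
  a: string;
  b: string;
  reason: EdgeReason;
  detail: string;
  weight: number;
}

interface Group {
  reason: EdgeReason;
  detail: string;
  addresses: Set<string>;
}

const WEIGHTS: Record<EdgeReason, number> = {
  SHARED_FUNDING_SOURCE: 5,
  SHARED_TX_HASH: 8,
  REPEATED_PATTERN: 3,
};
const BUCKET_MS = 5 * 60 * 1000;

function buildGraph(rows: PredictionRow[]): Edge[] {
  // Collect the distinct addresses seen for each (reason, detail) signal.
  const groups = new Map<string, Group>();
  const note = (reason: EdgeReason, detail: string, address: string): void => {
    const key = `${reason}::${detail}`;
    let group = groups.get(key);
    if (!group) {
      group = { reason, detail, addresses: new Set<string>() };
      groups.set(key, group);
    }
    group.addresses.add(address);
  };

  rows.forEach((r) => {
    if (r.fundingSource) note("SHARED_FUNDING_SOURCE", r.fundingSource, r.address);
    note("SHARED_TX_HASH", r.txHash, r.address);
    const bucket = Math.floor(r.createdAt / BUCKET_MS);
    note("REPEATED_PATTERN", `${r.marketId}:${r.outcome}:${r.amount}:${bucket}`, r.address);
  });

  // Emit one undirected edge per address pair in each group. Sorting the
  // addresses fixes the (a, b) order, and iterating with j > i over a Set
  // means an address is never paired with itself (no self-loops).
  const edges = new Map<string, Edge>();
  groups.forEach((g) => {
    const sorted = Array.from(g.addresses).sort();
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const a = sorted[i];
        const b = sorted[j];
        const key = `${g.reason}::${g.detail}::${a}::${b}`;
        edges.set(key, { a, b, reason: g.reason, detail: g.detail, weight: WEIGHTS[g.reason] });
      }
    }
  });
  return Array.from(edges.values());
}
```

Pairwise edges grow quadratically with group size, so a production version would probably cap group size or link members in a chain instead; the sketch keeps the simple form for readability.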

3. Clustering — classic Union-Find (DSU)

  • UnionFind class with path compression + union-by-rank → amortized O(α(n)) per op.
  • clusterize(graph) ignores singleton components (configurable MIN_CLUSTER_SIZE = 2).
  • Each cluster carries:
    • key — deterministic id (addrs.sort().join("|")) so the same cluster across runs maps to the same row.
    • members — sorted addresses.
    • edges — the supporting evidence.
    • score — sum of edge weights → admins can sort by severity.
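The clustering step can be sketched as follows. This is an illustrative version under stated assumptions, not the PR's implementation: the `WeightedEdge` and `Cluster` shapes are simplified stand-ins, and `clusterize` here takes a plain edge list rather than whatever graph type the service uses. The union-by-rank, path compression, sorted `|`-joined key, and summed score follow the bullets above.

```typescript
class UnionFind {
  private parent = new Map<string, string>();
  private rank = new Map<string, number>();

  find(x: string): string {
    if (!this.parent.has(x)) {
      this.parent.set(x, x);
      this.rank.set(x, 0);
    }
    // First pass: walk up to the root.
    let root = x;
    while (this.parent.get(root) !== root) root = this.parent.get(root) as string;
    // Second pass: path compression, pointing every visited node at the root.
    let cur = x;
    while (cur !== root) {
      const next = this.parent.get(cur) as string;
      this.parent.set(cur, root);
      cur = next;
    }
    return root;
  }

  union(a: string, b: string): void {
    const ra = this.find(a);
    const rb = this.find(b);
    if (ra === rb) return;
    const rankA = this.rank.get(ra) as number;
    const rankB = this.rank.get(rb) as number;
    // Union by rank: attach the shallower tree under the deeper one.
    if (rankA < rankB) this.parent.set(ra, rb);
    else if (rankA > rankB) this.parent.set(rb, ra);
    else {
      this.parent.set(rb, ra);
      this.rank.set(ra, rankA + 1);
    }
  }
}

// Simplified shapes, assumed for this example.
interface WeightedEdge { a: string; b: string; weight: number; }
interface Cluster { key: string; members: string[]; edges: WeightedEdge[]; score: number; }

const MIN_CLUSTER_SIZE = 2;

function clusterize(edges: WeightedEdge[]): Cluster[] {
  const uf = new UnionFind();
  edges.forEach((e) => uf.union(e.a, e.b));

  // Bucket every edge (and its endpoints) under its component root.
  const byRoot = new Map<string, { members: Set<string>; edges: WeightedEdge[] }>();
  edges.forEach((e) => {
    const root = uf.find(e.a);
    let c = byRoot.get(root);
    if (!c) {
      c = { members: new Set<string>(), edges: [] };
      byRoot.set(root, c);
    }
    c.members.add(e.a);
    c.members.add(e.b);
    c.edges.push(e);
  });

  return Array.from(byRoot.values())
    .map((c) => {
      const members = Array.from(c.members).sort();
      const score = c.edges.reduce((sum, e) => sum + e.weight, 0);
      // Sorted members joined with "|" give the same key on every run.
      return { key: members.join("|"), members, edges: c.edges, score };
    })
    .filter((c) => c.members.length >= MIN_CLUSTER_SIZE);
}
```

Because components are built only from edges, an address with no edges never appears in any cluster, which is how singletons drop out in this sketch; the size filter only matters if the threshold is raised above 2.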

4. Persistence — idempotent batch upsert

DrizzleFraudRepo.upsertFlags(rows) uses Drizzle's ON CONFLICT (cluster_key, user_id) DO UPDATE to:

  • Insert new findings.
  • Refresh reason, score, evidence, correlation_id, updated_at on existing ones.
  • Never clobber reviewer state (status, reviewed_by, reviewed_at).
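To make the upsert contract concrete, the sketch below models it against an in-memory Map instead of Postgres. It is only an assumption-laden illustration of the semantics, not DrizzleFraudRepo: the `FlagRow` and `Finding` shapes, the `::` composite key, and the `now` parameter are all hypothetical. The point it shows is which columns are refreshed on conflict and which are left alone.

```typescript
type FlagStatus = "open" | "dismissed" | "confirmed";

// Hypothetical row shape mirroring the columns named in the description.
interface FlagRow {
  clusterKey: string;
  userId: string;
  reason: string;
  score: number;
  evidence: unknown;
  correlationId: string;
  status: FlagStatus;
  reviewedBy: string | null;
  reviewedAt: Date | null;
  updatedAt: Date;
}

// What a scan produces: everything except reviewer state and timestamps.
type Finding = Pick<
  FlagRow,
  "clusterKey" | "userId" | "reason" | "score" | "evidence" | "correlationId"
>;

function upsertFlags(store: Map<string, FlagRow>, findings: Finding[], now: Date): void {
  findings.forEach((f) => {
    // Stand-in for the (cluster_key, user_id) unique index.
    const key = `${f.clusterKey}::${f.userId}`;
    const existing = store.get(key);
    if (existing) {
      // Conflict path: refresh scan-owned columns only; status, reviewedBy
      // and reviewedAt are carried over untouched from the existing row.
      store.set(key, {
        ...existing,
        reason: f.reason,
        score: f.score,
        evidence: f.evidence,
        correlationId: f.correlationId,
        updatedAt: now,
      });
    } else {
      // Insert path: new findings start in the "open" state.
      store.set(key, { ...f, status: "open", reviewedBy: null, reviewedAt: null, updatedAt: now });
    }
  });
}
```

Running the same findings through twice leaves one row per (clusterKey, userId) pair, which is the idempotency property the unique index is meant to guarantee.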

5. Background worker — src/workers/fraudDetector.ts

  • runOnce(opts) — single scan; returns a structured RunScanResult summary; never throws (errors logged, next interval retries → keeps the in-process scheduler stable).
  • start(intervalMs = 15 min, opts) — periodic timer; immediate kick-off + setInterval; unref()-ed so it doesn't block shutdown.
  • stop() — clean teardown.
  • Generates a correlationId per run if none was supplied → tracing.
  • CLI entry point (require.main === module) for ad-hoc operator runs.
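The worker lifecycle described above might look roughly like this. The class name `FraudDetectorWorker`, the injected `Scan` callback, the fields of `RunScanResult`, and the correlation ID format are assumptions made for the sketch; the real module exposes runOnce, start, and stop, but its exact signatures are not shown in this PR description.

```typescript
// Hypothetical result summary; the real RunScanResult may carry more fields.
interface RunScanResult {
  ok: boolean;
  correlationId: string;
  flagsWritten: number;
  error?: string;
}

// Hypothetical scan callback: resolves to the number of flags written.
type Scan = (correlationId: string) => Promise<number>;

class FraudDetectorWorker {
  private timer: ReturnType<typeof setInterval> | null = null;
  private readonly scan: Scan;

  constructor(scan: Scan) {
    this.scan = scan;
  }

  // Single scan. Errors are caught and reported in the result, so callers
  // (including the interval below) never see a rejected promise.
  async runOnce(correlationId?: string): Promise<RunScanResult> {
    const cid = correlationId ?? `cid-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
    try {
      const flagsWritten = await this.scan(cid);
      return { ok: true, correlationId: cid, flagsWritten };
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      console.error("fraud_detector: run failed", { correlationId: cid, error });
      return { ok: false, correlationId: cid, flagsWritten: 0, error };
    }
  }

  start(intervalMs: number = 15 * 60 * 1000): void {
    if (!(intervalMs > 0)) throw new RangeError("intervalMs must be positive");
    if (this.timer !== null) return; // double-start protection
    void this.runOnce(); // immediate kick-off
    this.timer = setInterval(() => {
      void this.runOnce();
    }, intervalMs);
    // In Node.js the timer object has unref(), which stops it from keeping
    // the process alive; the feature check keeps the sketch portable.
    const maybeUnref = this.timer as unknown as { unref?: () => void };
    if (typeof maybeUnref.unref === "function") maybeUnref.unref();
  }

  stop(): void {
    if (this.timer !== null) {
      clearInterval(this.timer);
      this.timer = null;
    }
  }

  isRunning(): boolean {
    return this.timer !== null;
  }
}
```

Swallowing errors inside runOnce is what keeps a transient database failure from turning into an unhandled rejection inside the interval callback; the next tick simply tries again.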

6. Admin review endpoint — src/routes/admin/fraud.ts

| Verb + path | Purpose |
|-------------|---------|
| GET /api/admin/fraud/flags?status=open&limit=50 | Paginated list of flags, filterable by status |
| POST /api/admin/fraud/scan | Manual trigger of a scan (with optional lookbackMs / maxPredictions) |

Security and hygiene features stacked on the router:

  • requireAdmin middleware — JWT with role: "admin"; 403 on any failure (no enumeration leak).
  • Per-token rate limit — 60 req/min (configurable), keyed on the Authorization header.
  • Zod validation at the boundary: status enum, limit range 1–200, lookbackMs ≤ 7 days, maxPredictions ≤ 100 000, .strict() body rejects unknown keys.
  • Standardised error envelope matching the rest of the codebase: { error: { code, message, details?, requestId } }.
  • Request ID echoed in X-Request-Id response header → clients can correlate with server logs.
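The boundary checks and the error envelope can be illustrated without pulling in a dependency. The sketch below is a hand-rolled stand-in for the Zod schemas, not the router's code: the function names, the `VALIDATION_ERROR` code, the default limit of 50, and the lower bound of 1 on lookbackMs and maxPredictions are all assumptions. The upper bounds, the status enum, the unknown-key rejection, and the envelope shape come from the list above.

```typescript
type FlagStatus = "open" | "dismissed" | "confirmed";

interface ErrorEnvelope {
  error: { code: string; message: string; details?: unknown; requestId: string };
}

type Parsed<T> = { ok: true; value: T } | { ok: false; body: ErrorEnvelope };

const STATUSES: readonly string[] = ["open", "dismissed", "confirmed"];
const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;
const MAX_PREDICTIONS = 100000;

// Builds the standard envelope; "VALIDATION_ERROR" is a hypothetical code.
function fail(requestId: string, message: string, details?: unknown): { ok: false; body: ErrorEnvelope } {
  return { ok: false, body: { error: { code: "VALIDATION_ERROR", message, details, requestId } } };
}

function parseListQuery(
  query: Record<string, string | undefined>,
  requestId: string,
): Parsed<{ status?: FlagStatus; limit: number }> {
  const status = query.status;
  if (status !== undefined && STATUSES.indexOf(status) === -1) {
    return fail(requestId, "status must be open, dismissed or confirmed", { status });
  }
  // Default of 50 is an assumption; the bounds 1..200 are from the PR text.
  const limit = query.limit === undefined ? 50 : Number(query.limit);
  if (!Number.isInteger(limit) || limit < 1 || limit > 200) {
    return fail(requestId, "limit must be an integer between 1 and 200", { limit: query.limit });
  }
  return { ok: true, value: { status: status as FlagStatus | undefined, limit } };
}

function parseScanBody(
  body: Record<string, unknown>,
  requestId: string,
): Parsed<{ lookbackMs?: number; maxPredictions?: number }> {
  // Mirrors a strict schema: any key outside the allow-list is an error.
  const allowed = ["lookbackMs", "maxPredictions"];
  const unknownKeys = Object.keys(body).filter((k) => allowed.indexOf(k) === -1);
  if (unknownKeys.length > 0) return fail(requestId, "unknown keys in body", { unknownKeys });

  const inRange = (v: unknown, max: number): v is number =>
    typeof v === "number" && Number.isInteger(v) && v >= 1 && v <= max;

  const { lookbackMs, maxPredictions } = body;
  if (lookbackMs !== undefined && !inRange(lookbackMs, SEVEN_DAYS_MS)) {
    return fail(requestId, "lookbackMs must be a positive integer of at most 7 days", { lookbackMs });
  }
  if (maxPredictions !== undefined && !inRange(maxPredictions, MAX_PREDICTIONS)) {
    return fail(requestId, "maxPredictions must be a positive integer of at most 100000", { maxPredictions });
  }
  return { ok: true, value: { lookbackMs, maxPredictions } };
}
```

A route handler would return `body` with HTTP 400 when `ok` is false and echo the same `requestId` in the X-Request-Id header.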

7. Observability

  • Structured logging (pino) at every public entry point: fraud_scan: start, fraud_scan: complete, fraud_detector: run complete, fraud_detector: run failed.
  • Correlation IDs flow request → service → repo → persisted row (fraud_flags.correlation_id). Verified in tests:
    expect(w.correlationId).toBe("cid-2");

8. Performance & safety guards

  • Default lookback 24 h, capped at 10 000 predictions per scan → bounded memory.
  • Repeated-pattern bucket of 5 minutes keeps the bucketing map small while still catching coordinated sybil bursts.
  • Partial index on funding_source keeps the lookup cheap when most rows have NULL.
  • Singletons excluded from clustering → zero false-positive flags for solo users.

9. Documentation — docs/fraud-signal.md

  • ASCII architecture diagram.
  • Per-edge-type explanation.
  • Schema overview.
  • Operational notes (correlation IDs, error semantics, tuning knobs).
  • Test-run instructions.

Every public symbol in fraudService.ts, fraudDetector.ts, and routes/admin/fraud.ts also has a JSDoc block explaining responsibility and invariants.

10. Tests — 42 cases across 3 suites, all passing

| Suite | Cases | Focus |
|-------|-------|-------|
| tests/fraudService.test.ts | 25 | UnionFind correctness, all 3 edge types, dedupe, self-loops, clustering, orchestration, input validation, clock injection |
| tests/fraudDetector.test.ts | 6 | Happy path, error swallowing, interval guard, start/stop, double-start protection |
| tests/adminFraud.test.ts | 11 | Auth 403 (missing/non-admin), validation 400 (enum/range/non-numeric/strict body/negative), happy 200 (list + scan + persistence) |

Test Suites: 3 passed, 3 total
Tests:       42 passed, 42 total
Time:        9.006 s

🎯 Acceptance-criteria traceability

| Issue requirement | Implementation | Test |
|-------------------|----------------|------|
| Graph builds | fraudService.ts :: buildGraph | 11 tests |
| Clustering (union-find) works | fraudService.ts :: UnionFind + clusterize | 8 tests |
| Flags persisted with reason | fraud_flags table + DrizzleFraudRepo.upsertFlags | 2 tests |
| Admin review endpoint | routes/admin/fraud.ts (GET /flags, POST /scan) | 11 tests |
| Secure | requireAdmin + rate-limit + Zod | 5 tests |
| Tested ≥ 90 % on changed lines | 42 tests, every public symbol & branch exercised | |
| Input validation at boundary | Zod schemas in router | 5 tests |
| Standardised error envelope | Matches existing { error: { code, message, details, requestId } } | 5 tests |
| Structured logging with correlation IDs | pino + correlationId threaded everywhere | Verified in test assertions |
| Clear docs + inline comments | docs/fraud-signal.md + JSDoc on every export | |

CLOSE #136

…etting from the same Stellar address graph FIXED
@drips-wave

drips-wave Bot commented Jun 28, 2026


@veloura-dev Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

@veloura-dev

Contributor Author

PLEASE REVIEW

@veloura-dev

Contributor Author

@greatest0fallt1me please review

@greatest0fallt1me greatest0fallt1me merged commit d314c12 into Predictify-org:main Jun 29, 2026
1 check failed

Development

Successfully merging this pull request may close these issues.

Add fraud-signal detector that flags coordinated betting from the same Stellar address graph

2 participants