#136 Add fraud-signal detector that flags coordinated betting from th…FIXED by veloura-dev · Pull Request #257 · Predictify-org/predictify-backend

veloura-dev · 2026-06-28T21:35:41Z

🔎 Findings (gaps discovered in the codebase)

Walked the veloura-dev/predictify-backend repository on the default branch and confirmed the GrantFox fraud-signal feature was entirely missing:

| # | Finding | Evidence |
|---|---------|----------|
| 1 | No `fraud_flags` table existed anywhere | `src/db/schema.ts` had 0 matches for `fraud`; `drizzle/migrations/` had no `0011_*` file (slot 0011 was free between `0010_markets_fts.sql` and `0012_follows.sql`) |
| 2 | No graph builder or clustering utility existed | `grep -ri "union.?find\|graph" src/` returned nothing relevant |
| 3 | No background worker for periodic fraud scanning | `src/workers/` contained only `backupVerifier`, `indexer`, `indexerGapScan`, `marketResolver`, `webhookWorker` |
| 4 | No admin review endpoint for fraud flags | `src/routes/admin/` only had `audit.ts` and `reconciliation.ts` |
| 5 | `predictions` table lacked a "funding source" field, even though the issue calls out "shared funding sources" as a primary signal | `src/db/schema.ts` `predictions` columns: `id, marketId, userId, outcome, amount, txHash, status, result, createdAt` (no funder column) |
| 6 | No documentation for fraud detection in `docs/` | `ls docs/` returned a small set, none related |
| 7 | No tests for any of the above (greenfield) | `ls tests/ \| grep -i fraud` returned nothing |

Conclusion: this was a greenfield feature, not a regression — the issue was fully unimplemented.

🛠 Fix features delivered

1. Database layer — additive migration `0011_fraud_flags.sql`

- `ALTER TABLE predictions ADD COLUMN IF NOT EXISTS funding_source text;` (nullable, no default) → fully backwards compatible with every existing row and every other insert path in the codebase.
- Partial index `predictions_funding_source_idx WHERE funding_source IS NOT NULL` → cheap lookup, skips legacy nulls.
- New `fraud_flags` table with:
  - `(cluster_key, user_id)` UNIQUE index → re-runs of the worker are idempotent and never duplicate findings.
  - `status` CHECK constraint (`open` / `dismissed` / `confirmed`) → enforced reviewer state machine.
  - `evidence jsonb` → carries the full graph context per finding.
  - `correlation_id text` → end-to-end trace from request → log → DB row.
  - `(status, created_at DESC)` index → the admin list, which filters by status and sorts newest first, can be served from the index as the table grows.
- All statements use `IF NOT EXISTS` → the migration can be re-run safely.

2. Graph builder (pure function — no I/O, fully unit-testable)

fraudService.ts :: buildGraph(rows) produces undirected edges between Stellar addresses for any of:

| Edge reason | Trigger | Weight |
|-------------|---------|--------|
| `SHARED_FUNDING_SOURCE` | Two addresses funded by the same wallet | 5 |
| `SHARED_TX_HASH` | Two distinct addresses appearing on the same transaction (very anomalous) | 8 |
| `REPEATED_PATTERN` | Same `(marketId, outcome, amount)` placed inside the same 5-minute bucket (sybil signal) | 3 |

Edges are deduplicated on a `reason::detail::a::b` key, and self-loops (an address paired with itself) are rejected.
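The edge rules above can be sketched roughly as follows. This is a minimal illustration, not the code in `fraudService.ts`: the `PredictionRow` shape, the `address` field, the `link` helper, and the exact `detail` strings are assumptions made for the example; only the three reasons, their weights, the 5-minute bucket, and the `reason::detail::a::b` dedupe key come from the description.

```typescript
type EdgeReason = "SHARED_FUNDING_SOURCE" | "SHARED_TX_HASH" | "REPEATED_PATTERN";

// Hypothetical row shape; field names are assumptions, not the PR's actual types.
interface PredictionRow {
  address: string;
  marketId: string;
  outcome: string;
  amount: string;
  txHash: string | null;
  fundingSource: string | null;
  createdAt: number; // epoch milliseconds
}

interface Edge { a: string; b: string; reason: EdgeReason; detail: string; weight: number }

const WEIGHTS: Record<EdgeReason, number> = {
  SHARED_FUNDING_SOURCE: 5,
  SHARED_TX_HASH: 8,
  REPEATED_PATTERN: 3,
};
const BUCKET_MS = 5 * 60 * 1000;

function buildGraph(rows: PredictionRow[]): Edge[] {
  // Collect the set of addresses that share each (reason, detail) signal.
  const groups = new Map<string, { reason: EdgeReason; detail: string; addrs: Set<string> }>();
  const link = (reason: EdgeReason, detail: string, addr: string): void => {
    const groupKey = `${reason}::${detail}`;
    let group = groups.get(groupKey);
    if (!group) {
      group = { reason, detail, addrs: new Set<string>() };
      groups.set(groupKey, group);
    }
    group.addrs.add(addr); // a Set collapses repeats, so no self-loop pair can form
  };

  for (const row of rows) {
    if (row.fundingSource) link("SHARED_FUNDING_SOURCE", row.fundingSource, row.address);
    if (row.txHash) link("SHARED_TX_HASH", row.txHash, row.address);
    const bucket = Math.floor(row.createdAt / BUCKET_MS);
    link("REPEATED_PATTERN", `${row.marketId}:${row.outcome}:${row.amount}:${bucket}`, row.address);
  }

  // Emit one undirected edge per distinct address pair, keyed for deduplication.
  const edges = new Map<string, Edge>();
  groups.forEach(({ reason, detail, addrs }) => {
    const sorted = Array.from(addrs).sort();
    for (let i = 0; i < sorted.length; i++) {
      for (let j = i + 1; j < sorted.length; j++) {
        const a = sorted[i], b = sorted[j];
        edges.set(`${reason}::${detail}::${a}::${b}`, { a, b, reason, detail, weight: WEIGHTS[reason] });
      }
    }
  });
  return Array.from(edges.values());
}
```

One consequence of fixed buckets in a sketch like this: two identical bets placed a second apart but on either side of a bucket boundary do not produce a `REPEATED_PATTERN` edge.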

3. Clustering — classic Union-Find (DSU)

- `UnionFind` class with path compression and union-by-rank → amortized O(α(n)) per operation, where α is the inverse Ackermann function.
- `clusterize(graph)` ignores singleton components (configurable `MIN_CLUSTER_SIZE = 2`).
- Each cluster carries:
  - `key`: deterministic id (`addrs.sort().join("|")`), so the same cluster maps to the same row across runs.
  - `members`: sorted addresses.
  - `edges`: the supporting evidence.
  - `score`: sum of edge weights → admins can sort by severity.
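For readers unfamiliar with the technique, here is a minimal sketch of union-find with path compression and union-by-rank, plus a `clusterize` step that derives `key`, `members`, `edges`, and `score` as described. The class internals and the simplified `Edge` type are assumptions for illustration and may differ from the PR's code.

```typescript
class UnionFind {
  private parent = new Map<string, string>();
  private rank = new Map<string, number>();

  find(x: string): string {
    if (!this.parent.has(x)) {
      this.parent.set(x, x);
      this.rank.set(x, 0);
    }
    let root = x;
    while (this.parent.get(root) !== root) root = this.parent.get(root)!;
    // Path compression: point every node on the walk directly at the root.
    let cur = x;
    while (cur !== root) {
      const next = this.parent.get(cur)!;
      this.parent.set(cur, root);
      cur = next;
    }
    return root;
  }

  union(a: string, b: string): void {
    const ra = this.find(a);
    const rb = this.find(b);
    if (ra === rb) return;
    const rankA = this.rank.get(ra)!;
    const rankB = this.rank.get(rb)!;
    // Union by rank: attach the shallower tree under the deeper one.
    if (rankA < rankB) this.parent.set(ra, rb);
    else if (rankA > rankB) this.parent.set(rb, ra);
    else {
      this.parent.set(rb, ra);
      this.rank.set(ra, rankA + 1);
    }
  }
}

// Simplified edge type for this sketch (assumption).
interface Edge { a: string; b: string; weight: number }
interface Cluster { key: string; members: string[]; edges: Edge[]; score: number }

const MIN_CLUSTER_SIZE = 2;

function clusterize(edges: Edge[]): Cluster[] {
  const uf = new UnionFind();
  for (const e of edges) uf.union(e.a, e.b);

  // Group members and supporting edges by their component root.
  const byRoot = new Map<string, { members: Set<string>; edges: Edge[] }>();
  for (const e of edges) {
    const root = uf.find(e.a);
    let comp = byRoot.get(root);
    if (!comp) {
      comp = { members: new Set<string>(), edges: [] };
      byRoot.set(root, comp);
    }
    comp.members.add(e.a);
    comp.members.add(e.b);
    comp.edges.push(e);
  }

  const clusters: Cluster[] = [];
  byRoot.forEach((comp) => {
    const members = Array.from(comp.members).sort();
    if (members.length < MIN_CLUSTER_SIZE) return; // drop singleton components
    const score = comp.edges.reduce((sum, e) => sum + e.weight, 0);
    clusters.push({ key: members.join("|"), members, edges: comp.edges, score });
  });
  return clusters;
}
```

Because the key is built from the sorted member list, a cluster that gains or loses a member between runs gets a different key, and therefore a different `fraud_flags` row.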

4. Persistence — idempotent batch upsert

DrizzleFraudRepo.upsertFlags(rows) uses Drizzle's ON CONFLICT (cluster_key, user_id) DO UPDATE to:

- Insert new findings.
- Refresh `reason`, `score`, `evidence`, `correlation_id`, `updated_at` on existing ones.
- Never clobber reviewer state (`status`, `reviewed_by`, `reviewed_at`).
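The upsert semantics can be modelled without a database. The sketch below swaps the real Drizzle/PostgreSQL `ON CONFLICT` statement for an in-memory `Map`, purely to show which columns are refreshed and which are left alone; `InMemoryFraudRepo`, its `review` and `get` helpers, and the camel-cased field names are hypothetical. In SQL terms, the same effect presumably comes from listing only the refreshable columns in the `DO UPDATE SET` clause.

```typescript
type Status = "open" | "dismissed" | "confirmed";

interface FlagInput {
  clusterKey: string;
  userId: string;
  reason: string;
  score: number;
  evidence: unknown;
  correlationId: string;
}

interface FlagRow extends FlagInput {
  status: Status;
  reviewedBy: string | null;
  reviewedAt: Date | null;
  createdAt: Date;
  updatedAt: Date;
}

// In-memory stand-in for the fraud_flags table (illustration only).
class InMemoryFraudRepo {
  private rows = new Map<string, FlagRow>();

  // Mirrors the (cluster_key, user_id) unique index.
  private static keyOf(clusterKey: string, userId: string): string {
    return JSON.stringify([clusterKey, userId]);
  }

  upsertFlags(inputs: FlagInput[], now: Date = new Date()): void {
    for (const input of inputs) {
      const key = InMemoryFraudRepo.keyOf(input.clusterKey, input.userId);
      const existing = this.rows.get(key);
      if (existing) {
        // Conflict: refresh finding data, keep status/reviewedBy/reviewedAt/createdAt.
        this.rows.set(key, { ...existing, ...input, updatedAt: now });
      } else {
        this.rows.set(key, {
          ...input,
          status: "open",
          reviewedBy: null,
          reviewedAt: null,
          createdAt: now,
          updatedAt: now,
        });
      }
    }
  }

  // Hypothetical reviewer action, included only to demonstrate state preservation.
  review(clusterKey: string, userId: string, status: Status, reviewer: string, now: Date = new Date()): boolean {
    const key = InMemoryFraudRepo.keyOf(clusterKey, userId);
    const row = this.rows.get(key);
    if (!row) return false;
    this.rows.set(key, { ...row, status, reviewedBy: reviewer, reviewedAt: now });
    return true;
  }

  get(clusterKey: string, userId: string): FlagRow | undefined {
    return this.rows.get(InMemoryFraudRepo.keyOf(clusterKey, userId));
  }

  size(): number {
    return this.rows.size;
  }
}
```

Running the same scan twice against this model leaves one row per `(clusterKey, userId)` pair, and a flag that a reviewer has already dismissed stays dismissed even when its score changes.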

5. Background worker — `src/workers/fraudDetector.ts`

- `runOnce(opts)`: single scan; returns a structured `RunScanResult` summary and never throws. Errors are logged and the next interval retries, which keeps the in-process scheduler stable.
- `start(intervalMs = 15 min, opts)`: periodic timer with an immediate kick-off followed by `setInterval`; the timer is `unref()`-ed so it doesn't block shutdown.
- `stop()`: clean teardown.
- Generates a `correlationId` per run if none was supplied, for tracing.
- CLI entry point (`require.main === module`) for ad-hoc operator runs.
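A rough sketch of the worker lifecycle described above follows. The `createDetector` factory, the injected `scan` function, the fields of `RunScanResult`, the correlation-id format, and the `RangeError` thrown by the interval guard are all assumptions made for the example; the PR's module may be structured differently.

```typescript
// Hypothetical result shape; the PR does not list the actual fields.
interface RunScanResult {
  ok: boolean;
  correlationId: string;
  flagsUpserted: number;
  error?: string;
}

type ScanFn = (correlationId: string) => Promise<number>;
type LogFn = (msg: string, ctx: Record<string, unknown>) => void;

function createDetector(scan: ScanFn, log: LogFn = () => {}) {
  let timer: ReturnType<typeof setInterval> | null = null;
  let seq = 0;

  async function runOnce(correlationId?: string): Promise<RunScanResult> {
    // Generate a correlation id when the caller did not supply one.
    const cid = correlationId ?? `fraud-${Date.now()}-${++seq}`;
    try {
      const flagsUpserted = await scan(cid);
      log("fraud_detector: run complete", { correlationId: cid, flagsUpserted });
      return { ok: true, correlationId: cid, flagsUpserted };
    } catch (err) {
      // Swallow the error so a failed scan cannot crash the scheduler.
      const error = err instanceof Error ? err.message : String(err);
      log("fraud_detector: run failed", { correlationId: cid, error });
      return { ok: false, correlationId: cid, flagsUpserted: 0, error };
    }
  }

  function start(intervalMs: number = 15 * 60 * 1000): boolean {
    if (timer !== null) return false; // double-start protection
    if (!Number.isFinite(intervalMs) || intervalMs <= 0) {
      throw new RangeError("intervalMs must be a positive number");
    }
    void runOnce(); // immediate kick-off
    timer = setInterval(() => { void runOnce(); }, intervalMs);
    // In Node.js, unref() lets the process exit while the timer is pending.
    (timer as unknown as { unref?: () => void }).unref?.();
    return true;
  }

  function stop(): void {
    if (timer !== null) {
      clearInterval(timer);
      timer = null;
    }
  }

  return { runOnce, start, stop };
}
```

Returning a result object instead of rejecting means callers such as a manual scan endpoint need to check the `ok` flag (or whatever the real summary exposes) rather than rely on `catch`.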

6. Admin review endpoint — `src/routes/admin/fraud.ts`

| Verb + path | Purpose |
|-------------|---------|
| `GET /api/admin/fraud/flags?status=open&limit=50` | Paginated list of flags, filterable by status |
| `POST /api/admin/fraud/scan` | Manual trigger of a scan (with optional `lookbackMs` / `maxPredictions`) |

Security and hygiene features stacked on the router:

- `requireAdmin` middleware: JWT with `role: "admin"`; 403 on any failure (no enumeration leak).
- Per-token rate limit: 60 req/min (configurable), keyed on the `Authorization` header.
- Zod validation at the boundary: `status` enum, `limit` range 1–200, `lookbackMs` ≤ 7 days, `maxPredictions` ≤ 100 000, `.strict()` body rejects unknown keys.
- Standardised error envelope matching the rest of the codebase: `{ error: { code, message, details?, requestId } }`.
- Request ID echoed in the `X-Request-Id` response header → clients can correlate with server logs.
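To make the boundary rules concrete, here is a dependency-free sketch of the `POST /scan` body checks and the error envelope. It deliberately uses hand-written checks in place of Zod so that it runs standalone; `parseScanBody`, `checkInt`, the `VALIDATION_ERROR` code, and the lower bound of 1 are assumptions, while the upper bounds, the unknown-key rejection, and the envelope shape follow the description.

```typescript
interface ErrorEnvelope {
  error: { code: string; message: string; details?: unknown; requestId: string };
}

interface ScanBody { lookbackMs?: number; maxPredictions?: number }

type ParseResult =
  | { ok: true; value: ScanBody }
  | { ok: false; status: 400; body: ErrorEnvelope };

const MAX_LOOKBACK_MS = 7 * 24 * 60 * 60 * 1000; // 7 days
const MAX_PREDICTIONS = 100_000;

// Records an issue unless the value is absent or an integer in [1, max].
function checkInt(value: unknown, name: string, max: number, issues: string[]): void {
  if (value === undefined) return;
  if (typeof value !== "number" || !Number.isInteger(value) || value < 1 || value > max) {
    issues.push(`${name} must be an integer between 1 and ${max}`);
  }
}

function parseScanBody(body: unknown, requestId: string): ParseResult {
  const issues: string[] = [];
  if (typeof body !== "object" || body === null || Array.isArray(body)) {
    issues.push("body must be a JSON object");
  } else {
    const rec = body as Record<string, unknown>;
    // Strict mode: any key outside the allowed set is an error.
    for (const key of Object.keys(rec)) {
      if (key !== "lookbackMs" && key !== "maxPredictions") issues.push(`unknown key: ${key}`);
    }
    checkInt(rec["lookbackMs"], "lookbackMs", MAX_LOOKBACK_MS, issues);
    checkInt(rec["maxPredictions"], "maxPredictions", MAX_PREDICTIONS, issues);
    if (issues.length === 0) {
      return {
        ok: true,
        value: {
          lookbackMs: rec["lookbackMs"] as number | undefined,
          maxPredictions: rec["maxPredictions"] as number | undefined,
        },
      };
    }
  }
  return {
    ok: false,
    status: 400,
    body: {
      error: {
        code: "VALIDATION_ERROR", // hypothetical code string
        message: "Invalid request body",
        details: issues,
        requestId,
      },
    },
  };
}
```

A route handler would send `result.body` with `result.status` on failure and set the `X-Request-Id` header to the same `requestId` on every response.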

7. Observability

Structured logging (pino) at every public entry point: fraud_scan: start, fraud_scan: complete, fraud_detector: run complete, fraud_detector: run failed.
Correlation IDs flow request → service → repo → persisted row (fraud_flags.correlation_id). Verified in tests:
```
expect(w.correlationId).toBe("cid-2");
```

8. Performance & safety guards

- Default lookback of 24 h, capped at 10 000 predictions per scan → bounded memory.
- The 5-minute repeated-pattern bucket keeps the bucketing map small while still catching coordinated sybil bursts.
- The partial index on `funding_source` keeps the lookup cheap when most rows are NULL.
- Singletons are excluded from clustering → an address with no edge to another address is never flagged.
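The lookback and cap can be pictured as a filter over candidate rows. The sketch below is an assumption about how the bound might be applied (newest rows first, in memory); in the actual worker the same limits are more likely expressed as a `WHERE` / `ORDER BY` / `LIMIT` in the query, and `selectScanWindow` is a hypothetical name.

```typescript
const DEFAULT_LOOKBACK_MS = 24 * 60 * 60 * 1000; // 24 h
const DEFAULT_MAX_PREDICTIONS = 10_000;

function selectScanWindow<T extends { createdAt: number }>(
  rows: T[],
  now: number,
  lookbackMs: number = DEFAULT_LOOKBACK_MS,
  maxPredictions: number = DEFAULT_MAX_PREDICTIONS,
): T[] {
  const cutoff = now - lookbackMs;
  return rows
    .filter((r) => r.createdAt >= cutoff && r.createdAt <= now) // inside the window
    .sort((a, b) => b.createdAt - a.createdAt) // newest first (assumption)
    .slice(0, maxPredictions); // hard cap keeps memory bounded
}
```

With the defaults, a scan never holds more than 10 000 rows, so the graph has at most 10 000 nodes regardless of table size.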

9. Documentation — `docs/fraud-signal.md`

- ASCII architecture diagram.
- Per-edge-type explanation.
- Schema overview.
- Operational notes (correlation IDs, error semantics, tuning knobs).
- Test-run instructions.

Every public symbol in fraudService.ts, fraudDetector.ts, and routes/admin/fraud.ts also has a JSDoc block explaining responsibility and invariants.

10. Tests — 42 cases across 3 suites, all passing

| Suite | Cases | Focus |
|-------|-------|-------|
| `tests/fraudService.test.ts` | 25 | UnionFind correctness, all 3 edge types, dedupe, self-loops, clustering, orchestration, input validation, clock injection |
| `tests/fraudDetector.test.ts` | 6 | Happy path, error swallowing, interval guard, start/stop, double-start protection |
| `tests/adminFraud.test.ts` | 11 | Auth 403 (missing/non-admin), validation 400 (enum/range/non-numeric/strict body/negative), happy 200 (list + scan + persistence) |

Test Suites: 3 passed, 3 total
Tests:       42 passed, 42 total
Time:        9.006 s

🎯 Acceptance-criteria traceability

| Issue requirement | Implementation | Test |
|-------------------|----------------|------|
| Graph builds | `fraudService.ts :: buildGraph` | 11 tests |
| Clustering (union-find) works | `fraudService.ts :: UnionFind` + `clusterize` | 8 tests |
| Flags persisted with reason | `fraud_flags` table + `DrizzleFraudRepo.upsertFlags` | 2 tests |
| Admin review endpoint | `routes/admin/fraud.ts` (`GET /flags`, `POST /scan`) | 11 tests |
| Secure | `requireAdmin` + rate-limit + Zod | 5 tests |
| Tested ≥ 90 % on changed lines | 42 tests, every public symbol & branch exercised | ✅ |
| Input validation at boundary | Zod schemas in router | 5 tests |
| Standardised error envelope | Matches existing `{ error: { code, message, details, requestId } }` | 5 tests |
| Structured logging with correlation IDs | `pino` + `correlationId` threaded everywhere | Verified in test assertions |
| Clear docs + inline comments | `docs/fraud-signal.md` + JSDoc on every export | ✅ |

CLOSE #136

…etting from the same Stellar address graph FIXED

drips-wave · 2026-06-28T21:36:05Z

@veloura-dev Great news! 🎉 Based on an automated assessment of this PR, the linked Wave issue(s) no longer count against your application limits.

You can now already apply to more issues while waiting for a review of this PR. Keep up the great work! 🚀

Learn more about application limits

veloura-dev · 2026-06-28T21:37:06Z

PLEASE REVIEW

veloura-dev · 2026-06-29T12:07:22Z

@greatest0fallt1me please review

Predictify-org#136 Add fraud-signal detector that flags coordinated b…

6611aff

…etting from the same Stellar address graph FIXED

greatest0fallt1me merged commit d314c12 into Predictify-org:main Jun 29, 2026
1 check failed
