2026-06-29 build, Unity 6 (6000.0.60f1): 64,773 classes, 569,859 methods, 188,384 fields

GameAssembly.dll (222 MB) | IL2CPP | Unity 6 6000.0.60f1 | Beebyte Obfuscation

🏁 Project complete / archived (2026-07-01): no longer maintained. Final state frozen below. See BASELINE.md.
| Metric | Count | Coverage |
|---|---|---|
| Classes (semantic) | 5,646 / 9,928 obfuscated | 56.9% semantic class names |
| Methods (semantic) | 533,436 / 569,859 | 93.5% semantic |
| Methods (hash remaining) | 36,423 | 6.5% fallback (m_XXX) |
| Fields (semantic) | 160,256 / 188,384 | 85.1% semantic |
| cross_version entries | 40,223 | reusable across builds |
| Pipeline runtime | ~30s full run | — |
Final result: structural ceiling reached (2026-07-01). All three axes are at their structural limits: methods 93.5% and fields 85.1% are effectively saturated; classes plateau at 56.9%. This is not effort-limited; it is a proven limit. Four independent type-signal approaches (field-type, parent, interface, method-return-type) each failed the decisive gate2 test (~9% recovery accuracy, far below the 15% acceptance line). The reason is that a class stands in a HAS-A relationship to its field types, so type signals reveal what a class holds, not what role it plays. Metadata decryption was also confirmed a dead end: Beebyte destroys the original class names at compile time (they are stored as ÌÍÎÏ garbage in the metadata itself), so decrypting the structure tables yields no new real names. The remaining unexhausted signal is runtime instance values (strings/JSON keys), which static analysis cannot capture; it was scoped but not pursued to completion. Every assigned name here is an evidence-backed inference, not a recovered original.
Canonical numbers live in output/coverage_stats.json (regenerated every pipeline run via tools/compute_final_stats.py — the single authoritative criterion, which delegates to tools/name_quality.py). The 1,212 field-signature class names (workflow + A1 parallel-agent passes) are re-applied reproducibly by pipeline stage 2d (tools/apply_class_names.py, idempotent), so a rerun never drops them.
Coverage criterion fix (2026-06). The previously reported 62.7% class coverage was inflated: the official weak-name test missed a whole class of structural placeholders (BaseClass290ImplImpl_31B9, BackingFieldBase_16D7: names synthesized from class topology with no semantics) and counted them as semantic. The corrected criterion (strip synthetic tokens; if nothing meaningful remains, it's fallback) lives in tools/name_quality.py, shared by the pipeline, stats, apply, and grader so they can never drift. The corrected baseline is 45.4%; under the unified criterion the real field-signature names now correctly override placeholders and 8 evidence-synthesized names are added, reaching 46.7% (4,641/9,928). Each obfuscated class also carries a deterministic evidence grade (A/B/C/D, see tools/grade_evidence.py) so every assigned name is auditable. Note: obfuscated real names were destroyed at compile time and are unrecoverable; every name here is an evidence-backed inference, not a recovered original.
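
The "strip synthetic tokens; if nothing meaningful remains, it's fallback" rule can be illustrated with a minimal sketch. This is not the code in tools/name_quality.py: the token list, the hash-suffix pattern, and the function name `is_fallback_name` are assumptions chosen so that the two placeholder examples above classify as fallback.

```python
import re

# Hypothetical synthetic tokens; the real list lives in tools/name_quality.py.
SYNTHETIC_TOKENS = {"base", "class", "impl", "backing", "field"}

# Assumed shape of the trailing hash suffix, e.g. "_31B9".
HASH_SUFFIX = re.compile(r"_[0-9A-F]{4}$")

# Splits CamelCase words, acronym runs, lowercase runs, and digit runs.
TOKEN = re.compile(r"[A-Z][a-z]+|[A-Z]+(?![a-z])|[a-z]+|\d+")


def is_fallback_name(name: str) -> bool:
    """Return True when no meaningful token survives the stripping."""
    stem = HASH_SUFFIX.sub("", name)
    tokens = TOKEN.findall(stem)
    meaningful = [
        t for t in tokens
        if not t.isdigit() and t.lower() not in SYNTHETIC_TOKENS
    ]
    return not meaningful


if __name__ == "__main__":
    for candidate in ("BaseClass290ImplImpl_31B9", "BackingFieldBase_16D7", "VRCPlayer"):
        print(candidate, "fallback" if is_fallback_name(candidate) else "semantic")
```

Keeping the check in one shared function is what lets the pipeline, stats, apply, and grader steps agree on the same answer for every name.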
Why Unity 6 was a full re-crack. VRChat upgraded from Unity 2022 to Unity 6, which reshuffled the entire IL2CPP runtime layout — the old extractor failed completely. Metadata is encrypted and export symbols are stripped, so static tools (Il2CppDumper) do not work. The extractor uses reverse MethodInfo enumeration: scan all MethodInfo in the heap, resolve their klass, rebuild the type tree. Verified against ground truth (Vector3=x/y/z, Color=r/g/b/a, Transform=157 methods).
| Field | June-13 (Unity 2022) | Unity 6 |
|---|---|---|
| MethodInfo.name | 0x10 | 0x10 |
| MethodInfo → klass | 0x18 | 0x20 |
| klass.name | 0xA8 | 0x98 |
| klass.namespace | 0x18 | 0x18 |
| klass.parent | varies | 0xA0 (runtime consensus) |
| klass.fields | — | 0xA8 |
| FieldInfo stride | 0x30 | 0x20 |
| FieldInfo.name | 0x10 | 0x08 |

The extractor (tools/extract_reverse_unity6.py) is self-checking: ASLR heap-band auto-detect, Transform offset self-verify, parent-offset runtime consensus. If a future build shifts the layout, it errors loudly instead of emitting garbage.
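
As a rough illustration of the reverse MethodInfo walk, the sketch below resolves one MethodInfo to (namespace, class, method) using the Unity 6 offsets from the table. It is not taken from tools/extract_reverse_unity6.py: the `Snapshot` class, the flat single-region memory view, and the assumption that each field is an 8-byte little-endian pointer to a NUL-terminated UTF-8 string are all hypothetical.

```python
import struct

# Unity 6 offsets, copied from the table above.
METHODINFO_NAME = 0x10
METHODINFO_KLASS = 0x20
KLASS_NAMESPACE = 0x18
KLASS_NAME = 0x98


class Snapshot:
    """Hypothetical flat view of one dumped memory region."""

    def __init__(self, base: int, data: bytes):
        self.base = base
        self.data = data

    def read_ptr(self, addr: int) -> int:
        # Assumes 64-bit little-endian pointers.
        return struct.unpack_from("<Q", self.data, addr - self.base)[0]

    def read_cstr(self, addr: int) -> str:
        start = addr - self.base
        end = self.data.index(b"\x00", start)
        return self.data[start:end].decode("utf-8", errors="replace")


def resolve_method(snap: Snapshot, method_info: int) -> tuple:
    """Follow MethodInfo -> klass and read the three name strings."""
    klass = snap.read_ptr(method_info + METHODINFO_KLASS)
    namespace = snap.read_cstr(snap.read_ptr(klass + KLASS_NAMESPACE))
    class_name = snap.read_cstr(snap.read_ptr(klass + KLASS_NAME))
    method_name = snap.read_cstr(snap.read_ptr(method_info + METHODINFO_NAME))
    return namespace, class_name, method_name
```

A self-check in the same spirit as the extractor's would resolve a known type such as Transform and fail loudly if the names come back wrong, which signals that the offsets have shifted.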
# Full deobfuscation pipeline (5 stages)
python tools/run_full_pipeline.py
# Skip heavy binary analysis
python tools/run_full_pipeline.py --skip-binary
# Quick vocabulary merge + source tree rebuild
python tools/quick_update.py
# Runtime field extraction (requires VRChat offline)
start "" "VRChat.exe" --no-vr
python tools/extract_field_types_v2.py

precise_dump.json (IL2CPP struct extraction from memory dump)
│
▼
run_full_pipeline.py (orchestrator)
├── Stage 0: Merge all name sources → unified_vocabulary.json (44,309 names)
│ Sources: community deob maps + mod mining + SDK + IDA xrefs + cross-version + metadata
│
├── Stage 1: deobfuscate.py (11-phase rename engine)
│ lifted → compiler artifacts → community → semantic → property → Unity →
│ inheritance → cross-ref → shared-method → binary-string → fallback
│ Result: 8,434 classes + 108,480 method renames applied
│
├── Stage 2: Cross-reference (Photon, SDK, structural, community)
│ high-confidence overrides on weak/fallback names
│
├── Stage 3: Generate outputs
│ deobfuscated_dump.json/cs (RVA), name_mapping.json, src/ tree (1,538 files),
│ coverage_stats.json (canonical numbers)
│
└── Stage 4: Generate IDA rename script (226,911 function renames)
├── tools/ 188 scripts (170 Python + 18 JavaScript)
│ ├── Core Pipeline run_full_pipeline.py, deobfuscate.py, quick_update.py
│ ├── Extraction extract_precise_dump.py, reverse_struct_layout.py
│ ├── Cross-version lift_*.py (body-hash, vtable, typedef-token lifts)
│ ├── LLM naming codex_worker.py, build_audit_batches.py, apply_audit_results.py
│ ├── Runtime/Frida bridge.py/js, vrc_frida_lib.js, extract_field_types_v2.py
│ ├── Auth/Tracing trace_auth_flow.js, hook_eos_anticheat.js
│ └── Patching patch_ga_binary.py, deploy_to_steam.py
│
├── output/ Final products
│ ├── src/ 1,538 deobfuscated C# source files (RVA-annotated)
│ │ ├── VRC/ VRChat game code (397 files)
│ │ ├── ThirdParty/ Libraries: Photon, BestHTTP, etc (956 files)
│ │ └── Global/ Global namespace (182 files)
│ ├── coverage_stats.json Canonical coverage numbers (regenerated per run)
│ ├── *.json Mappings, vocabulary, analysis results
│ └── *.md Coverage report, protocol analysis, EAC analysis
│
├── data/ Intermediate analysis data
├── ida/ IDA Pro database + scripts (excluded from git)
├── docs/ GitHub Pages dashboard
├── dumps/ Memory dumps (excluded from git, 7.4GB)
├── external/ 36+ cloned repos (excluded from git, 4.9GB)
├── metadata/ Patched global-metadata.dat (excluded from git)
└── archive/ 80 historical scripts from 5 dev phases (excluded)
Beebyte Obfuscator renames identifiers to ÌÍÎÏ strings (U+00CC-00CF) and shuffles
the Il2CppClass/FieldInfo/MethodInfo field layout every release (see the Unity 6
offset table above vs the prior Unity 2022 build). Key invariants:
| Property | Value |
|---|---|
| Obfuscated identifier regex | ^[Ì-Ï]{3,}$ |
| IL2CPP exports | stripped — static dumpers do not work |
| Struct layout | re-discovered per build via tools/extract_reverse_unity6.py (self-checking) |
| global-metadata.dat | encrypted (see Metadata Decryption below) |
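
The identifier regex from the table can be applied directly in Python. A minimal sketch follows; the helper name `is_obfuscated` is hypothetical, and the character class is written with escapes so the file encoding does not matter.

```python
import re

# Same pattern as the table: three or more characters in U+00CC..U+00CF.
OBFUSCATED_RE = re.compile(r"^[\u00CC-\u00CF]{3,}$")


def is_obfuscated(identifier: str) -> bool:
    """Return True for identifiers made only of Beebyte filler characters."""
    return OBFUSCATED_RE.fullmatch(identifier) is not None


if __name__ == "__main__":
    for name in ("\u00cc\u00cd\u00ce\u00cf\u00cd", "VRCPlayer", "\u00cc\u00cd"):
        print(repr(name), is_obfuscated(name))
```

`fullmatch` is used so that a trailing newline in the input cannot slip past the `$` anchor.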
VRChat uses Photon Realtime with FlatBuffer serialization:
Application VRCPlayer / NetworkManager / UdonBehaviour
Serialization FlatBufferSerializerCodec (8-bit + 32-bit)
Event Layer VRCPhotonEvent / IFlatBufferNetworkSerializer
Photon PhotonPeer → EnetPeer (UDP) / TPeer (TCP) / WebSocket
Encryption PhotonEncryptorPlugin (native DLL)
Key findings:
- 15 custom event types documented (Voice, Serialization, Moderation, etc.)
- 4-token auth chain: Steam → VRChat API → Photon → EAC
- Server-side EAC validation gates room joins via AuthCookie in Photon plugin
- FlatBuffers used for both 8-bit (frequent) and 32-bit (full precision) serialization
EAC (EOS Anti-Cheat) runs in Client-Server mode with continuous opaque message exchange:
- Bypass mode: EAC not initialized → no integrity messages → server rejects room joins
- Normal mode: EAC kernel driver blocks Frida/injection
- Recommended: Hybrid workflow — offline+Frida for analysis, MelonLoader+EAC for online
See EAC Auth Analysis and Photon Protocol Analysis for details.
- EAC blocks online analysis: always use offline VRChat (VRChat.exe --no-vr)
- ASLR: GameAssembly base changes every launch, so hardcoded addresses need updating
- Never blindly call unknown IL2CPP exports — crashes Frida/VRChat
- Bridge trampoline (bridge.js) writes shellcode in GA .data section for anti-tamper
- All Python scripts use sys.stdout.reconfigure(encoding='utf-8') for Windows CJK
The pipeline generates output/ida_apply_names.py with 226K+ function renames.
# In IDA: File -> Script File -> output/ida_apply_names.py
# The script auto-detects IDA's imagebase via idaapi.get_imagebase()
# No manual base address configuration needed

For Ghidra or other tools, use output/name_mapping.json:
{
"methods": { "OriginalObfClass::OrigObfMethod": "SemanticName", ... },
"classes": { "ÌÍÎÏÍÌÎ...": "VRCPlayer", ... }
}

output/deobfuscated_dump.cs uses RVA (Relative Virtual Address) for method offsets, similar to Il2CppDumper output:
public class VRCPlayer : VRCPlayerApi
{
public Transform _avatar; // 0x48
void Awake(); // RVA: 0x1A2B3C0
void OnPhotonSerializeView(); // RVA: 0x1A2B520
}

To use RVAs in IDA/Ghidra: imagebase + RVA = actual address.
IDA loads a PE at the ImageBase recorded in its header, which is 0x180000000 for 64-bit DLLs built with default linker settings. The runtime GA base varies per launch due to ASLR.
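
The imagebase + RVA arithmetic is simple enough to sketch. The helper names below are hypothetical, and the runtime base in the usage lines is a made-up value standing in for whatever ASLR assigns on a given launch.

```python
# Default ImageBase for a 64-bit DLL as loaded by IDA (from the text above).
IDA_DEFAULT_IMAGEBASE = 0x180000000


def rva_to_address(rva: int, imagebase: int = IDA_DEFAULT_IMAGEBASE) -> int:
    """Convert an RVA from deobfuscated_dump.cs to an absolute address."""
    return imagebase + rva


def address_to_rva(address: int, imagebase: int) -> int:
    """Convert an absolute address back to an RVA for lookup in the dump."""
    if address < imagebase:
        raise ValueError("address lies below the module base")
    return address - imagebase


if __name__ == "__main__":
    awake_rva = 0x1A2B3C0  # RVA of Awake() in the sample dump above
    print(hex(rva_to_address(awake_rva)))  # address inside the IDA database
    runtime_base = 0x7FF612340000  # hypothetical ASLR base for one launch
    print(hex(rva_to_address(awake_rva, runtime_base)))
```

Storing RVAs rather than absolute addresses keeps the dump valid across launches; only the base has to be looked up each time.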
For richer output with field types, use the source tree (output/src/) which includes resolved types and offsets from field_types.json when available.
VRChat encrypts global-metadata.dat with Beebyte's custom XOR scheme. Use tools/decrypt_metadata.py:
python tools/decrypt_metadata.py <path_to_global-metadata.dat> <output_path>

Algorithm (reverse-engineered from sub_180A7E880 in GameAssembly.dll):
- Header (first 0x148 bytes): XOR with key[i] = (i - 0x34) & 0xFF
- Sections: 7 sections XOR-decoded with position-dependent keys derived from header size fields
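
The header step can be sketched as follows. This covers only the first 0x148 bytes and is not the code in tools/decrypt_metadata.py: it assumes `i` is the 0-based byte offset from the start of the file, and it leaves the seven sections untouched because their key derivation is not spelled out here. The function name `xor_header` is hypothetical.

```python
HEADER_SIZE = 0x148  # header length from the description above


def xor_header(data: bytes) -> bytes:
    """XOR the header with key[i] = (i - 0x34) & 0xFF; pass the rest through.

    XOR is its own inverse, so the same call encrypts and decrypts.
    """
    n = min(HEADER_SIZE, len(data))
    header = bytes(b ^ ((i - 0x34) & 0xFF) for i, b in enumerate(data[:n]))
    return header + data[n:]


if __name__ == "__main__":
    sample = bytes(range(256)) * 2  # stand-in bytes, not real metadata
    once = xor_header(sample)
    assert xor_header(once) == sample  # round-trips
    print(once[:8].hex())
```

After decoding a real file, a quick sanity check is whether the first four bytes match the metadata magic 0xFAB11BAF in little-endian order.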
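The decrypted metadata feeds tools/lift_typedef_tokens.py, which lifts class/method names from TypeDefinition tokens. As the final result states, names that Beebyte obfuscated are stored as ÌÍÎÏ garbage in the metadata itself, so this step yields no original names for obfuscated classes.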
Note: The encryption constants may change with new VRChat builds. If decryption produces invalid output, re-analyze the decrypt function in GameAssembly.dll (search for the metadata magic 0xFAB11BAF handler).
| Document | Description |
|---|---|
| Workflow Guide | Complete pipeline guide for new contributors |
| Dashboard | Interactive visual overview (GitHub Pages) |
| Coverage Report | Current pipeline coverage metrics |
| Network Analysis | Photon network layer mapping |
| Photon Protocol | Protocol reverse engineering |
| EAC Auth Analysis | EOS anti-cheat authentication |
Private research project. Not for redistribution.