PDF stage 4: Type3 glyphs + non-embedded fonts (deferred from stage 3) #553
Draft
andiwand wants to merge 7 commits into
Seed the stage-3.5 branch. Read a Type1 program (eexec + charstring decryption), translate Type1 -> Type2 charstrings, build a CFF and reuse 3.4's CFF -> OTF path; reverse map via glyph names -> AGL. Stacked on 3.4. Implementation follows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
First self-contained piece of 3.5: the Type1 running-key cipher (font::type1::decrypt) and its two entry points: decrypt_eexec (key 55665, 4-byte skip, binary or ASCII-hex/PFA auto-detected) and decrypt_charstring (key 4330, /lenIV-aware). These don't depend on the CFF translation work, so they land ahead of the full Type1Font reader (eexec parse + Type1->Type2 charstring translation -> reuse 3.4's CFF->OTF path).

Tests: round-trips against an independent forward-cipher reference (so they're not circular), the lenIV override, and the hex eexec form.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
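The running-key cipher described in this commit can be sketched as follows. This is a minimal illustration, not the PR's font::type1::decrypt: the function names and signatures are hypothetical, and only the constants (52845, 22719, keys 55665 and 4330) are taken from the Adobe Type 1 Font Format specification. The forward cipher is included because the commit's tests round-trip against one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Cipher constants and initial keys from the Adobe Type 1 Font Format spec.
constexpr std::uint32_t kC1 = 52845;
constexpr std::uint32_t kC2 = 22719;
constexpr std::uint16_t kEexecKey = 55665;
constexpr std::uint16_t kCharstringKey = 4330;

// Hypothetical helper: decrypts `cipher` with the running key and drops the
// first `skip` plaintext bytes (4 for eexec, /lenIV for charstrings).
std::vector<std::uint8_t> type1_decrypt_sketch(
    const std::vector<std::uint8_t>& cipher, std::uint16_t key,
    std::size_t skip) {
  std::vector<std::uint8_t> plain;
  std::uint32_t r = key;  // 32-bit holder so the multiply cannot overflow an int
  for (std::size_t i = 0; i < cipher.size(); ++i) {
    const std::uint32_t c = cipher[i];
    const auto p = static_cast<std::uint8_t>(c ^ (r >> 8));
    r = ((c + r) * kC1 + kC2) & 0xFFFFu;  // key update uses the cipher byte
    if (i >= skip) plain.push_back(p);
  }
  return plain;
}

// Hypothetical forward cipher, usable as an independent test reference.
std::vector<std::uint8_t> type1_encrypt_sketch(
    const std::vector<std::uint8_t>& plain, std::uint16_t key) {
  std::vector<std::uint8_t> cipher;
  cipher.reserve(plain.size());
  std::uint32_t r = key;
  for (std::uint8_t p : plain) {
    const std::uint32_t c = (p ^ (r >> 8)) & 0xFFu;
    cipher.push_back(static_cast<std::uint8_t>(c));
    r = ((c + r) * kC1 + kC2) & 0xFFFFu;
  }
  return cipher;
}
```

Both directions update the key from the cipher byte, which is why decryption can resynchronize without knowing the plaintext. The sketch leaves out the ASCII-hex detection mentioned above.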
Parse an Adobe Type1 font program into its decrypted parts: split the clear-text header / eexec section / trailer, read /FontName, /FontMatrix, /FontBBox and /Encoding (StandardEncoding or a custom dup-code-name-put array) from the header, decrypt the eexec section (type1_crypt) and extract every glyph's decrypted charstring plus /Subrs (RD/-| binary entries, /lenIV-aware). PFB segment framing is stripped if present. Charstrings are not yet interpreted; that is the Type1->Type2 translation that follows, feeding 3.4's CFF->OTF path.

Tests: a hand-built encrypted Type1 program (independent forward cipher), covering magic, header/encoding parse, and the decrypted charstrings/subrs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
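As an illustration of the PFB framing step, here is a minimal sketch assuming the usual PFB layout: each segment starts with the marker byte 0x80, a type byte (1 = ASCII, 2 = binary, 3 = end of file) and, for types 1 and 2, a 4-byte little-endian length. The function name is hypothetical and the PR's reader may handle malformed input differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: concatenates the payloads of all PFB segments.
// Input that does not start with the 0x80 marker is returned unchanged.
std::vector<std::uint8_t> strip_pfb_sketch(const std::vector<std::uint8_t>& in) {
  if (in.empty() || in[0] != 0x80) return in;
  std::vector<std::uint8_t> out;
  std::size_t pos = 0;
  while (pos + 2 <= in.size() && in[pos] == 0x80) {
    const std::uint8_t type = in[pos + 1];
    if (type != 1 && type != 2) break;  // type 3 (EOF) or unknown: stop
    if (pos + 6 > in.size()) break;     // truncated segment header
    const std::uint32_t len =
        static_cast<std::uint32_t>(in[pos + 2]) |
        (static_cast<std::uint32_t>(in[pos + 3]) << 8) |
        (static_cast<std::uint32_t>(in[pos + 4]) << 16) |
        (static_cast<std::uint32_t>(in[pos + 5]) << 24);
    pos += 6;
    // Clamp to the available bytes so a bad length cannot read out of range.
    const std::size_t n = std::min<std::size_t>(len, in.size() - pos);
    const auto first = in.begin() + static_cast<std::ptrdiff_t>(pos);
    out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(n));
    pos += n;
  }
  return out;
}
```

Clamping a bad length instead of failing is a choice made for this sketch only; a stricter reader could reject the file.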
cff::build_cff serializes a name-keyed CFF from a list of (name, Type2 charstring) glyphs + default/nominalWidthX + bbox: Header, Name INDEX, Top DICT (FontBBox + charset/CharStrings/Private offsets, fixed-width so the layout resolves in one pass), String INDEX (every glyph name as a custom SID, so no standard-strings table is needed), empty Global Subr INDEX, CharStrings INDEX, format-0 charset, Private DICT. This is the assembly target for the Type1 -> CFF path: the translated Type2 charstrings land here, the result feeds CffFont + wrap_to_otf (3.4).

Test: build a 2-glyph CFF, read it back through CffFont (name, glyph name, bbox, charstring width vs. default) and confirm it wraps to a loadable OTTO.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
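Most of the structures listed in this commit are CFF INDEX tables. The sketch below shows how one INDEX might be serialized, following the layout in the CFF specification (count, offSize, count + 1 one-based offsets, then the data). The function name is hypothetical and this is not the PR's cff::build_cff; it also assumes at most 65535 items.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: serializes one CFF INDEX. Assumes items.size() <= 65535.
std::vector<std::uint8_t> write_cff_index_sketch(
    const std::vector<std::vector<std::uint8_t>>& items) {
  std::vector<std::uint8_t> out;
  const std::size_t count = items.size();
  out.push_back(static_cast<std::uint8_t>((count >> 8) & 0xFF));  // Card16, big-endian
  out.push_back(static_cast<std::uint8_t>(count & 0xFF));
  if (count == 0) return out;  // an empty INDEX is just the two count bytes

  std::size_t total = 0;
  for (const auto& item : items) total += item.size();

  // Smallest offset width (1..4 bytes) that can hold the last offset.
  const std::size_t last_offset = total + 1;
  std::uint8_t off_size = 1;
  while (off_size < 4 && (last_offset >> (8 * off_size)) != 0) ++off_size;
  out.push_back(off_size);

  auto put_offset = [&](std::size_t v) {
    for (int shift = 8 * (off_size - 1); shift >= 0; shift -= 8) {
      out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
    }
  };

  std::size_t offset = 1;  // offsets are relative to the byte before the data
  put_offset(offset);
  for (const auto& item : items) {
    offset += item.size();
    put_offset(offset);
  }
  for (const auto& item : items) out.insert(out.end(), item.begin(), item.end());
  return out;
}
```

With this shape, the empty Global Subr INDEX from the list above is the two bytes 00 00.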
type1::to_type2 translates a decrypted Type1 charstring to Type2 (CFF): a stack machine that flattens callsubr (inlining the font's /Subrs, depth guarded), folds div, lifts the hsbw side bearing into the first moveto and returns the advance width separately, drops Type1-only hints (dotsection, *stem3, hint-replacement OtherSubr 3), and translates the flex OtherSubrs (1/2/0 -> two rrcurvetos) and seac (-> Type2 endchar form). Path operators (r/h/v lineto, rr/vh/hv curveto, stems, moves, endchar) share opcodes with Type2 and pass through. Best-effort / display-oriented: hints affect rendering quality, not glyph shape.

Tests: exact Type2 output for hsbw width + side-bearing folding into the first move, callsubr inlining, and div folding.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
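Two of the rewrites named here, hsbw folding and div folding, can be sketched on a token list. This is a simplified illustration under stated assumptions: the token representation, struct and function names are hypothetical, the real translator works on encoded charstring bytes, and callsubr, flex, seac and hint handling are left out. It requires C++17 for std::variant.

```cpp
#include <string>
#include <variant>
#include <vector>

// Hypothetical token form: a number or an operator name.
using Token = std::variant<double, std::string>;

struct Type2Sketch {
  std::vector<Token> ops;    // translated operand/operator stream
  double advance_width = 0;  // returned separately, as the commit describes
};

Type2Sketch fold_hsbw_and_div_sketch(const std::vector<Token>& type1) {
  Type2Sketch out;
  std::vector<double> stack;
  double sbx = 0;
  bool first_move_pending = false;

  // Emits the pending operands followed by the operator.
  auto flush = [&](const std::string& op) {
    for (double v : stack) out.ops.emplace_back(v);
    out.ops.emplace_back(op);
    stack.clear();
  };

  for (const Token& t : type1) {
    if (const double* n = std::get_if<double>(&t)) {
      stack.push_back(*n);
      continue;
    }
    const std::string& op = std::get<std::string>(t);
    if (op == "hsbw" && stack.size() >= 2) {
      sbx = stack[0];                // left side bearing
      out.advance_width = stack[1];  // advance width
      stack.clear();
      first_move_pending = true;
    } else if (op == "div" && stack.size() >= 2 && stack.back() != 0) {
      const double b = stack.back();
      stack.pop_back();
      stack.back() /= b;  // fold "a b div" into a single operand
    } else if (first_move_pending && (op == "rmoveto" || op == "hmoveto") &&
               !stack.empty()) {
      stack.front() += sbx;  // Type2 starts at (0, 0), so add the bearing
      first_move_pending = false;
      flush(op);
    } else if (first_move_pending && op == "vmoveto" && !stack.empty()) {
      first_move_pending = false;
      if (sbx != 0) {
        stack.insert(stack.begin(), sbx);  // needs an x component: use rmoveto
        flush("rmoveto");
      } else {
        flush(op);
      }
    } else {
      flush(op);  // shared path operators pass through unchanged
    }
  }
  return out;
}
```

For example, the Type1 sequence `50 600 hsbw 10 20 rmoveto 6 2 div hlineto endchar` becomes `60 20 rmoveto 3 hlineto endchar` with an advance width of 600.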
type1::to_cff translates every glyph (to_type2, flattening /Subrs), places .notdef at glyph 0 (synthesizing one when absent) and assembles a CFF via the builder. load_embedded_font now reads /FontFile: parse the Type1 program, convert to CFF, and hold it as a CffFont, so embedded Type1 reuses the entire 3.4 CFF path (PUA re-encode, @font-face wrap, reverse map) with no new abstract::Font subclass. Simple-font glyph selection by PostScript name (PDF /Encoding -> name -> glyph) is the shared CFF/Type1 follow-up tied to the AGL/name-mapping decision; composite and the wrap/display path work today.

Tests: a Type1 program converts to a CFF that reads back through CffFont (glyph count incl. synthesized .notdef, names) and wraps to a loadable OTTO. Full font + PDF + HTML corpus green (460 tests).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
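The .notdef placement step could look roughly like the sketch below. The struct and function names are hypothetical, and the body of the synthesized glyph is an assumption: the commit does not say what it contains, so the sketch uses an empty outline consisting only of the Type2 endchar operator (opcode 14).

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical glyph record: PostScript name plus encoded Type2 charstring.
struct GlyphSketch {
  std::string name;
  std::vector<std::uint8_t> charstring;
};

// Moves .notdef to index 0, keeping the relative order of the other glyphs.
// If the font has no .notdef, an empty one is inserted (assumed: endchar only).
void place_notdef_first_sketch(std::vector<GlyphSketch>& glyphs) {
  auto it = std::find_if(
      glyphs.begin(), glyphs.end(),
      [](const GlyphSketch& g) { return g.name == ".notdef"; });
  if (it == glyphs.end()) {
    glyphs.insert(glyphs.begin(), GlyphSketch{".notdef", {14}});
  } else {
    std::rotate(glyphs.begin(), it, it + 1);
  }
}
```

Using std::rotate rather than a swap keeps the remaining glyphs in their original order, so glyph IDs after .notdef stay predictable.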
Seed the stage-3.6 branch. Type3 char procs -> SVG via a minimal path->SVG capability pulled forward from stage 4; non-embedded standard-14 substitution + AFM widths (closes stage 2's deferred item). Stacked on 3.5. Implementation follows.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014hm5SrdJvGNJNEHxpxR1dz
Type3 glyphs + non-embedded fonts: deferred from stage 3 to stage 4, where the path → SVG machinery they need naturally lives. Stacked on the 3.5 Type1 PR.

docs/design/pdf/stage-3.6-type3-nonembedded.md (filename keeps its original 3.6 slug); implementation follows once stage 4 (graphics / SVG) is underway.

Plan

/Widths, AFM widths for the standard 14 (closes stage 2's deferred item) as a generated data table.

With this deferred, stage 3 completes at 3.5 (Type1). See the roadmap in src/odr/internal/pdf/AGENTS.md.

🤖 Generated with Claude Code