Skip to content

Fix Algorithm deepCopy subtype preservation#453

Draft
Abhi190702 wants to merge 1 commit into
cbomkit:mainfrom
Abhi190702:codex/fix-algorithm-deepcopy-subtypes
Draft

Fix Algorithm deepCopy subtype preservation#453
Abhi190702 wants to merge 1 commit into
cbomkit:mainfrom
Abhi190702:codex/fix-algorithm-deepcopy-subtypes

Conversation

@Abhi190702
Copy link
Copy Markdown

@Abhi190702 Abhi190702 commented May 24, 2026

Fixes #127

What changed

  • Added a shallow-copy hook to Algorithm.deepCopy() so recursive child copying stays centralized.
  • Added type-specific copy constructors and shallowCopy() overrides across Algorithm subclasses, including indirect subclasses such as Ascon128.
  • Updated RSA to use the shared Algorithm.deepCopy() path instead of duplicating the child-copy loop.
  • Added regression coverage for subtype preservation and enricher recasting behavior.

Why

Algorithm.deepCopy() previously instantiated new Algorithm(this), so calling deepCopy() on AES and other algorithm subclasses returned a plain Algorithm. That broke downstream instanceof logic, including enricher behavior and CBOM output decisions.

Validation

  • mvn -pl mapper,enricher spotless:apply
  • mvn -pl mapper,enricher -am -Dexec.skip=true -Dtest=AlgorithmDeepCopyTest,AESEnricherTest -Dsurefire.failIfNoSpecifiedTests=false test
  • mvn -pl mapper,enricher -am -Dexec.skip=true test

Note: -Dexec.skip=true is needed on Windows because mapper's compile phase otherwise tries to execute download-cipher-suites.sh directly and fails with %1 is not a valid Win32 application.

@Abhi190702 Abhi190702 changed the title [codex] Fix Algorithm deepCopy subtype preservation Fix Algorithm deepCopy subtype preservation May 24, 2026
Copy link
Copy Markdown
Contributor

@n1ckl0sk0rtge n1ckl0sk0rtge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Thanks for tackling #127 — the fix correctly preserves subtype identity, all 174 algorithm subclasses + 5 intermediate classes are covered, RSA is migrated to the shared path, and 3-level inheritance (Ascon128 → Ascon → Algorithm) works. No architectural, security, performance, or consumer-breaking concerns. Requesting changes on a couple of points before merge.

Coverage check — verified against the diff:

  • Architecture: clean, all changes confined to mapper model + 1 enricher test
  • All 155 transitive Algorithm subclasses + 23 indirect descendants are touched
  • Intermediate classes (Ascon, BLAKE, Elephant, Grain, IES, Isap, DSA, EllipticCurveAlgorithm, PBES1/2, PKCS12PBE) correctly use protected copy ctors
  • No scope creep; no behavior change for consumers besides the (intended) defensive children-map copy in the multi-arg "asKind" constructor

Must fix

1. DCO check failing — the commit needs a Signed-off-by trailer. Please amend with git commit --amend --signoff (or git rebase --signoff main) and force-push.


Should fix

2. AlgorithmDeepCopyTest — RSA and Ascon128 cases don't actually pin down the new behavior

Both tests only check isInstanceOf(...) + hasChildOfType(...).isPresent(). The AES case goes further and asserts .isNotSameAs(...) on the child KeyLength, which is what proves the recursive child.deepCopy() loop ran. RSA is the most interesting case here because it lost its custom deepCopy() override — that's exactly the regression case where you want a child-independence assertion. Same for Ascon128 (the 3-level case).

Suggested additions:

// In deepCopyPreservesExistingSpecializedSubclass()
assertThat(copy.hasChildOfType(KeyLength.class).get())
        .isNotSameAs(rsa.hasChildOfType(KeyLength.class).get());

// In deepCopyPreservesIndirectAlgorithmSubclass()
assertThat(copy.hasChildOfType(TagLength.class).get())
        .isNotSameAs(ascon128.hasChildOfType(TagLength.class).get());

3. No field-equality assertion on the copied metadata

Nothing in the new tests verifies that name, kind, detectionLocation, and especially origin are actually copied. The origin field matters because RSA's old copy path went through super(rsa.name, rsa.kind, rsa.detectionLocation) (the 3-arg ctor) — that path did NOT propagate origin. The new path uses super(rsa) (protected copy ctor) which does. This is a behavior improvement, but it's only locked in if a test asserts it. A couple of lines per test would do it.

4. shallowCopy() hook has a silent-failure footgun for future subclasses

mapper/src/main/java/com/ibm/mapper/model/Algorithm.java — the base shallowCopy() returns new Algorithm(this). Every current subclass overrides it correctly. But if a future subclass author forgets to override, deepCopy() silently returns a plain Algorithm — reintroducing #127 for that family with no compile error and no test failure unless someone remembers to add an assertion.

Two mitigations worth considering:

  • JavaDoc on shallowCopy() stating: "Subclasses MUST override this to return their concrete type; otherwise deepCopy() will lose subtype identity."
  • A single reflection-driven parametrized test that iterates all Algorithm subclasses on the classpath and asserts instance.deepCopy().getClass() == instance.getClass(). ~30 lines, covers every existing and future subclass.

Suggestions

5. Method naming — shallowCopy() is a bit misleading
It doesn't even copy children shallowly — it returns a same-type instance with an empty children map. newEmptyOfSameType() or copyMetadataOnly() would describe the contract more honestly. Not a blocker; a JavaDoc clarification is enough if renaming across 174 files is too invasive.

6. Mention RSA's origin-field fix in the PR description
The removal of RSA's custom deepCopy() override is a subtle behavior improvement (previously the origin field was dropped via the 3-arg super call). Worth a line in the PR body so downstream consumers aren't surprised.

7. Document the partial defensive-copy semantics
The multi-arg "asKind" ctor's new HashMap<>(algorithm.getChildren()) creates a new map but shares INode references. The AESEnricherTest exercises removal-by-key isolation, which works, but mutation of a child's nested state would still be visible on both sides. Worth a one-line comment next to the assignment: // Children INodes themselves are aliased; deep isolation requires deepCopy().


What I like

  • Linked-issue scope is exact; PR addresses every acceptance criterion in #127.
  • The shallowCopy() template-method approach eliminates the children-loop duplication that the issue's literal suggestion would have spread across 174 files. (Worth noting the alternative — overriding deepCopy() per subclass — would also fix the bug and is simpler conceptually; the DRY win here is real but the trade-off is the footgun in finding #4.)
  • Intermediate-class visibility (protected on Ascon/BLAKE/IES/etc., private on leaves) is correct throughout.
  • The AESEnricherTest addition validates the real-world enricher pipeline scenario the issue reporter called out.

Signed-off-by: ABHIJEET RANJAN <abhijeet.r1907@gmail.com>
@Abhi190702 Abhi190702 force-pushed the codex/fix-algorithm-deepcopy-subtypes branch from 77c5695 to eeceece Compare May 29, 2026 17:32
@Abhi190702
Copy link
Copy Markdown
Author

Thanks for the detailed review!

I addressed the requested follow-ups:

Strengthened the RSA and Ascon128 deepCopy regression tests to assert that copied child nodes are separate instances from the originals.
Added metadata preservation assertions for name, kind, detection location, and origin.
Added JavaDoc documenting the shallowCopy() contract for future subclasses.
Added a comment clarifying that the recast constructor performs a map-level defensive copy while child INode instances remain shared.
I also amended the commit with --signoff to fix the DCO failure.

Validation:

mvn -pl mapper,enricher spotless:apply
mvn -pl mapper,enricher -am -Dexec.skip=true -Dtest=AlgorithmDeepCopyTest,AESEnricherTest -Dsurefire.failIfNoSpecifiedTests=false test
mvn -pl mapper,enricher -am -Dexec.skip=true test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

deepCopy does not create an instance of the specific class

2 participants