
Validate code point range in NumericEntityUnescaper.translate()#747

Merged
garydgregory merged 2 commits into apache:master from dxbjavid:numeric-entity-codepoint-range on Jun 2, 2026

Conversation

@dxbjavid dxbjavid commented Jun 2, 2026

Contributor

Noticed that translate calls Character.toChars(entityValue) for values above 0xFFFF without checking the upper code point bound, so a numeric reference to a value above 0x10FFFF throws IllegalArgumentException instead of being left alone. unescapeHtml4 and unescapeXml run this over untrusted markup, where every other malformed entity is silently ignored. Reject out-of-range code points the same way, by returning 0.
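
For illustration only, here is a minimal sketch of the kind of range check described above. It is not the merged patch: the class `CodePointRangeSketch` and the helper `writeCodePoint` are hypothetical names, and the boolean return value stands in for the translator's "return 0 and leave the input alone" behavior. It relies only on the standard `Character.isValidCodePoint` and `Character.toChars` methods.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

public class CodePointRangeSketch {

    /**
     * Hypothetical helper, not the library's code. Writes the character(s)
     * for entityValue to out and returns true, or writes nothing and returns
     * false when the value is not a valid Unicode code point.
     */
    public static boolean writeCodePoint(int entityValue, Writer out) throws IOException {
        // Valid code points are 0x0 through 0x10FFFF (Character.MAX_CODE_POINT).
        // Without this check, Character.toChars below would throw
        // IllegalArgumentException for anything larger.
        if (!Character.isValidCodePoint(entityValue)) {
            return false;
        }
        if (entityValue > 0xFFFF) {
            // Supplementary code point: written as a surrogate pair.
            out.write(Character.toChars(entityValue));
        } else {
            // Basic Multilingual Plane: a single char.
            out.write(entityValue);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        System.out.println(writeCodePoint(0x1F600, out));  // prints "true"
        System.out.println(writeCodePoint(0x110000, out)); // prints "false"
    }
}
```

In this sketch a caller would treat a `false` result the same way as any other malformed entity: emit nothing and let the original text pass through unchanged.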

@garydgregory

Member

@dxbjavid
Again, your PR lacks a unit test. Let's save a round of interaction and do make sure you provide one on the first go around.

@dxbjavid

dxbjavid commented Jun 2, 2026

Contributor Author

Added testOutOfRangeCodePoint to NumericEntityUnescaperTest covering three out-of-range numeric references. They're now left untranslated instead of throwing. Pushed.

@garydgregory garydgregory changed the title validate code point range in NumericEntityUnescaper.translate Validate code point range in NumericEntityUnescaper.translate Jun 2, 2026
@garydgregory garydgregory merged commit 238c88d into apache:master Jun 2, 2026
14 of 15 checks passed
@garydgregory garydgregory changed the title Validate code point range in NumericEntityUnescaper.translate Validate code point range in NumericEntityUnescaper.translate() Jun 2, 2026
@dxbjavid

dxbjavid commented Jun 9, 2026

Contributor Author

Thanks for the review and merge! Appreciated

2 participants