fix(t5gemma): add missing f-prefix in _normalize_token error message#697

Open
Osamaali313 wants to merge 1 commit into google-deepmind:main from Osamaali313:fix/t5gemma-token-error-fstring

@Osamaali313

Summary

_normalize_token() in gemma/research/t5gemma/sampling.py raises a ValueError whose message uses {token!r} inside a plain string literal (no f prefix), so users see the placeholder printed literally instead of the offending token:

ValueError: Invalid forbidden token: {token!r}. Forbidden tokens must map to single token ids in the vocab.

This makes forbidden_tokens misconfigurations hard to debug. Adding the f prefix interpolates the token as intended:

ValueError: Invalid forbidden token: 'hello world'. Forbidden tokens must map to single token ids in the vocab.
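The difference can be reproduced in isolation. The snippet below is a minimal sketch, not the code from sampling.py: the variable names `plain` and `formatted` are illustrative, and only the message text is taken from the outputs quoted above.

```python
token = "hello world"

# Plain string literal: braces are ordinary characters, so the
# placeholder text is kept verbatim.
plain = "Invalid forbidden token: {token!r}."

# f-string: {token!r} is evaluated and replaced with repr(token).
formatted = f"Invalid forbidden token: {token!r}."

print(plain)
print(formatted)
```

The `!r` conversion applies `repr()`, which is why the interpolated token appears in quotes.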

Context

This is the same bug class as #658 (gemma/gm/text/_sampler.py). A repo-wide sweep for non-f-string error messages containing {...} placeholders found exactly two occurrences: the one reported in #658, and this sibling in gemma/research/t5gemma/sampling.py, which that report missed. This PR fixes the t5gemma occurrence.
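For reference, a sweep of this kind could be sketched with the standard `ast` module. This is an assumption about how such a check might be written, not the exact method used for this PR; `find_unformatted_placeholders`, `PLACEHOLDER`, and `SAMPLE` are hypothetical names introduced for illustration.

```python
import ast
import re

# Matches simple placeholders such as {token}, {token!r} or {x:.2f}.
PLACEHOLDER = re.compile(r"\{[A-Za-z_][\w.]*(![rsa])?(:[^{}]*)?\}")


def find_unformatted_placeholders(source: str) -> list[tuple[int, str]]:
    """Return (line, text) for plain string literals that contain a placeholder."""
    tree = ast.parse(source)

    # Literal pieces of f-strings and receivers of .format() are legitimate.
    skip = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.JoinedStr):
            skip.update(id(value) for value in node.values)
        elif (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "format"
        ):
            skip.add(id(node.func.value))

    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, str)
            and id(node) not in skip
            and PLACEHOLDER.search(node.value)
        ):
            hits.append((node.lineno, node.value))
    return hits


SAMPLE = '''
def broken(token):
    raise ValueError("Invalid forbidden token: {token!r}.")

def fixed(token):
    raise ValueError(f"Invalid forbidden token: {token!r}.")

def also_fine(token):
    raise ValueError("Invalid token: {t}".format(t=token))
'''

print(find_unformatted_placeholders(SAMPLE))
```

A check like this can produce false positives (for example, templates that are formatted later through a variable), so each hit still needs a manual look.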

Change

One line: add the f prefix to the string literal that contains {token!r} (the adjacent continuation line has no placeholder and is left unchanged).
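Why only one literal needs the prefix: adjacent string literals are concatenated at compile time, and each literal is an f-string or a plain string independently of its neighbors. The sketch below assumes the message is split across two adjacent literals as described; `describe` is a hypothetical stand-in, not the actual function in sampling.py.

```python
def describe(token: str) -> str:
    # Only the first literal contains a placeholder, so only it needs
    # the f prefix. The second literal is joined to it unchanged.
    return (
        f"Invalid forbidden token: {token!r}. "
        "Forbidden tokens must map to single token ids in the vocab."
    )


print(describe("hello world"))
```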

@google-cla

google-cla Bot commented Jun 14, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@Osamaali313

Author

@googlebot I signed it!

Osamaali313 reopened this Jun 14, 2026

The ValueError raised in t5gemma's _normalize_token used '{token!r}' inside a
plain string literal (no f-prefix), so the message printed the literal text
"{token!r}" instead of the offending token, making forbidden-token
misconfigurations hard to debug. Add the f-prefix so the token is interpolated.

Same bug class as google-deepmind#658 (gemma/gm/text/_sampler.py); this is the sibling
occurrence in gemma/research/t5gemma/sampling.py.

Osamaali313 force-pushed the fix/t5gemma-token-error-fstring branch from f42e589 to f94d016 June 14, 2026 14:04