H-6519: Add discrete token attribute types#8764
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
1 Skipped Deployment
|
PR SummaryMedium Risk Overview Model & authoring: Colour elements and Zod/AI guidance now distinguish continuous Simulation runtime: Token objects are UI: Dimension type selectors, and initial-state/scenario spreadsheets that edit bool/UUID cells (not only numbers), feed the same typed marking shape used at run time. Regression tests cover scenario row coercion, initial marking pack/decode, and typed transition I/O. Reviewed by Cursor Bugbot for commit 27d3d08. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 27d3d08. Configure here.
| for (const [placeId, value] of Object.entries( | ||
| scenario.initialState.content, | ||
| )) { | ||
| // Colored places: number[][] stored directly by the UI. |
There was a problem hiding this comment.
Scenario compile throws on bad rows
Medium Severity
In per_place scenario initial state, colored token rows are converted via tokenRecordsFromRows, which calls coerceTokenRecord and can throw on invalid typed values (e.g. malformed UUIDs). Unlike expression and code-mode paths, this branch has no try/catch, so compileScenario may throw instead of returning structured compilation errors.
Reviewed by Cursor Bugbot for commit 27d3d08. Configure here.
| if (value === undefined || value === null || value === "") { | ||
| return NIL_UUID; | ||
| } | ||
| if (typeof value !== "string" || !UUID_RE.test(value)) { |
There was a problem hiding this comment.
Semgrep identified an issue in your code:
User-supplied UUID input is validated with a regex (UUID_RE.test(value)) that could be vulnerable to ReDoS attacks if the pattern uses inefficient backtracking constructs.
More details about this
The UUID_RE regex is being used to validate user-supplied input via the value parameter in the coerceUuid() function. If UUID_RE is not carefully constructed, an attacker could provide a malicious input string that causes the regex engine to hang or consume excessive CPU time (ReDoS - Regular Expression Denial of Service).
Attack scenario: An attacker could call this validation function with a specially crafted string (for example, a very long string of characters that almost—but not quite—match the UUID pattern, like "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@" repeated many times). When UUID_RE.test(value) executes against this string, the regex backtracking could cause the application to freeze, making the service unavailable to legitimate users.
The vulnerability depends on the actual regex pattern in UUID_RE (not shown here), but common ReDoS patterns include nested quantifiers like (a+)+ or overlapping alternatives that force excessive backtracking on non-matching input.
To resolve this comment:
✨ Commit fix suggestion
| if (typeof value !== "string" || !UUID_RE.test(value)) { | |
| function isHexDigit(char: string): boolean { | |
| return ( | |
| (char >= "0" && char <= "9") || | |
| (char >= "a" && char <= "f") || | |
| (char >= "A" && char <= "F") | |
| ); | |
| } | |
| function validateUuid(value: string): true | string { | |
| if (value.length !== 36) { | |
| return "invalid length"; | |
| } | |
| for (let index = 0; index < value.length; index += 1) { | |
| const char = value[index]; | |
| if (index === 8 || index === 13 || index === 18 || index === 23) { | |
| if (char !== "-") { | |
| return "invalid hyphen placement"; | |
| } | |
| continue; | |
| } | |
| if (!isHexDigit(char)) { | |
| return "invalid character"; | |
| } | |
| } | |
| return true; | |
| } | |
| function coerceUuid(value: unknown, context: string): string { | |
| if (value === undefined || value === null || value === "") { | |
| return NIL_UUID; | |
| } | |
| if (typeof value !== "string") { | |
| throw new Error(`${context} must be a UUID string.`); | |
| } | |
| const normalized = value.trim(); | |
| const validation = validateUuid(normalized); | |
| if (validation !== true) { | |
| throw new Error(`${context} must be a UUID string.`); | |
| } | |
| return normalized.toLowerCase(); | |
| } |
View step-by-step instructions
-
Replace the regex-based UUID check with a dedicated validator function so request data is not matched against a potentially expensive regex.
-
Add a helper such as
validateUuid(value: string): true | stringthat validates UUIDs with simple fixed checks instead ofUUID_RE.test(...).
For example, checkvalue.length === 36, check hyphens at positions8,13,18, and23, and verify every other character is a hex digit with a small character check likechar >= "0" && char <= "9"orchar.toLowerCase() >= "a" && char.toLowerCase() <= "f". -
Update
coerceUuid()to call the new helper after the type check.
For example, change the condition fromtypeof value !== "string" || !UUID_RE.test(value)totypeof value !== "string"followed byconst validation = validateUuid(value); if (validation !== true) { throw new Error(\${context} must be a UUID string.`); }`. -
Keep the existing normalization step and return the lowercase value after validation with
return value.toLowerCase();.
This preserves the current behavior while removing the regex denial-of-service risk. -
If this code accepts user input with surrounding whitespace, trim before validating by using
const normalized = value.trim();and validatenormalizedinstead of the raw string. Then returnnormalized.toLowerCase()so valid values are stored consistently.
Alternatively, if the UUID must specifically be RFC 4122 v1-v5, add fixed character checks for the version and variant positions as part of validateUuid, such as restricting index 14 to 1-5 and index 19 to 8, 9, a, or b.
💬 Ignore this finding
Reply with Semgrep commands to ignore this finding.
/fp <comment>for false positive/ar <comment>for acceptable risk/other <comment>for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by regex_dos.
You can view more details about this finding in the Semgrep AppSec Platform.


🌟 What is the purpose of this PR?
Adds support for discrete token attribute types in Petrinaut, so coloured token dimensions can be represented as
integer,boolean, oruuidvalues in addition to continuousrealvalues.This updates the model schema, simulator runtime, code authoring surface, and editor UI so typed token values can be created, simulated, displayed, and passed into user-authored Petrinaut code consistently.
🔗 Related links
🔍 What does this change?
real,integer,boolean, anduuid.number | boolean | stringvalues throughout Petrinaut core APIs.@hashintel/petrinautand@hashintel/petrinaut-core.Pre-Merge Checklist 🚀
🚢 Has this modified a publishable library?
This PR:
📜 Does this require a change to the docs?
The changes in this PR:
🕸️ Does this require a change to the Turbo Graph?
The changes in this PR:
🛡 What tests cover this?
yarn workspace @hashintel/petrinaut-core test:unityarn workspace @hashintel/petrinaut test:unityarn exec turbo run lint:tsc --filter '@hashintel/petrinaut-core' --filter '@hashintel/petrinaut'yarn exec turbo run lint:eslint --filter '@hashintel/petrinaut-core' --filter '@hashintel/petrinaut'yarn exec oxfmt --check $(git diff --name-only --diff-filter=ACM) libs/@hashintel/petrinaut-core/src/simulation/engine/token-values.tsgit diff --check❓ How to test this?