
H-6519: Add discrete token attribute types #8764

Open
kube wants to merge 2 commits into main from cf/h-6519-add-uuid-boolean-and-int-discrete-types


@kube kube commented May 26, 2026

🌟 What is the purpose of this PR?

Adds support for discrete token attribute types in Petrinaut, so coloured token dimensions can be represented as integer, boolean, or uuid values in addition to continuous real values.

This updates the model schema, simulator runtime, code authoring surface, and editor UI so typed token values can be created, simulated, displayed, and passed into user-authored Petrinaut code consistently.

🔗 Related links

🔍 What does this change?

  • Extends colour element types to real, integer, boolean, and uuid.
  • Adds typed token records using number | boolean | string values throughout Petrinaut core APIs.
  • Adds a token value codec for encoding booleans, integers, and UUIDs into the numeric frame buffers used by the simulator.
  • Carries UUID codec snapshots through worker frame payloads so frame readers can decode typed tokens for UI consumers.
  • Updates simulation build, transition kernels, lambda inputs, dynamics, Monte Carlo transition effects, frame readers, metrics, and visualizers to use typed token objects.
  • Restricts dynamics to real-valued attributes; discrete attributes remain unchanged by continuous dynamics.
  • Allows real and integer transition kernel outputs to use runtime distributions, while rejecting distributions for boolean and UUID outputs.
  • Updates scenario schemas and scenario compilation so coloured place rows can contain typed token values.
  • Updates generated default code, LSP virtual files, and AI guidance to describe the new token typing rules.
  • Adds type selection controls for Petrinaut dimensions in the properties panel.
  • Updates initial-state and scenario spreadsheet editors to support real, integer, boolean, and UUID cells.
  • Keeps boolean spreadsheet cells keyboard-navigable and fixes the dimension row layout so the name input remains visible next to the type selector.
  • Adds regression coverage for typed scenario row compilation, typed initial marking packing/decoding, and typed transition input/output encoding.
  • Adds a changeset for @hashintel/petrinaut and @hashintel/petrinaut-core.
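
As a rough illustration of the token value codec idea described in the list above, the sketch below packs typed values into a numeric buffer and decodes them again using a UUID snapshot. Every name here (`UuidTable`, `encode`, `decode`) and the choice of encoding (booleans as 0/1, UUIDs as indices into an intern table) is an assumption made for illustration; the PR's actual `TokenValueCodec` may work differently.

```typescript
// Hypothetical sketch: names and encoding choices are assumptions, not the PR's code.
type ElementType = "real" | "integer" | "boolean" | "uuid";
type TokenValue = number | boolean | string;

class UuidTable {
  private readonly ids: string[] = [];
  private readonly indexOf = new Map<string, number>();

  // Returns a stable numeric index for a UUID, adding it on first sight.
  intern(uuid: string): number {
    const existing = this.indexOf.get(uuid);
    if (existing !== undefined) return existing;
    const index = this.ids.length;
    this.ids.push(uuid);
    this.indexOf.set(uuid, index);
    return index;
  }

  // A plain array copy that could travel alongside a frame buffer.
  snapshot(): string[] {
    return [...this.ids];
  }
}

function encode(type: ElementType, value: TokenValue, table: UuidTable): number {
  switch (type) {
    case "real":
      if (typeof value !== "number") throw new Error("expected a number");
      return value;
    case "integer":
      if (typeof value !== "number" || !Number.isInteger(value)) {
        throw new Error("expected an integer");
      }
      return value;
    case "boolean":
      if (typeof value !== "boolean") throw new Error("expected a boolean");
      return value ? 1 : 0;
    case "uuid":
      if (typeof value !== "string") throw new Error("expected a UUID string");
      return table.intern(value);
  }
}

function decode(type: ElementType, encoded: number, snapshot: readonly string[]): TokenValue {
  switch (type) {
    case "real":
    case "integer":
      return encoded;
    case "boolean":
      return encoded !== 0;
    case "uuid": {
      const uuid = snapshot[encoded];
      if (uuid === undefined) throw new Error(`unknown UUID index ${encoded}`);
      return uuid;
    }
  }
}

// Round-trip one token through a Float64Array.
const table = new UuidTable();
const types: ElementType[] = ["real", "integer", "boolean", "uuid"];
const token: TokenValue[] = [0.5, 3, true, "123e4567-e89b-12d3-a456-426614174000"];
const buffer = Float64Array.from(token.map((value, i) => encode(types[i]!, value, table)));
const decoded = Array.from(buffer, (n, i) => decode(types[i]!, n, table.snapshot()));
```

Sending the snapshot with each frame means a reader needs no access to the encoder's live table, which is one plausible reason for carrying it through worker payloads.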

Pre-Merge Checklist 🚀

🚢 Has this modified a publishable library?

This PR:

  • modifies an npm-publishable library and I have added a changeset file(s)

📜 Does this require a change to the docs?

The changes in this PR:

  • are internal and do not require a docs change

🕸️ Does this require a change to the Turbo Graph?

The changes in this PR:

  • do not affect the execution graph

🛡 What tests cover this?

  • yarn workspace @hashintel/petrinaut-core test:unit
  • yarn workspace @hashintel/petrinaut test:unit
  • yarn exec turbo run lint:tsc --filter '@hashintel/petrinaut-core' --filter '@hashintel/petrinaut'
  • yarn exec turbo run lint:eslint --filter '@hashintel/petrinaut-core' --filter '@hashintel/petrinaut'
  • yarn exec oxfmt --check $(git diff --name-only --diff-filter=ACM) libs/@hashintel/petrinaut-core/src/simulation/engine/token-values.ts
  • git diff --check

❓ How to test this?

  1. Open Petrinaut and select a type in the properties panel.
  2. Add or edit dimensions and switch their types between Real, Int, Bool, and UUID.
  3. Confirm the dimension name input remains visible beside the type selector.
  4. Add typed initial tokens or scenario token rows for a coloured place.
  5. Run a simulation and confirm typed token values are preserved in initial state, transition output, and visualizer/metric inputs.


vercel Bot commented May 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| hash | Ready | Preview, Comment | May 26, 2026 11:37pm |
| petrinaut | Ready | Preview, Comment | May 26, 2026 11:37pm |

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| hashdotdesign-tokens | Ignored | | May 26, 2026 11:37pm |


cursor Bot commented May 26, 2026

PR Summary

Medium Risk
Changes the simulation engine’s token encoding/decoding and transition/dynamics contracts; incorrect coercion or frame snapshots could cause subtle runtime mismatches, though new unit tests target the critical paths.

Overview
This PR adds discrete coloured-token attributes (integer, boolean, uuid) alongside continuous real values, and threads those types through the model, simulator, authoring surface, and editor.

Model & authoring: Colour elements and Zod/AI guidance now distinguish continuous real attributes (dynamics may update them) from discrete integer/boolean/uuid attributes (kernels and initial state only). Scenarios accept typed per-place rows; LSP virtual types and default snippet generators emit type-appropriate literals and dynamics derivatives (never on discrete fields).
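
To make the continuous/discrete split concrete, here is a minimal sketch of a single explicit Euler step that advances only `real` elements and passes discrete ones through untouched. The function name `eulerStep`, the `ColourElement` shape, and the use of Euler integration are assumptions for illustration and are not taken from the PR.

```typescript
// Hypothetical sketch: the types and the integrator are assumptions.
type ElementType = "real" | "integer" | "boolean" | "uuid";
type TokenRecord = Record<string, number | boolean | string>;

interface ColourElement {
  name: string;
  type: ElementType;
}

// One explicit Euler step: only "real" elements are advanced.
function eulerStep(
  token: TokenRecord,
  elements: readonly ColourElement[],
  derivatives: Readonly<Record<string, number>>,
  dt: number,
): TokenRecord {
  const next: TokenRecord = { ...token };
  for (const { name, type } of elements) {
    if (type !== "real") continue; // discrete attributes pass through unchanged
    const current = token[name];
    const rate = derivatives[name] ?? 0;
    if (typeof current === "number") {
      next[name] = current + rate * dt;
    }
  }
  return next;
}

const elements: ColourElement[] = [
  { name: "x", type: "real" },
  { name: "count", type: "integer" },
  { name: "active", type: "boolean" },
];
// A derivative supplied for "count" is ignored because "count" is not real.
const stepped = eulerStep({ x: 1, count: 2, active: true }, elements, { x: 2, count: 5 }, 0.5);
```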

Simulation runtime: Token objects are TokenRecord (number | boolean | string). A new TokenValueCodec coerces values, packs integers/booleans/UUIDs into frame buffers, and carries a UUID snapshot on worker frames so readers decode typed tokens for lambdas, kernels, metrics, Monte Carlo, and frame APIs. Dynamics apply derivatives only to real elements; distributions remain limited to numeric kernel outputs.
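
The rule that distributions are limited to numeric kernel outputs could be enforced along these lines. This is only a sketch: the `Distribution` shape, the `resolveOutput` name, and rounding samples for `integer` outputs are all assumptions, and the PR may resolve integer samples differently.

```typescript
// Hypothetical sketch: the Distribution shape and rounding rule are assumptions.
type ElementType = "real" | "integer" | "boolean" | "uuid";
type Distribution = { kind: "distribution"; sample: () => number };
type KernelOutput = number | boolean | string | Distribution;

function isDistribution(value: KernelOutput): value is Distribution {
  return typeof value === "object";
}

function resolveOutput(type: ElementType, output: KernelOutput): number | boolean | string {
  if (!isDistribution(output)) return output;
  if (type === "boolean" || type === "uuid") {
    throw new Error(`Distributions are not allowed for ${type} outputs.`);
  }
  const sample = output.sample();
  // Assumed policy: integer outputs round the sampled value.
  return type === "integer" ? Math.round(sample) : sample;
}

const fixed: Distribution = { kind: "distribution", sample: () => 1.6 };
const realOut = resolveOutput("real", fixed);
const intOut = resolveOutput("integer", fixed);
```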

UI: Dimension type selectors, and initial-state/scenario spreadsheets that edit bool/UUID cells (not only numbers), feed the same typed marking shape used at run time.

Regression tests cover scenario row coercion, initial marking pack/decode, and typed transition I/O.

Reviewed by Cursor Bugbot for commit 27d3d08. Bugbot is set up for automated code reviews on this repo.

github-actions Bot added the labels area/infra, area/libs, type/eng > frontend, and area/apps > hash.design on May 26, 2026

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


for (const [placeId, value] of Object.entries(
scenario.initialState.content,
)) {
// Colored places: number[][] stored directly by the UI.

Scenario compile throws on bad rows

Medium Severity

In per_place scenario initial state, colored token rows are converted via tokenRecordsFromRows, which calls coerceTokenRecord and can throw on invalid typed values (e.g. malformed UUIDs). Unlike expression and code-mode paths, this branch has no try/catch, so compileScenario may throw instead of returning structured compilation errors.
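
One way to address this would be to catch coercion failures per row and collect them as structured errors. The sketch below uses stand-in names (`coerceRow`, `compileRows`, `CompileError`) and a simplified coercion; it does not reproduce `tokenRecordsFromRows`, `coerceTokenRecord`, or the real return shape of `compileScenario`, none of which are shown in this thread.

```typescript
// Hypothetical sketch: all names and shapes below are stand-ins.
type ElementType = "real" | "integer" | "boolean" | "uuid";
type TokenValue = number | boolean | string;

interface CompileError {
  placeId: string;
  row: number;
  message: string;
}

type CompileResult =
  | { ok: true; tokens: Record<string, TokenValue[][]> }
  | { ok: false; errors: CompileError[] };

// Simplified stand-in for a coercion step that throws on invalid values.
function coerceRow(types: readonly ElementType[], row: readonly unknown[]): TokenValue[] {
  return types.map((type, i) => {
    const value = row[i];
    if (type === "boolean") {
      if (typeof value !== "boolean") throw new Error(`column ${i} must be a boolean`);
      return value;
    }
    if (type === "uuid") {
      if (typeof value !== "string") throw new Error(`column ${i} must be a UUID string`);
      return value;
    }
    if (typeof value !== "number" || (type === "integer" && !Number.isInteger(value))) {
      throw new Error(`column ${i} must be ${type === "integer" ? "an integer" : "a number"}`);
    }
    return value;
  });
}

function compileRows(
  types: readonly ElementType[],
  content: Record<string, unknown[][]>,
): CompileResult {
  const tokens: Record<string, TokenValue[][]> = {};
  const errors: CompileError[] = [];
  for (const [placeId, rows] of Object.entries(content)) {
    const compiled: TokenValue[][] = [];
    rows.forEach((row, index) => {
      try {
        compiled.push(coerceRow(types, row));
      } catch (error) {
        // Record the failure instead of letting it escape the compiler.
        const message = error instanceof Error ? error.message : String(error);
        errors.push({ placeId, row: index, message });
      }
    });
    tokens[placeId] = compiled;
  }
  return errors.length > 0 ? { ok: false, errors } : { ok: true, tokens };
}

const result = compileRows(["integer", "boolean"], { p1: [[1, true], [1.5, false]] });
```

Catching per row rather than around the whole loop lets a caller report every bad row in one pass.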


if (value === undefined || value === null || value === "") {
return NIL_UUID;
}
if (typeof value !== "string" || !UUID_RE.test(value)) {

Semgrep identified an issue in your code:

User-supplied UUID input is validated with a regex (UUID_RE.test(value)) that could be vulnerable to ReDoS attacks if the pattern uses inefficient backtracking constructs.

More details about this

The UUID_RE regex is being used to validate user-supplied input via the value parameter in the coerceUuid() function. If UUID_RE is not carefully constructed, an attacker could provide a malicious input string that causes the regex engine to hang or consume excessive CPU time (ReDoS - Regular Expression Denial of Service).

Attack scenario: An attacker could call this validation function with a specially crafted string (for example, a very long string of characters that almost—but not quite—match the UUID pattern, like "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa@" repeated many times). When UUID_RE.test(value) executes against this string, the regex backtracking could cause the application to freeze, making the service unavailable to legitimate users.

The vulnerability depends on the actual regex pattern in UUID_RE (not shown here), but common ReDoS patterns include nested quantifiers like (a+)+ or overlapping alternatives that force excessive backtracking on non-matching input.
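
Whether the finding applies depends on the pattern itself. As a point of comparison, the sketch below uses an assumed UUID pattern (the PR's actual `UUID_RE` is not shown in this thread) that is anchored, has only fixed repetition counts, and has no nested or overlapping quantifiers, so it has nothing to backtrack over. The length guard in the hypothetical `isUuidLike` helper additionally bounds the input before the regex runs.

```typescript
// Assumed pattern for illustration; the PR's actual UUID_RE is not shown.
const ASSUMED_UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: rejects anything that is not exactly 36 characters
// before the regex is evaluated.
function isUuidLike(value: unknown): value is string {
  return (
    typeof value === "string" &&
    value.length === 36 &&
    ASSUMED_UUID_RE.test(value)
  );
}

const nilAccepted = isUuidLike("00000000-0000-0000-0000-000000000000");
const hostileRejected = !isUuidLike("a".repeat(100_000) + "@");
```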

To resolve this comment:

Suggested change, replacing the line `if (typeof value !== "string" || !UUID_RE.test(value)) {`:

    // Returns true for a single hexadecimal digit (0-9, a-f, A-F).
    function isHexDigit(char: string): boolean {
      return (
        (char >= "0" && char <= "9") ||
        (char >= "a" && char <= "f") ||
        (char >= "A" && char <= "F")
      );
    }

    // Returns true for a well-formed UUID string, otherwise a short reason.
    function validateUuid(value: string): true | string {
      if (value.length !== 36) {
        return "invalid length";
      }
      for (let index = 0; index < value.length; index += 1) {
        const char = value[index];
        // Hyphens are expected at these four fixed positions.
        if (index === 8 || index === 13 || index === 18 || index === 23) {
          if (char !== "-") {
            return "invalid hyphen placement";
          }
          continue;
        }
        if (!isHexDigit(char)) {
          return "invalid character";
        }
      }
      return true;
    }

    // Empty input maps to NIL_UUID; valid input is trimmed and lowercased.
    function coerceUuid(value: unknown, context: string): string {
      if (value === undefined || value === null || value === "") {
        return NIL_UUID;
      }
      if (typeof value !== "string") {
        throw new Error(`${context} must be a UUID string.`);
      }
      const normalized = value.trim();
      const validation = validateUuid(normalized);
      if (validation !== true) {
        throw new Error(`${context} must be a UUID string.`);
      }
      return normalized.toLowerCase();
    }

View step-by-step instructions
  1. Replace the regex-based UUID check with a dedicated validator function so request data is not matched against a potentially expensive regex.

  2. Add a helper such as validateUuid(value: string): true | string that validates UUIDs with simple fixed checks instead of UUID_RE.test(...).
    For example, check value.length === 36, check hyphens at positions 8, 13, 18, and 23, and verify every other character is a hex digit with a small character check like char >= "0" && char <= "9" or char.toLowerCase() >= "a" && char.toLowerCase() <= "f".

  3. Update coerceUuid() to call the new helper after the type check.
    For example, change the condition from typeof value !== "string" || !UUID_RE.test(value) to typeof value !== "string" followed by const validation = validateUuid(value); if (validation !== true) { throw new Error(`${context} must be a UUID string.`); }.

  4. Keep the existing normalization step and return the lowercase value after validation with return value.toLowerCase();.
    This preserves the current behavior while removing the regex denial-of-service risk.

  5. If this code accepts user input with surrounding whitespace, trim before validating by using const normalized = value.trim(); and validate normalized instead of the raw string. Then return normalized.toLowerCase() so valid values are stored consistently.

Alternatively, if the UUID must specifically be RFC 4122 v1-v5, add fixed character checks for the version and variant positions as part of validateUuid, such as restricting index 14 to 1-5 and index 19 to 8, 9, a, or b.
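
The version and variant restriction mentioned above might look like the following. The function name `hasRfc4122VersionAndVariant` is hypothetical, and the sketch checks only those two positions, assuming the caller has already validated length, hyphens, and hex digits.

```typescript
// Hypothetical helper: checks only the version and variant positions.
// Assumes the overall UUID shape has been validated elsewhere.
function hasRfc4122VersionAndVariant(uuid: string): boolean {
  if (uuid.length !== 36) return false;
  const version = uuid.charAt(14); // first digit of the third group
  const variant = uuid.charAt(19).toLowerCase(); // first digit of the fourth group
  return version >= "1" && version <= "5" && "89ab".includes(variant);
}

const v1Accepted = hasRfc4122VersionAndVariant("123e4567-e89b-12d3-a456-426614174000");
const nilRejected = !hasRfc4122VersionAndVariant("00000000-0000-0000-0000-000000000000");
```

Note that an all-zero UUID fails this check, so a stricter validator of this kind would need to special-case whatever value is returned for empty input.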

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by regex_dos.

You can view more details about this finding in the Semgrep AppSec Platform.
