feature - implement RFC 022 format functions (#39) #49

Open
dannymeijer wants to merge 12 commits into main from feature/39-rfc022-format-functions

Conversation

@dannymeijer
Collaborator

@dannymeijer dannymeijer commented May 26, 2026

Summary

  • completes RFC 022's scalar format-function surface: deterministic hashes, URL helpers, JSON payload helpers, and CSV payload helpers
  • adds registry-backed helpers and Substrait extension mappings for concrete format helpers, with sha2(...) kept as a compatibility rewrite over concrete SHA-2 helpers
  • executes the full format helper set through the DataFusion adapter using native DataFusion functions where available and Incan-authored UDF callbacks for non-native helpers
  • removes the package-local Rust inql-datafusion-format adapter crate; the remaining adapter logic is Incan source using Rust bridge imports only where needed
  • exposes explicit-schema JSON/CSV helpers through Incan model type parameters, e.g. from_json[Payload](...) and from_csv[Row](...)
  • records dynamic variant predicates as reserved/rejected until InQL has a variant value model, instead of exposing fake string predicates
  • marks RFC 022 Implemented and updates format reference docs, release notes, and the RFC index
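
To illustrate the sha2(...) compatibility rewrite mentioned in the summary, here is a minimal Python sketch of the general idea: a generic sha2(data, bits) call is resolved to one concrete SHA-2 helper before anything is hashed. The mapping table, the function names `rewrite_sha2` and `sha2`, and the set of accepted bit lengths are assumptions made for illustration; they are not taken from the InQL registry or the Incan source in this PR.

```python
import hashlib

# Hypothetical mapping from a requested bit length to a concrete helper name.
# The real registry in this PR may name or scope these helpers differently.
CONCRETE_SHA2 = {224: "sha224", 256: "sha256", 384: "sha384", 512: "sha512"}


def rewrite_sha2(bits: int) -> str:
    """Return the concrete SHA-2 helper that sha2(..., bits) is rewritten to."""
    try:
        return CONCRETE_SHA2[bits]
    except KeyError:
        raise ValueError(f"unsupported SHA-2 bit length: {bits}") from None


def sha2(data: bytes, bits: int) -> str:
    """Compatibility wrapper: resolve the concrete helper, then hash with it."""
    helper = rewrite_sha2(bits)
    return hashlib.new(helper, data).hexdigest()


if __name__ == "__main__":
    print(rewrite_sha2(512))    # name of the concrete helper
    print(sha2(b"abc", 256))    # hex digest from the concrete helper
```

Keeping the generic entry point as a thin rewrite means only the concrete helpers need their own extension mappings, which matches the split the summary describes.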

Type of change

  • Bug fix
  • New feature
  • Refactor / maintenance
  • Documentation
  • RFC (adds/updates docs/rfcs/*)

Area(s)

  • Package & tests
  • Specification (RFCs)
  • Documentation
  • Automation & repo config
  • Other

Key details

  • User-facing behavior: Authors can use hash helpers (md5, sha1, concrete SHA-2 helpers, sha2, crc32, xxhash64), URL parse/encode/decode helpers, JSON validation/path/schema helpers, and CSV row/schema helpers in scalar projections.
  • Model-derived schemas: from_json[T](...), try_from_json[T](...), and from_csv[T](...) derive schema metadata from compiler-provided Incan model/class type reflection. The public API uses the model type parameter directly, not schema strings, dummy model values, or schema wrapper helpers.
  • Internals: Concrete helpers declare registry-owned Substrait extension mappings. DataFusion support is adapter-owned, but implemented in Incan source through src/session/datafusion_format_functions.incn rather than a package-local Rust crate.
  • Boundary: Dynamic variant predicates such as typeof, is_array, and is_object remain unavailable until a variant value model exists.
  • Compiler baseline: This branch expects Incan 0.3.0-rc28, which includes the generic decorated type-parameter reflection and Rust string-boundary fixes needed by the model-typed JSON/CSV helpers.
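
The model-derived schema point can be sketched in Python, using dataclass reflection as a stand-in for Incan's compiler-provided model reflection. Everything here is an illustrative assumption: the `Payload` fields, the `schema_of` helper, the strict type check, and the choice to return `None` from `try_from_json` on failure are not confirmed by this PR. The sketch only shows the shape of the idea, where the model type itself is the single source of schema information.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Payload:
    # Hypothetical model; field names and types are for illustration only.
    id: int
    name: str


def schema_of(model) -> list:
    """Derive (field name, type name) pairs from the model type by reflection."""
    return [(f.name, f.type.__name__) for f in fields(model)]


def from_json(model, text: str):
    """Parse text and build a model instance, checking each field's type."""
    raw = json.loads(text)
    kwargs = {}
    for f in fields(model):
        value = raw[f.name]  # raises KeyError when a field is missing
        if not isinstance(value, f.type):
            raise TypeError(f"field {f.name!r} is not {f.type.__name__}")
        kwargs[f.name] = value
    return model(**kwargs)


def try_from_json(model, text: str):
    """Assumed non-raising variant: return None when parsing or checking fails."""
    try:
        return from_json(model, text)
    except (ValueError, KeyError, TypeError):
        return None


if __name__ == "__main__":
    print(schema_of(Payload))
    print(from_json(Payload, '{"id": 1, "name": "a"}'))
    print(try_from_json(Payload, "not json"))
```

The caller passes only the model type, so no schema string, dummy value, or wrapper helper is needed, which is the API shape the bullet describes.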

Testing / verification

Verified locally with Incan 0.3.0-rc28 from merged release/v0.3:

  • cargo build --bin incan in /Users/danny/Development/encero/incan
  • make test INCAN=/Users/danny/Development/encero/incan/target/debug/incan
  • make fmt-check INCAN=/Users/danny/Development/encero/incan/target/debug/incan
  • make test-style INCAN=/Users/danny/Development/encero/incan/target/debug/incan
  • make registry-metadata INCAN=/Users/danny/Development/encero/incan/target/debug/incan
  • make build INCAN=/Users/danny/Development/encero/incan/target/debug/incan
  • make smoke-consumer INCAN=/Users/danny/Development/encero/incan/target/debug/incan
  • git diff --check

Notes:

  • make test passed 175 package tests.
  • make registry-metadata passed with 130 helpers and still prints the known "incan.lock is out of date" package-root warning.
  • The first sandboxed make smoke-consumer attempt failed because DNS could not resolve index.crates.io; rerunning the same command with network access passed.

Docs impact

  • No docs changes needed
  • Docs updated

Links: docs/language/reference/functions/format.md, docs/language/reference/functions/index.md, docs/release_notes/v0_1.md, docs/rfcs/022_semi_structured_format_functions.md, docs/rfcs/README.md

Stack / Merge Order

Incan 0.3.0-rc28 is merged on release/v0.3, so this PR is no longer blocked on the compiler fixes. This PR covers the full RFC 022 scope and closes #39.

Refs #39
Refs dannys-code-corner/incan#708
Refs dannys-code-corner/incan#709
Refs dannys-code-corner/incan#714
Refs dannys-code-corner/incan#715
Refs dannys-code-corner/incan#716

@incan-triage-bot (bot) added the documentation, package, and specification labels May 26, 2026
@dannymeijer force-pushed the feature/36-rfc019-window-functions branch from 083dfc2 to 298a26f May 27, 2026 20:05
@dannymeijer force-pushed the feature/39-rfc022-format-functions branch from 4ebb788 to c786443 May 28, 2026 07:10
@dannymeijer changed the title from "feature - implement RFC 022 hashing functions (#39)" to "feature - implement RFC 022 format functions (#39)" May 28, 2026
@dannymeijer changed the base branch from feature/36-rfc019-window-functions to main May 28, 2026 09:56
@dannymeijer marked this pull request as ready for review May 28, 2026 10:00
@incan-triage-bot (bot) added the automation label May 28, 2026
@dannymeijer self-assigned this May 29, 2026
@dannymeijer added this to the InQL-v0.1 milestone May 29, 2026
@dannymeijer force-pushed the feature/39-rfc022-format-functions branch from ca19b08 to 381de0b May 30, 2026 08:22

Labels

  • automation: CI, Makefile, .github/, repo config
  • documentation: Improvements or additions to documentation
  • package: Library source, tests, incan.toml
  • specification: docs/rfcs/ normative RFCs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RFC 022: Semi-structured and format functions

1 participant