
feat(atelet): expose actor identity via /run/ate/actor-id #184

Open
Frederick F. Kautz IV (fkautz) wants to merge 2 commits into agent-substrate:main from fkautz:feat/actor-identity-file

@fkautz Frederick F. Kautz IV (fkautz) commented Jun 4, 2026

Contributor

Fixes #178

Gives an actor a reliable way to learn its own ID without parsing the
Host header: atelet bind-mounts a read-only, per-actor identity
directory at /run/ate, containing the file actor-id.

Per the discussion on #178, the mount is a directory rather than a
single file so that further late-bound data (more downward-API fields,
config, eventually credentials) can land as sibling files without
changing the mount shape.

Why not an environment variable

An env var (or a file baked into the image) cannot carry per-actor
identity. Both live in the checkpointed process memory / overlay, which
is captured once in the golden snapshot. On the normal activation path
(runsc restore from that snapshot), every actor of a template would
report the golden actor's ID instead of its own. This was verified on a
live cluster: the env approach reports the golden ID after restore, while
this mount approach reports the correct per-actor ID.

What changed (atelet)

  • Populates a per-actor identity directory (currently just actor-id)
    and bind-mounts it read-only at /run/ate when assembling each app
    container's OCI bundle. The bundle is regenerated on every resume, and
    external bind mounts are re-attached from the restore-time
    config.json, so the value stays correct per-actor through a snapshot
    restore. The pause container gets no identity mount. The directory is
    cleared and recreated with the other per-actor dirs, and actor-id is
    written atomically (temp file + fsync + rename + parent fsync) since
    the directory is visible to a possibly-running actor.
  • The mount point is created via os.Root, so a symlink planted in the
    image cannot redirect the write outside the extracted rootfs.
  • Path-input validation at the RPC boundary comes from #206 (fix(atelet):
    validate RPC inputs that become host filesystem paths), which has merged.
    The copy of it this branch previously carried was dropped in the
    rebase, so this PR is now purely the identity feature.

Testing

  • Unit tests: identity mount presence/absence, symlink-escape refusal,
    atomic-write semantics (overwrite on resume, no leftover temp files),
    and the router-client helpers.
  • E2E (internal/e2e/suites/identity): restores two actors from one
    golden snapshot and asserts each observes its own ID and not the golden
    ID, which is the one property unit tests and config.json inspection
    cannot catch. It introduces a minimal probe actor fixture (/whoami)
    and a router HTTP client for the harness, and passes against a live kind
    cluster on current main.

Usage (for actor authors)

Read /run/ate/actor-id fresh on use rather than caching it at startup.
The file holds the raw actor ID with no trailing newline. Documented in
docs/api-guide.md.

  • Tests pass
  • Appropriate changes to documentation are included in the PR

@MushuEE

Collaborator

Do we want to expose more to the actor? Like labels? Or do we want only immutable metadata?

Comment thread cmd/atelet/oci.go
@fkautz

Contributor Author

> Do we want to expose more to the actor? Like labels? Or do we want only immutable metadata?

My intention for the probe is to expose more information so we can test the environment. That'll be in future patches.

I'm not the right person to answer the broader question of what should be exposed through the mount. We could migrate it to a directory and populate it.

@fkautz

Contributor Author

Since the last review:

  • The mount is now the directory /run/ate (read-only), with the ID in actor-id inside it, per the discussion on #178 (Add ActorID to the injected actor's process). This means additional metadata can be added as sibling files later without changing the mount shape. Grant McCloskey (@MushuEE), this is also the answer to your earlier question: labels or other immutable (or slow-changing) metadata could land here as new files, and what exactly gets exposed is being worked out in #178.
  • Follow-ups from a second review pass: the identity directory is now cleared and recreated in resetActorDirs with the other per-actor dirs, and actor-id is written atomically (temp file + fsync + rename) since the directory is visible to a possibly-running actor.
  • Rebased onto current main.

Verified end-to-end on a kind cluster: the identity e2e restores two actors from one golden snapshot and each reads its own ID (not the golden's) from /run/ate/actor-id. The workflow run needs a maintainer approval for CI to confirm.

Gives an actor a reliable way to learn its own ID without parsing the
Host header. atelet populates a per-actor identity directory (currently
the single file actor-id) and bind-mounts it read-only at /run/ate into
each app container; the pause container gets none.

- A directory rather than a single-file mount, per the discussion on
  agent-substrate#178, so future late-bound data can land as sibling files without
  changing the mount shape.
- The bundle is regenerated on every resume and external bind mounts are
  re-attached from the restore-time config.json, so the value stays
  correct per-actor through a golden-snapshot restore; an env var (or a
  file baked into the image) would be frozen at the golden actor's ID,
  since it lives in the checkpointed process memory.
- actor-id is written atomically (temp file, fsync, rename, parent
  fsync): the directory is visible to a possibly-running actor, so a
  reader must never observe a torn value.
- The identity dir is cleared and recreated with the other per-actor
  dirs in resetActorDirs.
- The in-rootfs mount point is created via os.Root, so a symlink planted
  in the image cannot redirect the write outside the extracted rootfs.

Documented for actor authors in docs/api-guide.md: read it fresh on use;
raw ID, no trailing newline.

The one property unit tests cannot catch is identity captured in the
golden snapshot: the env-var approach passed unit tests and config.json
inspection yet returned the golden actor's ID after restore. The
identity suite restores TWO actors from one golden snapshot and asserts
each observes its own ID, and explicitly not the golden's, via the probe
actor's /whoami, reached through the atenet router the way real traffic
arrives.

- probe: minimal introspection actor for e2e suites; reports the
  identity file at request time and surfaces read errors in the
  response, so a failing assertion explains itself.
- RouterClient: port-forwards the atenet router service and routes on
  resources.ActorDNSSuffix.
- deployProbe renders the manifest template in Go and execs the pinned
  ko/kubectl directly with argument slices: no shell, nothing to quote.
  KO_CONFIG_PATH must point at the repo root because ko resolves
  .ko.yaml from its working directory; without it the build silently
  loses defaultPlatforms and produces amd64-only images that cannot
  start on arm64 nodes.
@fkautz

Contributor Author

rebased onto main and squashed to two commits (feature, e2e verification). nothing functionally new since the last review except review feedback; notable diffs from what was previously pushed:

  • the DNS-1123 validation this branch carried was deduplicated into #206 (fix(atelet): validate RPC inputs that become host filesystem paths), which merged; this PR is now purely the identity feature
  • actor-id is written atomically and the identity dir is reset with the other per-actor dirs
  • probe deploy no longer shells out, and the probe reports identity read errors so e2e failures explain themselves

re-verified after the rebase: unit tests, make verify/test, and the kind e2e identity suite (two actors from one golden snapshot each observing their own id) against current main.

Successfully merging this pull request may close these issues.

Add ActorID to the injected actor's process

3 participants