Allow tag-based ActorTemplate images #226

Draft
Eitan Yarmush (EItanya) wants to merge 1 commit into agent-substrate:main from kagent-dev:relax-actortemplate-image-digests
@EItanya

Collaborator

Summary

Fixes #223.

  • relax ActorTemplate image validation so tags are accepted
  • persist resolved digest image refs in a versioned snapshot-manifest.json
  • restore snapshots using manifest-pinned image refs instead of re-resolving current tags
  • normalize object-storage missing-object errors for manifest fallback handling
  • reserve the workload container name `pause` at the API level because atelet uses it for sandbox infrastructure

Testing

  • go test ./cmd/atelet ./internal/memorypullcache ./pkg/api/v1alpha1
  • GOCACHE=$PWD/.cache/gocache hack/update/go-generate.sh
  • hack/verify/crd-chart.sh
  • git diff --check

Full go test ./... was not clean in this sandbox. The default run hit NO_COLOR-sensitive color tests. A rerun with NO_COLOR unset and GOCACHE=/tmp/substrate-go-cache then failed on sandbox-restricted local TCP listeners (socket: operation not permitted) and on setup-envtest.

@EItanya Eitan Yarmush (EItanya) marked this pull request as draft June 11, 2026 21:10
@google-cla

google-cla Bot commented Jun 11, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@EItanya Eitan Yarmush (EItanya) force-pushed the relax-actortemplate-image-digests branch from 24345a3 to 0fcddb6 Compare June 11, 2026 21:14
@EItanya Eitan Yarmush (EItanya) force-pushed the relax-actortemplate-image-digests branch from 0fcddb6 to 4937c62 Compare June 11, 2026 21:21
@dberkov

Dmitry Berkovich (dberkov) commented Jun 12, 2026

Collaborator

> relax ActorTemplate image validation so tags are accepted

To ensure OCI image immutability, ActorTemplate currently requires an explicit full SHA256. Using labels instead would jeopardize resume operations for SUSPENDED actors if the OCI image associated with that label changes.

As part of #119, a "homedir" snapshot concept is being introduced to allow resume operations to persist even when an OCI image does not match. Additionally, the proposal in #16 addresses an upgrade mode intended to support surviving OCI image transitions.

I recommend finalizing the upgrade flow first. This will clarify whether ActorTemplate should be treated as a mutable or immutable object before we implement further behavioral changes.

Julian Gutierrez Oschmann (@juli4n) - FYI

@EItanya

Eitan Yarmush (EItanya) commented Jun 12, 2026

Collaborator Author

> To ensure OCI image immutability, ActorTemplate currently requires an explicit full SHA256. Using labels instead would jeopardize resume operations for SUSPENDED actors if the OCI image associated with that label changes.

FWIW this doesn't actually get rid of that contract; it just sets that SHA256 into the contract on first pull rather than requiring the user to set it. It makes the system lock it rather than requiring the user to search out that information. I understand the technical limitation; I was just trying to make the user workflow a tiny bit easier. I have found for myself that the digest requirement adds friction.

> As part of #119, a "homedir" snapshot concept is being introduced to allow resume operations to persist even when an OCI image does not match. Additionally, #16 (comment) addresses an upgrade mode intended to support surviving OCI image transitions.

I haven't read through #119 yet, but I will try to do so today. Using a homedir approach for flexible snapshots makes sense, especially for agents that rely mostly on the filesystem.

@BenTheElder

Collaborator

> FWIW this doesn't actually get rid of that contract; it just sets that SHA256 into the contract on first pull rather than requiring the user to set it. It makes the system lock it rather than requiring the user to search out that information. I understand the technical limitation; I was just trying to make the user workflow a tiny bit easier. I have found for myself that the digest requirement adds friction.

Yeah, the only downside is this might be surprising, since you specified a tag but it gets locked in.

Right now it's explicit that you must use an immutable reference which is annoying but should have unsurprising behavior.

Would we re-resolve on a new goldensnapshot?

If we were going full KRM, I would probably reflect the resolved image in the actortemplate status for visibility.

We could instead offer tooling to create actortemplates which does the resolve on creation of the actortemplate.

@BenTheElder

Collaborator

I'm just thinking about how many times I've seen users trip over imagePullPolicy defaulting depending on whether the image is explicitly or implicitly :latest in Kubernetes versus literally any other reference.

I think a lot of users do not have a good grasp on tag lifecycle so while I'm not opposed in general we should think about how to make the behavior least-surprising. Requiring tag resolution be done explicitly up front is annoying but has ~unsurprising runtime behavior.

Development

Successfully merging this pull request may close these issues.

Allow tag-based ActorTemplate images by persisting resolved digests with snapshots

3 participants