Generative objects: summon AI-generated objects onto the surfaces around you by salmanmkc · Pull Request #405 · google/xrblocks

salmanmkc · 2026-06-24T15:00:48Z

adds a GenerativeObjects primitive (xb.core.generative.imagine(prompt)) that turns a text/voice prompt into a placed, draggable object in your space. gemini generates an image, we key out the plain background into a clean cutout, and drop it onto the real-world surface you're looking at, occluded by depth and facing you.

fills a gap: image gen existed only as a low-level call (Gemini.generate); nothing turned a prompt into a placed, interactive object. it's the runtime verb the gem/canvas can compose but can't synthesize itself ("create a thing you pinch and drag"), exposed as a one-liner, which I believe Ruofei wanted in the past.

what's in it

primitive (src/generative/):

GenerativeObjects script + imagine(prompt, opts): places a GenerativeObject on the surface you're looking at. resolves null if AI is unavailable.
cutout: alpha-key the plain background so the subject reads cleanly, not as a white card.
grounding: raycast the depth mesh, stand objects on floors/tables, float them a few cm off walls so they don't blend in. falls back to in-front-of-camera with no hit.
occlusion: opts the material into the occlusion shader so real geometry hides it (the layer alone only builds the mask).
upright (yaw-only) billboard so it faces you like a standee.
draggable (translating) via the global DragManager.
optional experimental 2.5d relief (off by default): displaces a subdivided plane by the image brightness. approximate (brightness != depth), kept opt-in.

wired into Core/Options via enableGenerativeObjects() and exported from xrblocks.ts. pure helpers (scale, placement, facing, background keying) are unit-tested; 45 colocated tests, full suite green, build/lint/prettier clean.

demo (demos/generative_object/):

summon with on-screen buttons, a draggable spatial uiblocks panel that head-leashes to follow you, or your voice (push-to-talk).
gemini key entry overlay (?key= > localStorage > keys.json > prompt), same pattern as world_companion.

try it

serve the repo, open demos/generative_object/index.html, paste a gemini key, hit summon (or speak). objects land on the surface you're looking at; grab to move.

notes

client-side gen + ?key= is prototyping-only, same caveat as the other AI demos.
2.5d relief is experimental/opt-in (toggle in the demo); luminance != real depth so it's approximate. real relief would want a monocular depth model, future work.
grounding/occlusion need depth enabled; without it, placement falls back to in-front-of-camera and occlusion is skipped.

A prompt becomes a placed, draggable object: GenerativeObjects.imagine() asks the AI image model to generate an image, decodes it into a texture, and drops a billboard into the scene in front of the user, occludable by real-world depth. Pure helpers (aspect-preserving scale, place-in-front pose) and the orchestration are unit-tested with mocked AI + texture source. Wired into Core/Options via enableGenerativeObjects() and exported from xrblocks.ts.
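
A rough sketch of what the two pure helpers named above (aspect-preserving scale, place-in-front pose) could look like. The function names, the `Vec3` type, and the parameters are assumptions for illustration, not the PR's actual signatures.

```typescript
// Hypothetical helper shapes; the PR's real signatures may differ.
type Vec3 = {x: number; y: number; z: number};

/** Size a billboard to a target height (meters) while keeping the image's aspect ratio. */
function aspectPreservingScale(
  imageWidth: number,
  imageHeight: number,
  targetHeight: number
): {width: number; height: number} {
  if (imageWidth <= 0 || imageHeight <= 0) {
    // Degenerate image: fall back to a square so the billboard stays visible.
    return {width: targetHeight, height: targetHeight};
  }
  return {
    width: targetHeight * (imageWidth / imageHeight),
    height: targetHeight,
  };
}

/** Position `distance` meters along the camera's (normalized) forward direction. */
function placeInFront(
  cameraPosition: Vec3,
  cameraForward: Vec3,
  distance: number
): Vec3 {
  // Guard against a zero-length forward vector.
  const len =
    Math.hypot(cameraForward.x, cameraForward.y, cameraForward.z) || 1;
  return {
    x: cameraPosition.x + (cameraForward.x / len) * distance,
    y: cameraPosition.y + (cameraForward.y / len) * distance,
    z: cameraPosition.z + (cameraForward.z / len) * distance,
  };
}
```

Keeping these free of scene or camera objects is what makes them easy to unit-test without a renderer.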

Add keyOutBackground (pure, tested) and a browser CanvasBackgroundTextureSource that decodes the generated image, keys out the plain background, and returns a CanvasTexture so the subject reads as a cutout rather than a flat card. Gated by GenerativeOptions.removeBackground (on by default); Core swaps in the canvas source when enabled.
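
For illustration, a minimal corner-sampled keyer over raw RGBA pixels might look like the following. This is a sketch under assumptions: the name `keyOutBackgroundSketch`, the `tolerance` parameter, and the choice to average the four corners are hypothetical, and the PR's keyOutBackground may work differently.

```typescript
/**
 * Hypothetical keyer: estimate the background color from the four image
 * corners and make every pixel close to that color fully transparent.
 * Returns a new array; the input is left untouched.
 */
function keyOutBackgroundSketch(
  rgba: Uint8ClampedArray,
  width: number,
  height: number,
  tolerance = 60 // assumed RGB distance threshold, not taken from the PR
): Uint8ClampedArray {
  const out = new Uint8ClampedArray(rgba);
  if (width <= 0 || height <= 0 || out.length < width * height * 4) {
    return out; // Malformed input: return the copy unchanged.
  }
  const at = (i: number): number => out[i] ?? 0;

  // Average the four corner pixels as the background estimate.
  const corners = [0, width - 1, (height - 1) * width, height * width - 1];
  let bgR = 0;
  let bgG = 0;
  let bgB = 0;
  for (const p of corners) {
    bgR += at(p * 4);
    bgG += at(p * 4 + 1);
    bgB += at(p * 4 + 2);
  }
  bgR /= corners.length;
  bgG /= corners.length;
  bgB /= corners.length;

  // Zero the alpha of every pixel within `tolerance` of the background.
  const maxDistSq = tolerance * tolerance;
  for (let p = 0; p < width * height; p++) {
    const dr = at(p * 4) - bgR;
    const dg = at(p * 4 + 1) - bgG;
    const db = at(p * 4 + 2) - bgB;
    if (dr * dr + dg * dg + db * db <= maxDistSq) {
      out[p * 4 + 3] = 0;
    }
  }
  return out;
}
```

In a browser, the input would typically come from `getImageData` on a 2D canvas after drawing the decoded image. A keyer of this kind also removes any subject pixels that sit close to the background color.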

Speak or pinch to summon an AI-generated object into your space via xb.core.generative.imagine(); generated subjects are keyed to cutouts, placed in front of you, draggable, and occluded by real depth. Voice trigger via SpeechRecognizer; pinch cycles preset prompts so it works without a mic.

Add enableGenerativeObjects() to the options list, an xb.core.generative.imagine usage snippet, and a generative/ directory-map entry.

Add @google/genai to the importmap (the demo failed to init Gemini without it) and a key-entry overlay that resolves the key from ?key= > localStorage > keys.json > a prompt, matching the world_companion/objects_3d demos.
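
The precedence chain (?key= > localStorage > keys.json > prompt) can be sketched as an ordered list of sources where the first non-empty value wins. `resolveApiKey` and `KeySource` are hypothetical names, not the demo's actual code.

```typescript
// Hypothetical: each source returns a key, or null/undefined/empty if it has none.
type KeySource = () =>
  | Promise<string | null | undefined>
  | string
  | null
  | undefined;

/** Try each source in priority order and return the first non-empty key. */
async function resolveApiKey(sources: KeySource[]): Promise<string | null> {
  for (const source of sources) {
    try {
      const value = await source();
      if (typeof value === 'string' && value.trim() !== '') {
        return value.trim();
      }
    } catch {
      // A failing source (e.g. keys.json not found) falls through to the next.
    }
  }
  return null;
}
```

In the browser, the four sources would presumably read `new URLSearchParams(location.search).get('key')`, a `localStorage` entry, a fetched `keys.json`, and `window.prompt`, in that order. The storage key name and the field read from keys.json are not stated in the PR, so they are left out here.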

Every pinch/click summoned a new object, which fought with grabbing an existing one. Track whether a select started on an existing generative object and, if so, let DragManager move it instead of summoning. Add a keyboard 'G' summon for desktop where dragging uses the mouse.

Add a quaternionFacingCamera helper and a GenerativeObjects.update() that turns tracked objects to face the user each frame, gated by the new GenerativeOptions.billboard flag (on by default). Keeps the flat cutout from looking paper-thin from the side. Pure helper + billboard behavior unit-tested.
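
One way to compute an upright, yaw-only facing rotation (the variant a later commit in this PR switches to) is sketched below. It assumes the billboard's front faces +Z in its local frame, as a three.js PlaneGeometry does; the function name and `Vec3`/`Quat` types are hypothetical, not the PR's quaternionFacingCamera.

```typescript
type Vec3 = {x: number; y: number; z: number};
type Quat = {x: number; y: number; z: number; w: number};

/**
 * Hypothetical helper: rotation about the world Y axis that turns an object's
 * local +Z toward the camera, ignoring the height difference so it stays upright.
 */
function yawFacingCamera(objectPosition: Vec3, cameraPosition: Vec3): Quat {
  const dx = cameraPosition.x - objectPosition.x;
  const dz = cameraPosition.z - objectPosition.z;
  // atan2(dx, dz) is the angle from +Z toward +X; it is 0 when the camera is
  // straight ahead on +Z (or directly above/below, where dx = dz = 0).
  const yaw = Math.atan2(dx, dz);
  return {x: 0, y: Math.sin(yaw / 2), z: 0, w: Math.cos(yaw / 2)};
}
```

Because only the horizontal offset is used, looking down at an object from above does not tilt it backward, which is what gives the standee look.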

Add a netblocks-styled 🎙️ push-to-talk button (speech was undiscoverable before) and move the status HUD to the top-left so it no longer collides with the simulator's settings gear.

Add an opt-in GenerativeOptions.relief that builds the object as a densely subdivided plane displaced by the generated image's brightness (three.js displacementMap + bumpMap on a lit standard material) instead of a flat cutout, giving real shaded surface relief. Approximate (brightness is not true depth) and needs a light in the scene; default off. Structure unit-tested.
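
To make the "brightness is not true depth" caveat concrete, here is a sketch of turning RGBA pixels into a height field that a displacement map could be built from. The function name is hypothetical, and the Rec. 709 luma weights and the alpha masking are assumptions; the PR does not say which brightness formula it uses.

```typescript
/**
 * Hypothetical: per-pixel height in [0, 1] from perceived brightness, scaled by
 * alpha so keyed-out (transparent) background pixels stay flat.
 */
function luminanceHeightMap(rgba: Uint8ClampedArray): Float32Array {
  const pixelCount = Math.floor(rgba.length / 4);
  const heights = new Float32Array(pixelCount);
  for (let i = 0; i < pixelCount; i++) {
    const r = rgba[i * 4] ?? 0;
    const g = rgba[i * 4 + 1] ?? 0;
    const b = rgba[i * 4 + 2] ?? 0;
    const a = rgba[i * 4 + 3] ?? 0;
    // Rec. 709 luma weights (assumed choice).
    const luminance = (0.2126 * r + 0.7152 * g + 0.0722 * b) / 255;
    heights[i] = luminance * (a / 255);
  }
  return heights;
}
```

A bright highlight on a dark object ends up "raised" under this mapping regardless of its real shape, which is why the relief is approximate and opt-in.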

Press R to switch subsequently summoned objects between flat cutout and 2.5D relief (pausing billboarding so you can orbit the relief), and add ambient + directional lights so the lit relief material shows shading.

Raycast the camera forward against the depth mesh and place the object there: stand it on horizontal surfaces, float it off vertical ones so it doesn't blend into walls, falling back to in-front-of-camera with no hit. Also opt the material into the occlusion shader (the layer alone only builds the mask) so it's hidden behind real geometry.
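
The stand-versus-float decision can be sketched from the hit point and surface normal alone. The names, the 0.7 up-facing threshold, the 3 cm wall offset, and the assumption that the object's origin sits at its center are all illustrative, not values from the PR.

```typescript
type Vec3 = {x: number; y: number; z: number};

/**
 * Hypothetical grounding rule: on an up-facing surface (floor, table) rest the
 * object's bottom edge on the hit point; on any other surface push the object a
 * few centimeters out along the surface normal so it does not blend into it.
 */
function groundOnSurfaceSketch(
  hitPoint: Vec3,
  hitNormal: Vec3, // assumed to be unit length
  objectHeight: number,
  wallOffset = 0.03, // meters, assumed
  upThreshold = 0.7 // normal.y above this counts as horizontal, assumed
): Vec3 {
  if (hitNormal.y >= upThreshold) {
    // Origin is assumed to be the object's center, so lift by half its height.
    return {x: hitPoint.x, y: hitPoint.y + objectHeight / 2, z: hitPoint.z};
  }
  return {
    x: hitPoint.x + hitNormal.x * wallOffset,
    y: hitPoint.y + hitNormal.y * wallOffset,
    z: hitPoint.z + hitNormal.z * wallOffset,
  };
}
```

When the raycast returns no hit, the caller would skip this and use the in-front-of-camera fallback instead.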

Add a draggable uiblocks control panel (summon/speak/relief/clear) that head-leashes to follow the user, plus a top-right on-screen button bar and a push-to-talk voice button. Summoning is now via the controls/voice only (removed click-to-spawn). Also enable spatial UI + the depth texture for occlusion, and use the 'flare' icon for summon.

draggable=true alone wasn't enough: DragManager.beginDragging bails when there's no draggingMode, so grabbing never started. Set draggingMode to TRANSLATING.

ruofeidu · 2026-06-26T22:31:13Z

Hi Salman,

Thank you for your contribution in this!!! I would like to request to switch to a demo.
Also the demo ignores keys.json file I used to debug locally.

I won't say this is ready to be put inside the SDK (for now).
Even for inside SDK, it should be under ai.generateBillboard(image), etc.

We need to carefully think of the high-level picture of generativeAssets:

abstract --------> photorealistic
photo vs. mesh vs. 3D Gaussians etc.
LLM / local model vs. cloud models

Internally, we have a demo like this, but with better quality & confidential tech :)

salmanmkc · 2026-06-27T03:00:13Z

you sure the keys.json wasn't in the repo root or the wrong folder? should be relative

and sure will move this to demo only, wow that's cool to know there's a better internal demo, is it sorta similar to likeness level quality?

Per review on google#405, the generative objects feature moves out of the SDK and becomes a demo. Remove the xb.core.generative subsystem, the Options enableGenerativeObjects()/GenerativeOptions, the barrel exports for the orchestrator/object/options/texture-source, and the SKILL.md references. BackgroundKeyer (pure RGBA chroma-key) and GenerativeObjectUtils (generic billboard/face-camera math) stay in src/ as small, unit-tested helpers the demo imports; the orchestration moves to the demo in the next commit.

The generative objects orchestration now lives in demos/generative_object/src/ (GenerativeObjects, GenerativeObject, GenerativeOptions, TextureSource) built by rollup like the drone/animalattack demos, instead of the SDK. The demo owns a GenerativeObjects script and adds it via xb.add() so dependency injection still resolves AI/camera/scene/depth.

Fixes carried over from the review while moving:

- only wire depth occlusion when depth is actually present, so objects don't render transparent against an empty occlusion map when depth is off
- prefer the depth mesh's geometric face normal for surface orientation (the per-vertex normals are not kept fresh) and update the full-resolution mesh so placement raycasts hit current geometry
- a generation token so an in-flight generate that resolves after clearObjects() is discarded instead of adding a stale object
- build the relief displacement map lazily and dispose every distinct texture

Also splits generation into generateBillboard(image), the image-to-object half, to sketch where an SDK ai.generateBillboard(image) could sit, and loads the built src/build/main.js with the keys.json root fallback.
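
The generation-token fix can be illustrated with a small counter that is bumped on clear and compared once the async call resolves. `GenerationGate` is a hypothetical name; the demo presumably inlines the same idea in its GenerativeObjects script.

```typescript
/** Hypothetical sketch of a generation token guarding async results. */
class GenerationGate {
  private token = 0;

  /** Invalidate every request currently in flight (e.g. from clearObjects()). */
  invalidate(): void {
    this.token++;
  }

  /**
   * Run `generate` and return its result, or null if `invalidate()` was called
   * while the request was still pending.
   */
  async run<T>(generate: () => Promise<T>): Promise<T | null> {
    const startedWith = this.token;
    const result = await generate();
    return startedWith === this.token ? result : null;
  }
}
```

A caller would only add the object to the scene when `run` returns non-null, and would dispose any texture it created for a discarded result.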

Two reasons a summoned object could be hard to see:

- groundOnSurface raycast placed it on whatever surface was ahead, so looking across the room dropped it on a far wall, tiny and easy to miss. Cap the grounding distance (maxGroundDistance, 2 m) and fall back to in-front placement when the surface is farther.
- the image prompt asked for a white background, which the background keyer then cut out along with pale subjects (a white paper airplane vanished). Ask for a saturated chroma-green background instead so the corner-sampled keyer keeps any non-green subject.
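
The distance cap amounts to a small decision between the surface hit and the in-front fallback. A sketch, with hypothetical type and function names and the 2 m default taken from the commit message:

```typescript
type Vec3 = {x: number; y: number; z: number};
type SurfaceHit = {point: Vec3; distance: number};

/**
 * Hypothetical: use the surface hit only when one exists and it is within
 * `maxGroundDistance` meters; otherwise use the in-front-of-camera position.
 */
function resolvePlacement(
  hit: SurfaceHit | null,
  inFrontPosition: Vec3,
  maxGroundDistance = 2
): {position: Vec3; grounded: boolean} {
  if (hit !== null && hit.distance <= maxGroundDistance) {
    return {position: hit.point, grounded: true};
  }
  return {position: inFrontPosition, grounded: false};
}
```

The `grounded` flag is an illustrative extra so a caller could tell which branch was taken, for example to skip surface-orientation logic on the fallback path.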

The control buttons' idle/hover fill colors (#2a2a2a -> #3a3a3a) differed by only a few percent of brightness, so hovering produced no perceptible change. Use a dark chip for idle and a clear purple for hover (with a brighter click flash), matching the agent_hands demo.

…panel

The depth mesh is in the scene for occlusion and surface placement, so the reticle's whole-scene raycast also hits it; standing within ~1m of a wall makes it the closest hit and grabs hover from the control panel. No-op the depth mesh's raycast so the reticle skips it, and restore the real raycast briefly inside raycastSurface_ so object placement still grounds on the geometry. Same approach as the agent_hands demo.
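
The no-op-then-restore pattern can be sketched against any object that exposes a `raycast(raycaster, intersects)` method, which is the shape three.js uses on Object3D. The helper names are hypothetical, and the types are loosened to `unknown` so the sketch has no three.js dependency.

```typescript
type RaycastFn = (raycaster: unknown, intersects: unknown[]) => void;
interface Raycastable {
  raycast: RaycastFn;
}

const noopRaycast: RaycastFn = () => {};

/** Make `target` invisible to raycasts; returns the original method for later use. */
function hideFromRaycasts(target: Raycastable): RaycastFn {
  const original = target.raycast;
  target.raycast = noopRaycast;
  return original;
}

/** Temporarily restore the real raycast for the duration of `fn`. */
function withRealRaycast<T>(
  target: Raycastable,
  original: RaycastFn,
  fn: () => T
): T {
  const current = target.raycast;
  target.raycast = original;
  try {
    return fn();
  } finally {
    // Always re-hide, even if the placement raycast throws.
    target.raycast = current;
  }
}
```

The `try`/`finally` matters here: if the placement raycast throws while the real method is restored, the depth mesh would otherwise stay visible to the reticle and the hover bug would return.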

salmanmkc added 23 commits June 23, 2026 12:45

docs(SKILL): document the generative objects primitive

3557c41

demos(generative_object): enable Gemini (importmap + key entry)

4764527

demos(generative_object): push-to-talk button + move status HUD

1a22102

demos(generative_object): R toggles 2.5D relief + add scene lights

ca199b7

generative: keep billboards upright (yaw-only facing)

fc8498b

generative: add alpha-masked displacement map helper

6de2677

generative: build a relief displacement texture in the canvas source

313a15f

generative: use the masked displacement map for relief

a4230ff

generative: add groundOnSurface placement option

400e77e

generative: soften the default relief strength

7b86657

generative: make objects actually draggable (translating mode)

c080cae

Merge branch 'main' into feat/generative-object

037833d

generative: trim verbose comments and docstrings

1febc5b

Merge branch 'main' into feat/generative-object

85db17e

Merge branch 'main' into feat/generative-object

94e41cb

dli7319 self-requested a review June 26, 2026 22:19

ruofeidu self-assigned this Jun 26, 2026

ruofeidu added the demo label (New demo for XR Blocks demonstrating novel interactivity or perception features.) Jun 26, 2026

ruofeidu marked this pull request as draft June 26, 2026 22:31

ruofeidu self-requested a review June 26, 2026 22:31

ruofeidu added the algorithm label (spatial algorithm) Jun 26, 2026

ruofeidu removed the algorithm label (spatial algorithm) Jun 26, 2026

ruofeidu added this to Agent Blocks Jun 26, 2026

github-project-automation Bot moved this to Todo in Agent Blocks Jun 26, 2026

ruofeidu removed this from Agent Blocks Jun 26, 2026

salmanmkc added 4 commits June 27, 2026 11:38

Merge branch 'main' into feat/generative-object

cd7da30

salmanmkc marked this pull request as ready for review June 27, 2026 05:42

salmanmkc added 2 commits June 28, 2026 10:55

salmanmkc wants to merge 29 commits into google:main from salmanmkc:feat/generative-object

2 participants
