SceneSense: Diffusion Models for 3D Occupancy Synthesis from Partial Observation

Alec Reed
Brendan Crowe
Doncey Albin
Lorin Achey
Bradley Hayes
Chris Heckman
University of Colorado - Boulder

We present SceneSense, a novel generative 3D diffusion model for synthesizing 3D occupancy information from partial observations. SceneSense uses a running occupancy map and a single RGB-D camera to generate predicted geometry around the platform, even when that geometry is occluded or out of view. The architecture of our framework ensures that the generative model never overwrites observed free or occupied space, making SceneSense a low-risk addition to any robotic planning stack.

Photo example results


Our occupancy in-painting method ensures that observed space remains intact while integrating SceneSense predictions. Drawing inspiration from image in-painting techniques used in image diffusion and guided image synthesis, our approach continuously incorporates known occupancy information during inference. To perform occupancy in-painting, we select a portion of the occupancy map for diffusion and generate masks for its observed occupied and unoccupied voxels. These masks guide the diffusion process so that it modifies only voxels that have not been observed, while noise is introduced at each step. This iterative process, depicted below, improves the accuracy of scene predictions while preventing the model from altering observed geometry.
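The in-painting loop described above can be illustrated with a minimal sketch. It is not the SceneSense implementation: the function names, the linear noise schedule, the +1/-1 occupancy encoding, and the `toy_denoiser` stand-in for the trained diffusion network are all assumptions made for illustration. The sketch only shows how occupied and free masks could be used to write observed voxels back into the sample at every reverse step, so that the model can change only unobserved space.

```python
import numpy as np


def toy_denoiser(x, t):
    """Hypothetical stand-in for a trained denoising network (assumption)."""
    return 0.9 * x


def inpaint_occupancy(observed_occupied, observed_free, denoise_step,
                      num_steps=50, seed=0):
    """Mask-guided reverse diffusion over a voxel grid (illustrative only).

    observed_occupied, observed_free: boolean arrays of the same shape.
    denoise_step: callable (x, t) -> x, one reverse step of some model.
    Returns a boolean occupancy grid.
    """
    rng = np.random.default_rng(seed)
    known_mask = observed_occupied | observed_free
    # Assumed encoding: +1 for occupied, -1 for free. Values at
    # unobserved voxels are never used because of the mask.
    known = np.where(observed_occupied, 1.0, -1.0)

    # Assumed linear beta schedule, as commonly used in diffusion models.
    betas = np.linspace(1e-4, 0.02, num_steps)
    alpha_bar = np.cumprod(1.0 - betas)

    # Start from pure noise over the selected region of the map.
    x = rng.standard_normal(known.shape)
    for t in range(num_steps - 1, -1, -1):
        # The model proposes a less noisy sample for every voxel.
        x = denoise_step(x, t)
        # Noise the observed voxels to the level of the next step,
        # or use them unchanged on the final step.
        if t > 0:
            ab = alpha_bar[t - 1]
            noise = rng.standard_normal(known.shape)
            noised_known = np.sqrt(ab) * known + np.sqrt(1.0 - ab) * noise
        else:
            noised_known = known
        # Overwrite observed voxels; keep the model output elsewhere.
        x = np.where(known_mask, noised_known, x)
    return x > 0
```

Because the observed voxels are written back on the final step without noise, the returned grid agrees with the observation wherever a voxel was marked occupied or free, regardless of what the denoiser produces.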

SceneSense Framework

Presentation Video


      title={SceneSense: Diffusion Models for 3D Occupancy Synthesis from Partial Observation}, 
      author={Alec Reed and Brendan Crowe and Doncey Albin and Lorin Achey and Bradley Hayes and Christoffer Heckman},