Inspiration

Experimenting with the Meta Quest 3's depth sensing revealed a significant limitation: hardware sensors (ToF/LiDAR) lose accuracy beyond a few meters. I realized that, no matter the distance, the sky is always the farthest point in view. So instead of relying on raw depth maps, I used semantic segmentation to create "infinite depth" occlusion, enabling interactions with distant skylines.

What it does

Sky Canvas layers virtual content into the real world by extracting sky segmentation masks in real time. By manipulating the depth buffer, the app ensures that real-world buildings visually occlude virtual objects, creating the illusion that virtual elements such as spaceships or planets are floating in the atmosphere while still respecting the city's geometry.

How we built it

We used Unity and Niantic Lightship ARDK. To control the depth buffer, we built a custom rendering pipeline using a generic quad and a custom HLSL shader (AR_IntelligentBackdrop_V3).

1. Custom Shader & Sharpening We manually decode YUV camera data. To keep Passthrough crisp, we apply a Laplacian sharpening filter to the Luma channel before color conversion:

// Luma Sharpening
float laplacian = centerY * 4.0 - (up + down + left + right);
centerY += laplacian * _Sharpening;
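
After sharpening, the luma is recombined with the chroma samples and converted to RGB. The snippet below is a minimal sketch of that step, assuming a cbcr variable holding the sampled chroma pair and BT.601 coefficients; the actual shader's conversion matrix depends on the camera's color space:

// Recombine sharpened luma with chroma (BT.601 coefficients; sketch only)
float2 chroma = cbcr - 0.5;                              // cbcr: sampled Cb/Cr in [0, 1]
float3 rgb;
rgb.r = centerY + 1.402 * chroma.y;                      // chroma.y = Cr offset
rgb.g = centerY - 0.344 * chroma.x - 0.714 * chroma.y;
rgb.b = centerY + 1.772 * chroma.x;                      // chroma.x = Cb offset
return float4(saturate(rgb), 1.0);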

2. Depth Priming Logic

The core logic manipulates the fragment depth (SV_Depth) based on the semantic mask. This forces the GPU to treat buildings as "near" (so they occlude everything) and the sky as "infinite" (so it sits behind everything).

$$ \text{Depth}_{\text{pixel}} = \begin{cases} \text{FarPlane}, & \text{if } \text{Mask} > \text{Threshold} \\ \text{NearPlane}, & \text{if } \text{Mask} \leq \text{Threshold} \end{cases} $$

// Intelligent Depth Writing (Platform Agnostic)
if (isSky) {
    #if defined(UNITY_REVERSED_Z)
        o.depth = 0.0001; // Far (reversed-Z: 0 is the far plane)
    #else
        o.depth = 0.9999; // Far
    #endif
} else {
    #if defined(UNITY_REVERSED_Z)
        o.depth = 1.0; // Near (reversed-Z: 1 is the near plane)
    #else
        o.depth = 0.0; // Near
    #endif
}
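
For context, the sketch below shows how such a pass can be structured in Unity's built-in pipeline: the quad renders in the Background queue with ZWrite On and ZTest Always, and the fragment program exposes an SV_Depth output. Property names and the overall layout are illustrative, not the actual AR_IntelligentBackdrop_V3 source.

Shader "Hidden/IntelligentBackdrop_Sketch"
{
    Properties
    {
        _SemanticTex ("Sky Mask", 2D) = "black" {}
        _Threshold ("Sky Threshold", Range(0, 1)) = 0.5
    }
    SubShader
    {
        // Render before everything else and always win the depth test,
        // so this quad fully controls what ends up in the depth buffer.
        Tags { "Queue" = "Background" "RenderType" = "Opaque" }
        ZWrite On
        ZTest Always
        Cull Off

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #include "UnityCG.cginc"

            sampler2D _SemanticTex; // sky-probability mask from the segmentation model
            float _Threshold;

            struct v2f
            {
                float4 pos : SV_POSITION;
                float2 uv  : TEXCOORD0;
            };

            v2f vert (appdata_img v)
            {
                v2f o;
                o.pos = UnityObjectToClipPos(v.vertex);
                o.uv = v.texcoord;
                return o;
            }

            struct FragOutput
            {
                fixed4 color : SV_Target; // passthrough color
                float  depth : SV_Depth;  // primed depth value
            };

            FragOutput frag (v2f i)
            {
                FragOutput o;
                bool isSky = tex2D(_SemanticTex, i.uv).r > _Threshold;
                #if defined(UNITY_REVERSED_Z)
                    o.depth = isSky ? 0.0001 : 1.0; // far : near
                #else
                    o.depth = isSky ? 0.9999 : 0.0;
                #endif
                o.color = fixed4(0, 0, 0, 1); // the real shader outputs the decoded camera feed here
                return o;
            }
            ENDCG
        }
    }
}

Because the pass always wins the depth test, the draw order of other renderers does not matter: virtual objects simply test against the primed near/far values.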

3. Aspect Ratio Correction

The neural network's output rarely matches the screen's aspect ratio. We wrote a C# controller that computes crop parameters (a scale and offset) to keep the mask from drifting against the camera feed:

// Fix stretching between the neural-net mask and the screen
float scaleY = camAspect / _currentMaskAspect;   // camera (screen) aspect vs. mask aspect
float offsetY = (1.0f - scaleY) / 2.0f;          // center the crop vertically
quadRenderer.material.SetVector(_CropParamsId, new Vector4(1.0f, scaleY, 0.0f, offsetY));
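
On the shader side, the mask UV is then remapped with those parameters. A minimal sketch, assuming _CropParamsId corresponds to a float4 property named _CropParams packed as (scaleX, scaleY, offsetX, offsetY):

// Remap the quad's screen UV into the mask's UV space (sketch)
float4 _CropParams; // x,y = scale; z,w = offset, set from C# above

float2 GetMaskUV(float2 screenUV)
{
    return screenUV * _CropParams.xy + _CropParams.zw;
}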

Challenges we ran into

The Niantic model outputs segmentation for only a single eye (mono). Running inference twice for stereo rendering killed performance on mobile chipsets. We pivoted to a "Passthrough Quad" approach, rendering the background in mono while keeping virtual content in stereo to maintain smooth framerates.
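
A rough sketch of that setup, assuming a hypothetical PassthroughQuadSetup component that parents the quad to the XR camera and sizes it to fill the view; names and values are illustrative, not the project's actual code:

// Sketch: attach the passthrough quad to the head camera and scale it
// to cover the view frustum at a fixed distance, so the backdrop
// material (and its mono mask) fills the frame.
using UnityEngine;

public class PassthroughQuadSetup : MonoBehaviour
{
    [SerializeField] private Camera headCamera;         // the XR rig's camera
    [SerializeField] private Material backdropMaterial; // AR_IntelligentBackdrop_V3
    [SerializeField] private float distance = 10f;      // meters in front of the camera

    private void Start()
    {
        GameObject quad = GameObject.CreatePrimitive(PrimitiveType.Quad);
        quad.transform.SetParent(headCamera.transform, false);
        quad.transform.localPosition = new Vector3(0f, 0f, distance);

        // Size the quad so it exactly covers the camera frustum at 'distance'.
        float height = 2f * distance * Mathf.Tan(headCamera.fieldOfView * 0.5f * Mathf.Deg2Rad);
        quad.transform.localScale = new Vector3(height * headCamera.aspect, height, 1f);

        quad.GetComponent<MeshRenderer>().material = backdropMaterial;
        Destroy(quad.GetComponent<Collider>()); // no physics needed for the backdrop
    }
}

With this layout, segmentation inference only has to run once per frame, while Unity still renders the virtual content per eye.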

Accomplishments that we're proud of

  • Infinite Occlusion: Surpassing the 5-meter limit of hardware sensors to interact with kilometer-distant objects.
  • UV Math: Solving the complex aspect ratio alignment between the camera feed and the neural network output.

What we learned

I deepened my knowledge of HLSL Z-buffer manipulation (ZTest Always, ZWrite On) and the importance of matrix transformations for aligning neural net textures with physical camera feeds.

What's next for Sky Canvas

We aim to train a custom, lightweight segmentation model (ONNX, running on Unity Sentis) fast enough for true stereo rendering. This would enable commercial "World Scale Ads" and large-scale immersive games where enemies emerge from real clouds.

Built With

  • Unity
  • Niantic Lightship ARDK
  • HLSL
  • C#