Exploring Apple's Stereoscopic Video

March 12, 2024
howto, formats

Apple's recent introduction of a stereoscopic ("3D") video storage format for visionOS, the operating system of the Apple Vision Pro headset, has sparked a surge of interest. Support for the format has begun to appear in video tools such as encoders and muxers, and we have integrated support for it into the VTCLab Media Analyzer.

In essence, Apple chose not to reinvent the wheel and built on the MV-HEVC (Multiview HEVC) standard. MV-HEVC stores both stereo views in a single track, rather than in separate tracks for the left-eye and right-eye videos as in traditional stereo encoding. This improves compression efficiency, because the images for the two eyes are usually very similar and the second view can be predicted from the first. It also simplifies frame synchronization.
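
To make the single-track idea more concrete: HEVC tags every NAL unit with a layer identifier, which is what lets two views share one stream. The Python sketch below decodes the two-byte HEVC NAL unit header and collects the `nuh_layer_id` values from a buffer. It assumes length-prefixed NAL units with a 4-byte length field (the actual field size is signaled in the decoder configuration record), and the function names are illustrative, not taken from any particular tool.

```python
def parse_hevc_nal_header(header: bytes) -> dict:
    """Decode the two-byte HEVC NAL unit header.

    Bit layout: forbidden_zero_bit (1), nal_unit_type (6),
    nuh_layer_id (6), nuh_temporal_id_plus1 (3).
    """
    if len(header) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    value = int.from_bytes(header[:2], "big")
    return {
        "forbidden_zero_bit": (value >> 15) & 0x01,
        "nal_unit_type": (value >> 9) & 0x3F,
        "nuh_layer_id": (value >> 3) & 0x3F,
        "temporal_id": (value & 0x07) - 1,
    }


def layer_ids(buffer: bytes, length_size: int = 4) -> list:
    """Return the nuh_layer_id of every length-prefixed NAL unit in buffer.

    length_size=4 is an assumption; real files signal it in the
    decoder configuration record.
    """
    ids = []
    pos = 0
    while pos + length_size <= len(buffer):
        nal_len = int.from_bytes(buffer[pos:pos + length_size], "big")
        pos += length_size
        if nal_len < 2 or pos + nal_len > len(buffer):
            raise ValueError("truncated or malformed NAL unit")
        ids.append(parse_hevc_nal_header(buffer[pos:pos + 2])["nuh_layer_id"])
        pos += nal_len
    return ids
```

In a two-view stream, one would expect NAL units with `nuh_layer_id` 0 for the base view and 1 for the second view.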

Building on MV-HEVC, Apple has added several descriptors that make it easier to identify and describe stereoscopic video settings without intricate MV-HEVC stream parsing. Although these descriptors can also be used with codecs other than MV-HEVC, we'll focus on MV-HEVC for now.

Now, let's delve into the key similarities and differences between 3D video and standard video streams.

  • The container is the conventional QuickTime / ISOBMFF.

  • The main HEVC headers (VPS / SPS / PPS) of the base layer are still housed in the 'hvcC' box (refer to the image below, highlighted in blue).

  • The main HEVC headers (VPS / SPS / PPS) of the additional layer (for the second eye) are located in the 'lhvC' box (refer to the image below, highlighted in green).

  • Frames of layer 0 and layer 1 are interleaved: a frame from layer 1 always follows its corresponding frame from layer 0. In the image below, frames for layer 0 (left eye by default) are marked with blue arrows and frames for layer 1 (right eye by default) with green arrows.

  • There is a 'vexu' box ('VideoExtendedUsageBox'), containing an 'eyes' box ('StereoViewBox'), which in turn houses a 'stri' box ('StereoViewInformationBox').

  • 'vexu' and 'eyes' serve as simple containers, acting as wrappers for their child boxes.

  • The 'stri' box contains the actual data, with the most notable fields being:

    • has_right_eye_view
    • has_left_eye_view (self-explanatory)
    • eye_views_reversed: By default, video for the left eye precedes that for the right eye. A value of 1 indicates reverse order.
    • has_additional_views: This flag is set if there are other views besides the left and right eyes (e.g., a central view).
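
The box nesting and flags listed above can be read with a few lines of Python. This is a minimal sketch, not the analyzer's implementation: it assumes 'vexu' and 'eyes' are plain containers, that 'stri' is a FullBox (1 byte version, 3 bytes flags) followed by a single byte holding the four flags in its low bits, and that the input already starts at the child boxes of the video sample entry. The bit positions follow our reading of Apple's specification and should be checked against it; all function names are hypothetical.

```python
import struct


def iter_boxes(data: bytes, start: int = 0, end=None):
    """Yield (type, payload_start, payload_end) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:  # a 64-bit largesize follows the type
            if pos + 16 > end:
                raise ValueError("truncated largesize box header")
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:  # box extends to the end of its container
            size = end - pos
        if size < header or pos + size > end:
            raise ValueError("malformed box size")
        yield box_type.decode("latin-1"), pos + header, pos + size
        pos += size


def find_child(data: bytes, start: int, end: int, wanted: str):
    """Return the payload span of the first child box named wanted, or None."""
    for box_type, payload_start, payload_end in iter_boxes(data, start, end):
        if box_type == wanted:
            return payload_start, payload_end
    return None


def decode_stri(payload: bytes) -> dict:
    """Decode the 'stri' payload (assumed bit layout, see the note above)."""
    if len(payload) < 5:
        raise ValueError("'stri' payload too short")
    bits = payload[4]  # skip 1 byte version + 3 bytes flags
    return {
        "eye_views_reversed": bool(bits & 0x08),
        "has_additional_views": bool(bits & 0x04),
        "has_right_eye_view": bool(bits & 0x02),
        "has_left_eye_view": bool(bits & 0x01),
    }


def read_stereo_info(data: bytes):
    """Walk vexu -> eyes -> stri and return the decoded flags, or None."""
    span = (0, len(data))
    for name in ("vexu", "eyes", "stri"):
        span = find_child(data, span[0], span[1], name)
        if span is None:
            return None
    return decode_stri(data[span[0]:span[1]])


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a 32-bit size field (used only for the example)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic example: both eye views present, default (left first) order.
stri = make_box(b"stri", bytes([0, 0, 0, 0, 0x03]))
vexu = make_box(b"vexu", make_box(b"eyes", stri))
print(read_stereo_info(vexu))
```

Reaching 'vexu' in a real file additionally requires descending through the track's sample description down to the video sample entry, which is omitted here for brevity.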

According to the specification, the 'eyes' box may contain other boxes as well, such as the 'hero' box ('HeroStereoEyeDescriptionBox'), which indicates the designated hero eye. If signaled, this suggests that the other eye's view is derived from the hero eye's view, which can be helpful in monoscopic viewing settings.

For further details, refer to the specification: https://developer.apple.com/av-foundation/Stereo-Video-ISOBMFF-Extensions.pdf

With the foundation set by Apple and the broader industry's ongoing efforts, the future of stereoscopic video looks promising, offering exciting possibilities for immersive storytelling and interactive media experiences. Stay tuned for more updates and insights as we navigate the ever-evolving landscape of digital media and technology.
