An overview of the OpenXR architecture, with explanations of its important concepts and components

This article briefly introduces the architecture of a native OpenXR app and explains some key XR concepts. Note: any platform- or vendor-specific content in this article comes from public channels and has nothing to do with internal information of the platforms/vendors involved.

Preface

This article is based on notes I wrote a long time ago; I recently found the time to tidy them up and publish them here.
Since the text is long and the pictures are few, reading in eye-protection mode is recommended.

After reading this article, you will understand the following:

  • How OpenXR came to be
  • The current state of OpenXR
  • How an OpenXR app connects to the runtime
  • The relationship between the Loader and the Runtime
  • How OpenXR controls the app lifecycle
  • What Instance, Session, Swapchain, and XrSpace are
  • What the Compositor is, and what layer composition and ATW are

What this article does not cover:

  • What each OpenXR function does
  • How to implement a Runtime (for that, see the open-source project Monado)
  • Why XXXXX

Background

First, a brief look at how OpenXR came to be and where it stands today.

The old days

In the early days, XR device manufacturers and software vendors each went their own way, so different platforms had different XR rendering and data-communication methods (Mobile VR SDKs, Valve's OpenVR, and so on).

Although mainstream game engines have long tried to smooth over these development differences, their approaches are themselves part of the fragmentation (for example, the Unity XR SDK requires device manufacturers to implement its interfaces in the form of provider plugins).

From a native development perspective, if an app based on OpenVR (SteamVR) is to be ported to a standalone Android headset, it must first swap out the SDK and then deal with a pile of errors and platform differences. If the application has not wrapped its own "difference-smoothing layer", this change is no different from a major rewrite.

OpenXR was born

Hence, OpenXR was born. It is a common XR API led by Khronos with participation from many companies. It aims to smooth over the differences (fragmentation) between hardware and platforms, so that an app can run on hardware from multiple device manufacturers without major changes.

In other words, it specifies a unified set of structures/functions/interfaces, so that an XR app's API calls, lifecycle management, and so on all follow one standard. No matter what the hardware is, the function signatures (return values, method names, parameters) through which the app communicates with the device are the same.
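
For example, querying the XR system handle looks exactly the same on every conformant runtime. A minimal sketch (error handling omitted; assumes an XrInstance named instance already exists):

    XrSystemGetInfo systemInfo{XR_TYPE_SYSTEM_GET_INFO};
    systemInfo.formFactor = XR_FORM_FACTOR_HEAD_MOUNTED_DISPLAY;

    XrSystemId systemId = XR_NULL_SYSTEM_ID;
    // The same call, with the same signature, on any OpenXR runtime.
    xrGetSystem(instance, &systemInfo, &systemId);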

The implementation that actually smooths over these differences lives in the XR Runtime, which is built into each OS. At runtime, the app talks to the XR Runtime through the OpenXR Loader bundled into it.

The hardware manufacturer is responsible for implementing the XR Runtime. From an app developer's perspective, the underlying hardware interaction, ATW, the Compositor, and so on are a black box.

The current state

The ideal is beautiful, but the reality is harsh. Although all mainstream manufacturers have announced OpenXR support, app development is still far from the ideal:

  • The OpenXR Loader for mobile XR devices has no standard implementation and depends heavily on the hardware manufacturer. Because the Loader must be packaged and shipped with the application, you cannot compile once and run everywhere.
  • Hardware manufacturers use extensions (XR_[vendor]_FunctionName) to bring non-standard hardware capabilities into OpenXR, building feature barriers in the hope of winning a say in future standardization (ship first, standardize later), which further increases fragmentation (see the sketch below).
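
As a concrete example of the vendor-extension pattern, the real extension XR_FB_display_refresh_rate adds functions carrying the FB suffix. A minimal sketch, assuming the extension was enabled at instance creation; extension functions are not exported directly by the Loader, so they must be fetched through xrGetInstanceProcAddr:

    // Assumes "XR_FB_display_refresh_rate" was listed in
    // XrInstanceCreateInfo::enabledExtensionNames when `instance` was created.
    PFN_xrRequestDisplayRefreshRateFB pfnRequestRefreshRate = nullptr;
    xrGetInstanceProcAddr(instance, "xrRequestDisplayRefreshRateFB",
                          reinterpret_cast<PFN_xrVoidFunction*>(&pfnRequestRefreshRate));
    if (pfnRequestRefreshRate != nullptr) {
        pfnRequestRefreshRate(session, 90.0f);  // ask the runtime for a 90 Hz display mode
    }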

In addition, although OpenXR is a royalty-free standard, official certification of an OpenXR Runtime requires a fee; without it, a hardware manufacturer may not call its Runtime "conformant", nor use the OpenXR logo and related marks.

Of course, manufacturers are also trying to reduce fragmentation. Taking one domestic manufacturer as an example, last year's new SDK version abandoned some of its special extensions in favor of including the FB/Meta extensions (though in practice this also strengthens FB's voice, turning the FB extensions into a de facto common standard).

Overview of the OpenXR app architecture

The workflow of a traditional 3D app can be summarized by simple code along the following lines (a generic sketch; the function names are illustrative):
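
    Init();                   // create window, graphics device, load assets
    while (!quitRequested) {
        PollInput();          // OS events, controller/keyboard input
        UpdateLogic();        // game logic, animation, physics
        RenderFrame();        // record and submit draw calls
        Present();            // swap buffers; the image reaches the screen
    }
    Shutdown();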

The overall architecture of an OpenXR 3D app is much the same as the simplified code above. Although the figure below looks complex, it is essentially still one big loop.

As the figure above shows, the main purpose of OpenXR is to standardize the lifecycle and call interface of an XR app. Application logic and rendering are interleaved into the corresponding lifecycle stages and are implemented by developers themselves.

In other words, although you must specify the graphics API when creating an OpenXR Session, OpenXR is not responsible for teaching you how to draw with that graphics API; it only hands over the basic environment. Many VR players hold the misconception that "a given game will perform better once it switches to OpenXR". In reality, application performance depends mainly on the application developer, and to a lesser extent on runtime efficiency; it has little to do with the OpenXR API itself.

Take Virtual Desktop (VD) as an example: the streaming performance of Microsoft Flight Simulator 2020 under VDXR is noticeably better, because the VD OpenXR Runtime focuses on streaming and handles it more efficiently than the OpenXR runtime built into SteamVR.

Relevant performance data can be seen in the video I uploaded: https://www.bilibili.com/video/BV1Nz4y1N7Wu

Important OpenXR components and concepts

Below, I will mainly cover the components and concepts that I think are important.

Loader

A module bundled into the app, responsible for loading the runtime library and maintaining communication between the app and the OS's XR Runtime.

From a developer's perspective, the OpenXR API we call ultimately depends on the Loader doing its work.
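
Even the very first calls, made before any XrInstance exists, already go through the Loader: it locates the active runtime (on Windows typically via the registry, on Linux via an active_runtime.json manifest, on Android via a vendor-provided loader or broker) and dispatches the call to it. A minimal sketch:

    // Ask the runtime (via the Loader) which extensions it supports.
    // Needs #include <vector> and #include <openxr/openxr.h>.
    uint32_t count = 0;
    xrEnumerateInstanceExtensionProperties(nullptr, 0, &count, nullptr);

    std::vector<XrExtensionProperties> props(count, {XR_TYPE_EXTENSION_PROPERTIES});
    xrEnumerateInstanceExtensionProperties(nullptr, count, &count, props.data());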

Compositor

This is a low-level module of the Runtime; it is not exposed to developers, and there is no single standard implementation.

The compositor is responsible for many things, but the two closest to app developers are layer composition and ATW (Asynchronous TimeWarp).

The Quest compositor also has features such as Application SpaceWarp, which are omitted here.

Layer composition

In addition to the projection layer, an application can submit other layers at the same time (such as quad layers), and can even use the overlay extension to create an overlay session and submit overlay layers (for example, the Launcher, the input method, and other global panels on Quest).

The compositor is responsible for merging the multiple layers into a single image and sending it on down for display.
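
In API terms, the layers are handed over together in xrEndFrame, and the compositor blends them in order. A minimal sketch (assumes session and the frameState returned by xrWaitFrame; the layer structs are left unfilled):

    XrCompositionLayerProjection world{XR_TYPE_COMPOSITION_LAYER_PROJECTION};
    XrCompositionLayerQuad       hud{XR_TYPE_COMPOSITION_LAYER_QUAD};
    // ... fill in spaces, poses, and swapchain sub-images for both layers ...

    const XrCompositionLayerBaseHeader* layers[] = {
        reinterpret_cast<const XrCompositionLayerBaseHeader*>(&world),
        reinterpret_cast<const XrCompositionLayerBaseHeader*>(&hud),
    };

    XrFrameEndInfo endInfo{XR_TYPE_FRAME_END_INFO};
    endInfo.displayTime          = frameState.predictedDisplayTime;
    endInfo.environmentBlendMode = XR_ENVIRONMENT_BLEND_MODE_OPAQUE;
    endInfo.layerCount           = 2;
    endInfo.layers               = layers;
    xrEndFrame(session, &endInfo);  // the compositor merges both layers into one image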

ATW (Asynchronous TimeWarp)

ATW is mainly used to reduce MTP (motion-to-photon) latency.

Suppose the application takes 8 ms to finish one frame of logic plus rendering. The device pose sampled at the start of rendering has most likely changed by the time those 8 ms are up (even if only slightly), so the rendered image will "slightly" lag behind the real viewpoint. If the game runs too slowly, the image visibly fails to keep up with head movement, which aggravates motion sickness.

After the app finishes drawing and submits the frame, the ATW stage in the compositor warps the image according to the latest head pose so that it matches the current pose as closely as possible.

An obvious symptom: when a game stutters badly, some devices show black edges at the periphery as you turn your head. That is ATW at work.

Although an ATW-processed image is distorted, the frame interval is short, so at normal frame rates the distortion is not noticeable.

Why the Compositor matters

The Compositor plays a very important role. As the middle layer connecting everything before and after it, compositor optimization is a core technical focus for every vendor, and also the place where differentiation most easily appears.

Same game, same frame rate: why does it feel great on one headset and nauseating on another? Besides factors such as the 6DoF algorithms, circuitry, and hardware design, the Compositor's role cannot be ignored.

XrInstance

An object held by the app, used to communicate with the OpenXR Runtime; it connects the app to the XR Runtime.

The OpenXR Loader inside the app generally tracks the XrInstance and forwards function calls to the XR Runtime.
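
Creating the instance is usually the very first OpenXR call an app makes. A minimal sketch (error handling and extension lists omitted):

    #include <openxr/openxr.h>
    #include <cstring>

    XrInstanceCreateInfo createInfo{XR_TYPE_INSTANCE_CREATE_INFO};
    std::strncpy(createInfo.applicationInfo.applicationName, "HelloXR",
                 XR_MAX_APPLICATION_NAME_SIZE - 1);
    createInfo.applicationInfo.applicationVersion = 1;
    createInfo.applicationInfo.apiVersion         = XR_CURRENT_API_VERSION;

    XrInstance instance = XR_NULL_HANDLE;
    xrCreateInstance(&createInfo, &instance);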

XrSession

An object created through the XrInstance; the app deals with the XR system through the XrSession.
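
For example, with the XR_KHR_vulkan_enable extension, the graphics API is specified by chaining a graphics-binding struct onto the session create info. A sketch, assuming the Vulkan handles and the systemId already exist:

    XrGraphicsBindingVulkanKHR binding{XR_TYPE_GRAPHICS_BINDING_VULKAN_KHR};
    binding.instance         = vkInstance;        // existing Vulkan objects
    binding.physicalDevice   = vkPhysicalDevice;
    binding.device           = vkDevice;
    binding.queueFamilyIndex = queueFamilyIndex;
    binding.queueIndex       = 0;

    XrSessionCreateInfo createInfo{XR_TYPE_SESSION_CREATE_INFO};
    createInfo.next     = &binding;   // tells the runtime which graphics API we use
    createInfo.systemId = systemId;

    XrSession session = XR_NULL_HANDLE;
    xrCreateSession(instance, &createInfo, &session);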

Controlling the lifecycle

The app controls its lifecycle by listening for session state changes, as sketched below.
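
A minimal sketch of that state handling (assumes instance and session exist; most states omitted):

    XrEventDataBuffer event{XR_TYPE_EVENT_DATA_BUFFER};
    while (xrPollEvent(instance, &event) == XR_SUCCESS) {
        if (event.type == XR_TYPE_EVENT_DATA_SESSION_STATE_CHANGED) {
            auto& stateEvent = *reinterpret_cast<XrEventDataSessionStateChanged*>(&event);
            switch (stateEvent.state) {
            case XR_SESSION_STATE_READY: {            // runtime says: start rendering
                XrSessionBeginInfo beginInfo{XR_TYPE_SESSION_BEGIN_INFO};
                beginInfo.primaryViewConfigurationType =
                    XR_VIEW_CONFIGURATION_TYPE_PRIMARY_STEREO;
                xrBeginSession(session, &beginInfo);
                break;
            }
            case XR_SESSION_STATE_STOPPING:           // runtime says: stop the frame loop
                xrEndSession(session);
                break;
            default:
                break;
            }
        }
        event = {XR_TYPE_EVENT_DATA_BUFFER};          // reset the buffer for the next poll
    }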

Controlling the frame loop

When the lifecycle is in the normal running state, the app uses xrWaitFrame / xrBeginFrame / xrEndFrame and related calls to drive the frame loop.

In the figure, xrWaitFrame in principle blocks the thread (similar to a Sleep), allowing the application to yield CPU resources.

When the next render-loop iteration is due, the block is released and application logic continues to execute.

How long the call blocks is an estimate controlled by the XR Runtime.
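
One iteration of the frame loop, as a minimal sketch (rendering and layer submission omitted; see the compositor section above for layers):

    XrFrameState frameState{XR_TYPE_FRAME_STATE};
    XrFrameWaitInfo waitInfo{XR_TYPE_FRAME_WAIT_INFO};
    xrWaitFrame(session, &waitInfo, &frameState);    // blocks until the runtime wants a new frame

    XrFrameBeginInfo beginInfo{XR_TYPE_FRAME_BEGIN_INFO};
    xrBeginFrame(session, &beginInfo);

    if (frameState.shouldRender) {
        // ... locate views and render into the swapchain images ...
    }

    XrFrameEndInfo endInfo{XR_TYPE_FRAME_END_INFO};
    endInfo.displayTime          = frameState.predictedDisplayTime;
    endInfo.environmentBlendMode = XR_ENVIRONMENT_BLEND_MODE_OPAQUE;
    endInfo.layerCount           = 0;                // layers omitted in this sketch
    xrEndFrame(session, &endInfo);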

Swapchain

An OpenXR swapchain is not fundamentally different from the familiar double/triple buffering.

Through the OpenXR interface, with the Session as the medium, we let the Runtime create the swapchain resources for us.

After creating the swapchain, obtain the index of the image to draw into this frame via xrAcquireSwapchainImage, then use xrWaitSwapchainImage to wait until the image has been released by any outstanding read (the compositor may still be reading it); after that you can draw into the image.

After rendering into the image, use xrReleaseSwapchainImage to give up control of it.
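
Put together, the per-frame usage looks roughly like this (a sketch; assumes an XrSwapchain named swapchain was created earlier):

    uint32_t imageIndex = 0;
    XrSwapchainImageAcquireInfo acquireInfo{XR_TYPE_SWAPCHAIN_IMAGE_ACQUIRE_INFO};
    xrAcquireSwapchainImage(swapchain, &acquireInfo, &imageIndex);

    XrSwapchainImageWaitInfo waitInfo{XR_TYPE_SWAPCHAIN_IMAGE_WAIT_INFO};
    waitInfo.timeout = XR_INFINITE_DURATION;
    xrWaitSwapchainImage(swapchain, &waitInfo);      // wait until any reader is done with the image

    // ... render into the swapchain image at imageIndex ...

    XrSwapchainImageReleaseInfo releaseInfo{XR_TYPE_SWAPCHAIN_IMAGE_RELEASE_INFO};
    xrReleaseSwapchainImage(swapchain, &releaseInfo);  // hand the image back to the runtime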

XrSpace

OpenXR's XrSpace represents a coordinate system. The core reference space types are as follows (a usage sketch follows these lists):

  • XR_REFERENCE_SPACE_TYPE_VIEW: the view (head) coordinate system
  • XR_REFERENCE_SPACE_TYPE_LOCAL: the most common coordinate system, which changes with ReCenter (how the origin is determined is up to the Runtime: some platforms use the pose at instance creation as the origin)
  • XR_REFERENCE_SPACE_TYPE_STAGE: the room-scale coordinate system, such as the designated play area on some platforms (in theory it does not change with a manual ReCenter)

There are also some platform-specific types:

  • XR_REFERENCE_SPACE_TYPE_UNBOUNDED_MSFT
  • XR_REFERENCE_SPACE_TYPE_COMBINED_EYE_VARJO

And the floor-level coordinate system more familiar to developers using commercial game engines:

  • XR_REFERENCE_SPACE_TYPE_LOCAL_FLOOR_EXT: similar to LOCAL, with the same X and Z, but with the Y origin on the floor (the floor calibrated or estimated in the play space)
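
A minimal usage sketch: create two reference spaces and ask where the head (VIEW space) is relative to LOCAL space at the predicted display time (assumes session and the frameState from xrWaitFrame):

    XrReferenceSpaceCreateInfo createInfo{XR_TYPE_REFERENCE_SPACE_CREATE_INFO};
    createInfo.referenceSpaceType   = XR_REFERENCE_SPACE_TYPE_LOCAL;
    createInfo.poseInReferenceSpace = {{0, 0, 0, 1}, {0, 0, 0}};  // identity pose

    XrSpace localSpace = XR_NULL_HANDLE, viewSpace = XR_NULL_HANDLE;
    xrCreateReferenceSpace(session, &createInfo, &localSpace);

    createInfo.referenceSpaceType = XR_REFERENCE_SPACE_TYPE_VIEW;
    xrCreateReferenceSpace(session, &createInfo, &viewSpace);

    XrSpaceLocation location{XR_TYPE_SPACE_LOCATION};
    xrLocateSpace(viewSpace, localSpace, frameState.predictedDisplayTime, &location);
    // location.pose now holds the head pose expressed in LOCAL space
    // (check location.locationFlags before trusting it).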

References

The sources of this article are as follows:

In addition, GPT-based assistant tools on some platforms were used to condense and summarize my notes for this article.

Zimiao haunting blog (azimiao.com). All rights reserved. Please include the link when reprinting: https://www.azimiao.com/10307.html
