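WebXR Depth Sensing Module

1. Introduction

As Virtual Reality and Augmented Reality become more prevalent, native APIs are introducing new features that give access to more detailed information about the environment in which the user is located. The Depth Sensing API brings one such capability to the WebXR Device API, allowing authors of WebXR-powered experiences to obtain information about the distance from the user's device to the real-world geometry in the user's environment.

This document assumes that readers are familiar with the WebXR Device API and WebXR Augmented Reality Module specifications, as it builds on top of them to provide additional features to XRSessions.

1.1. Terminology

This document uses the acronym AR to signify Augmented Reality, and VR to signify Virtual Reality.

This document uses terms like "depth buffer", "depth buffer data" and "depth data" interchangeably when referring to an array of bytes containing depth information, either returned by the XR device or returned by the API itself. More information about the specific contents of the depth buffer can be found in the data and texture entries of the specification.

This document uses the term normalized view coordinates when referring to a coordinate system that has its origin in the top left corner of the view, with the X axis growing to the right and the Y axis growing downward.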

2. Initialization

2.1. Feature descriptor

Applications can request that depth sensing be enabled on an XRSession by passing an appropriate feature descriptor. This module introduces the string depth-sensing as a new valid feature descriptor for the depth sensing feature.

A device is capable of supporting the depth sensing feature if it exposes a native depth sensing capability. The inline XR device MUST NOT be treated as capable of supporting the depth sensing feature.

The depth sensing feature is subject to feature policy and requires the "xr-spatial-tracking" policy to be allowed on the requesting document's origin.

2.2. Intended depth type, data usage, and data formats

enum XRDepthType {
  "raw",
  "smooth",
};

Usage of "raw" indicates that no additional processing is to be performed on the depth data.
Usage of "smooth" indicates that the runtime should perform additional processing on the depth texture to remove potential noise.

enum XRDepthUsage {
  "cpu-optimized",
  "gpu-optimized",
};

Usage of "cpu-optimized" indicates that the depth data is intended to be used on the CPU, by interacting with the XRCPUDepthInformation interface.
Usage of "gpu-optimized" indicates that the depth data is intended to be used on the GPU, by interacting with the XRWebGLDepthInformation interface.

enum XRDepthDataFormat {
  "luminance-alpha",
  "float32",
  "unsigned-short",
};

The "luminance-alpha" or "unsigned-short" data format indicates that items in the depth data buffers obtained from the API are 16 bit unsigned integer values.
The "float32" data format indicates that items in the depth data buffers obtained from the API are 32 bit floating point values.

The following table summarizes the ways in which the various data formats can be consumed:

Data format	`GLenum` value equivalent	Size of depth buffer entry	Usage on CPU	Usage on GPU
`"luminance-alpha"`	LUMINANCE_ALPHA	2 times 8 bit	Interpret `data` as `Uint16Array`	Inspect the Luminance and Alpha channels to reassemble a single value.
`"float32"`	R32F	32 bit	Interpret `data` as `Float32Array`	Inspect the Red channel and use its value.
`"unsigned-short"`	R16UI	16 bit	Interpret `data` as `Uint16Array`	Inspect the Red channel and use its value.
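The CPU column of the table can be illustrated with a small helper. This is a minimal sketch, not part of the specification: the function name viewDepthData and the synthetic buffer are assumptions made for illustration, and a real application would pass the data attribute of an XRCPUDepthInformation together with the session's depthDataFormat.

```javascript
// Hypothetical helper: wraps a raw depth ArrayBuffer in the typed array that
// matches the session's depthDataFormat, following the CPU column above.
function viewDepthData(data, depthDataFormat) {
  switch (depthDataFormat) {
    case "luminance-alpha":
    case "unsigned-short":
      return new Uint16Array(data);   // one 16 bit unsigned entry per pixel
    case "float32":
      return new Float32Array(data);  // one 32 bit float entry per pixel
    default:
      throw new TypeError("Unknown depth data format: " + depthDataFormat);
  }
}

// Synthetic 2x1 buffer standing in for depthInformation.data.
const synthetic = new Uint16Array([1500, 0]).buffer;
const entries = viewDepthData(synthetic, "luminance-alpha");
console.log(entries[0]); // raw value; multiply by rawValueToMeters for meters
```

The entries are still raw values in unspecified units, so they need to be scaled by rawValueToMeters before being treated as meters.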

2.3. Session configuration

dictionary XRDepthStateInit {
  required sequence<XRDepthUsage> usagePreference;
  required sequence<XRDepthDataFormat> dataFormatPreference;
  sequence<XRDepthType> depthTypeRequest;
  boolean matchDepthView = true;
};

The usagePreference is an ordered sequence of XRDepthUsages, used to describe the desired depth sensing usage for the session.

The dataFormatPreference is an ordered sequence of XRDepthDataFormats, used to describe the desired depth sensing data format for the session.

The depthTypeRequest is an ordered sequence of XRDepthTypes, used to describe the desired depth sensing type for the session. This request MAY be ignored by the user agent.

The matchDepthView requests that the depth information's view be aligned with the XRView. If this is true, the XRSystem SHOULD return depth information that reflects the current frame. If this is false, the XRSystem MAY return depth information that was captured at an earlier point in time.

NOTE: If matchDepthView is false, the author SHOULD perform reprojection using the XRDepthInformation's view.

The XRSessionInit dictionary is expanded by adding a new depthSensing key. The key is optional in XRSessionInit, but it MUST be provided when depth-sensing is included in either requiredFeatures or optionalFeatures.

partial dictionary XRSessionInit {
  XRDepthStateInit depthSensing;
};

If the depth sensing feature is a required feature but the application did not supply a depthSensing key, the user agent MUST treat this as an unresolved required feature and reject the requestSession(mode, options) promise with a NotSupportedError. If it was requested as an optional feature, the user agent MUST ignore the feature request and not enable depth sensing on the newly created session.

If the depth sensing feature is a required feature but the result of the find supported configuration combination algorithm invoked with XRDepthStateInit is null, the user agent MUST treat this as an unresolved required feature and reject the requestSession(mode, options) promise with a NotSupportedError. If it was requested as an optional feature, the user agent MUST ignore the feature request and not enable depth sensing on the newly created session.

When an XRSession is created with depth sensing enabled, the depthUsage, depthDataFormat, and depthType attributes MUST be set to the result of the find supported configuration combination algorithm invoked with XRDepthStateInit. depthActive MUST be true by default.

Note: The intention of the algorithm is to process preferences from the most restrictive to the least restrictive. It therefore first processes entries where only a single item was indicated, then entries with multiple items, and finally the cases where no preference was indicated.

In order to find a supported configuration combination for the depth sensing API given a depthStateInit dictionary, the user agent MUST run the following algorithm:

Let depthTypeRequest be the value contained under the depthTypeRequest key in depthStateInit, or an empty sequence if it is not set.
Let selectedType be null.
Let usagePreference be the value contained under the usagePreference key in depthStateInit.
Let selectedUsage be null.
Let dataFormatPreference be the value contained under the dataFormatPreference key in depthStateInit.
Let selectedDataFormat be null.
Let processingOrder be a sequence of (preferences, selection) pairs, where selection is a reference to one of the variables introduced in the previous steps: [(depthTypeRequest, selectedType), (usagePreference, selectedUsage), (dataFormatPreference, selectedDataFormat)].
For each (preferences, selection) in processingOrder, perform the following steps:
   1. If preferences contains only a single value, set selection to that value.
For each (preferences, selection) in processingOrder, perform the following steps:
   1. If selection is not null, continue to the next entry.
   2. If the preferences sequence is empty, continue to the next entry.
   3. For each preference in preferences, perform the following steps:
      1. If preference, together with the other values of selectedType, selectedUsage, and selectedDataFormat, is not considered a supported depth sensing configuration by the native depth sensing capabilities of the device, continue to the next entry.
      2. Set selection to preference and abort these nested steps.
For each (preferences, selection) in processingOrder, perform the following steps:
   1. If selection is not null, continue to the next entry.
   2. Set selection to the value determined by the preferred native depth sensing capability, in conjunction with the other values of selectedType, selectedUsage, and selectedDataFormat.
If any of selectedType, selectedUsage, and selectedDataFormat is null, return null and abort these steps.
If selectedType, selectedUsage, and selectedDataFormat are considered a supported depth sensing configuration by the native depth sensing capabilities of the device, return the depth sensing configuration of selectedType, selectedUsage, and selectedDataFormat and abort these steps.
If depthTypeRequest is not an empty list, set it to an empty list and repeat these steps.
Return null and abort these steps.
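The selection steps above can be sketched in JavaScript as follows. This is an illustrative model under stated assumptions, not the user agent's implementation: isSupported and preferred are hypothetical stand-ins for the device's native depth sensing capabilities, isSupported is assumed to treat a null argument as "not chosen yet", and the preferred capability is simplified to a fixed value per key.

```javascript
// Sketch only. `isSupported(type, usage, format)` and `preferred` are
// hypothetical stand-ins for the device's native depth sensing capabilities.
function findSupportedConfiguration(init, isSupported, preferred) {
  let typeRequest = init.depthTypeRequest ?? [];
  for (;;) {
    const selected = { type: null, usage: null, format: null };
    const order = [
      ["type", typeRequest],
      ["usage", init.usagePreference],
      ["format", init.dataFormatPreference],
    ];
    // Pass 1: a preference list with a single value is taken as-is.
    for (const [key, preferences] of order) {
      if (preferences.length === 1) selected[key] = preferences[0];
    }
    // Pass 2: pick the first listed value that works with the other choices.
    for (const [key, preferences] of order) {
      if (selected[key] !== null) continue;
      for (const preference of preferences) {
        const candidate = { ...selected, [key]: preference };
        if (isSupported(candidate.type, candidate.usage, candidate.format)) {
          selected[key] = preference;
          break;
        }
      }
    }
    // Pass 3: anything still unset falls back to the device's preferred value.
    for (const [key] of order) {
      if (selected[key] === null) selected[key] = preferred[key];
    }
    const { type, usage, format } = selected;
    if (type === null || usage === null || format === null) return null;
    if (isSupported(type, usage, format)) return selected;
    // The type request may be ignored, so retry once without it.
    if (typeRequest.length === 0) return null;
    typeRequest = [];
  }
}

// Example device: "raw" depth, GPU only, 16 bit and float formats.
const combos = [
  ["raw", "gpu-optimized", "luminance-alpha"],
  ["raw", "gpu-optimized", "float32"],
];
const isSupported = (type, usage, format) =>
  combos.some(([t, u, f]) =>
    (type === null || type === t) &&
    (usage === null || usage === u) &&
    (format === null || format === f));
const preferred = { type: "raw", usage: "gpu-optimized", format: "luminance-alpha" };

const chosen = findSupportedConfiguration({
  usagePreference: ["cpu-optimized", "gpu-optimized"],
  dataFormatPreference: ["luminance-alpha", "float32"],
}, isSupported, preferred);
console.log(chosen);
```

On this example device the CPU preference cannot be honored, so the sketch settles on the GPU usage while keeping the first data format preference.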

Note: User agents are not required to support all existing combinations of usages and data formats. This is intended to allow them to provide data in an efficient way, and it depends on the underlying platform. This decision places an additional burden on application developers. It could be mitigated by creating libraries that hide the API complexity, possibly at the cost of performance.

User agents that are capable of supporting the depth sensing API MUST support at least one XRDepthUsage mode. User agents that are capable of supporting the depth sensing API MUST support the "luminance-alpha" data format, and MAY support other formats.

The following code demonstrates how a session that requires the depth sensing API could be requested. The example assumes that the caller is able to handle both CPU- and GPU-optimized usage, as well as both the "luminance-alpha" and "float32" formats, preferring CPU and "luminance-alpha":

const session = await navigator.xr.requestSession("immersive-ar", {
  requiredFeatures: ["depth-sensing"],
  depthSensing: {
    usagePreference: ["cpu-optimized", "gpu-optimized"],
    dataFormatPreference: ["luminance-alpha", "float32"],
  },
});

partial interface XRSession {
  readonly attribute XRDepthUsage depthUsage;
  readonly attribute XRDepthDataFormat depthDataFormat;
  readonly attribute XRDepthType? depthType;
  readonly attribute boolean? depthActive;

  undefined pauseDepthSensing();
  undefined resumeDepthSensing();
};

The depthUsage describes the depth sensing usage with which the session was configured. If this attribute is accessed on a session that does not have depth sensing enabled, the user agent MUST throw an InvalidStateError.

The depthDataFormat describes the depth sensing data format with which the session was configured. If this attribute is accessed on a session that does not have depth sensing enabled, the user agent MUST throw an InvalidStateError.

The depthType describes the depth sensing type with which the session was configured. If this attribute is accessed on a session that does not have depth sensing enabled, the user agent MUST throw an InvalidStateError. If the runtime only supports a single XRDepthType or otherwise ignored the depthTypeRequest, this MAY return null.

The depthActive returns the current depth sensing active state. If this attribute is accessed on a session that does not have depth sensing enabled, the user agent MUST throw an InvalidStateError. If this value is false, the user agent MUST reject attempts to obtain depth data. If this value is true, the user agent MAY return valid depth data or null.
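Because these attributes throw when depth sensing was not enabled, an application that requested depth sensing as an optional feature may want to read them defensively. The following is a minimal sketch; the helper name readDepthConfiguration is an assumption made for illustration, and the attribute names are the ones defined in the interface above.

```javascript
// Hypothetical helper: returns the depth configuration of `session`, or null
// when the attribute getters throw because depth sensing was not enabled.
function readDepthConfiguration(session) {
  try {
    return {
      usage: session.depthUsage,
      dataFormat: session.depthDataFormat,
      type: session.depthType,      // may be null if the request was ignored
      active: session.depthActive,
    };
  } catch (error) {
    if (error.name === "InvalidStateError") return null;
    throw error; // unrelated failures are not swallowed
  }
}
```

A null result can then be used to skip the depth-dependent parts of the rendering loop, while the usage field decides between the CPU and GPU code paths.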

When resumeDepthSensing() is called on an XRSession session, the User Agent MUST run the following steps:

If session's ended value is true, throw an InvalidStateError and abort these steps.
Let frame be session's animation frame.
If frame's active boolean is false, throw an InvalidStateError and abort these steps.
If the depth-sensing feature descriptor is not contained in the session's XR device's list of enabled features for session's mode, throw a NotSupportedError and abort these steps.
If the depth sensing active state is true, abort these steps.
Set the depth sensing active state to true.

When pauseDepthSensing() is called on an XRSession session, the User Agent MUST run the following steps:

If session's ended value is true, throw an InvalidStateError and abort these steps.
Let frame be session's animation frame.
If frame's active boolean is false, throw an InvalidStateError and abort these steps.
If the depth-sensing feature descriptor is not contained in the session's XR device's list of enabled features for session's mode, throw a NotSupportedError and abort these steps.
If the depth sensing active state is false, abort these steps.
Set the depth sensing active state to false.
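From the application's side, the two methods can be wrapped in a small toggle. This is a sketch under stated assumptions: the helper name setDepthSensingActive is hypothetical, and because the steps above require the session's animation frame to be active, the helper is expected to be called from inside an XRFrameRequestCallback.

```javascript
// Hypothetical helper: pauses or resumes depth sensing only when the
// requested state differs from the current one. Intended to be called from
// inside an XRFrameRequestCallback, while the animation frame is active.
function setDepthSensingActive(session, wanted) {
  if (session.depthActive === wanted) return false; // nothing to change
  if (wanted) {
    session.resumeDepthSensing();
  } else {
    session.pauseDepthSensing();
  }
  return true;
}
```

Pausing depth sensing while, for example, a menu covers the scene gives the user agent a chance to reduce the cost of the feature until it is resumed.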

3. Obtaining depth data

3.1. XRDepthInformation

[SecureContext, Exposed=Window]
interface XRDepthInformation {
  readonly attribute unsigned long width;
  readonly attribute unsigned long height;

  [SameObject] readonly attribute XRRigidTransform normDepthBufferFromNormView;
  readonly attribute float rawValueToMeters;
};

XRDepthInformation includes XRViewGeometry;

The width attribute contains the width of the depth buffer (i.e. the number of columns).

The height attribute contains the height of the depth buffer (i.e. the number of rows).

The normDepthBufferFromNormView attribute contains an XRRigidTransform that needs to be applied when indexing into the depth buffer. The transformation that the matrix represents changes the coordinate system from normalized view coordinates to normalized depth buffer coordinates, which can then be scaled by the depth buffer's width and height to obtain the absolute depth buffer coordinates.

Note: If the application intends to use the resulting depth buffer for texturing a mesh, care must be taken to ensure that the texture coordinates of the mesh vertices are expressed in normalized view coordinates, or that the appropriate coordinate system change is performed in a shader.

The rawValueToMeters attribute contains the scale factor by which the raw depth values from a depth buffer must be multiplied in order to get the depth in meters.

The transform is provided in the reference space of the associated view.

If the sensor is aligned with the associated view, all values included from XRViewGeometry MUST return the same values as those returned by the associated view.

Each XRDepthInformation has an associated view, which is the XRView closest to the sensor and is used to retrieve the XRDepthInformation.

Each XRDepthInformation has an associated sensor, which is the containing object of the XRViewGeometry from which the depth information was obtained.

Each XRDepthInformation has an associated depth buffer that contains depth buffer data. Different XRDepthInformations may store objects of different concrete types in the depth buffer.

When attempting to access the depth buffer of XRDepthInformation or any interface that inherits from it, the user agent MUST run the following steps:

Let depthInformation be the instance whose member is accessed.
Let view be depthInformation's view.
Let frame be view's frame.
If frame is not active, throw an InvalidStateError and abort these steps.
If frame is not an animationFrame, throw an InvalidStateError and abort these steps.
Proceed with the normal steps required to access the member of depthInformation.

3.2. XRCPUDepthInformation

[Exposed=Window]
interface XRCPUDepthInformation : XRDepthInformation {
  [SameObject] readonly attribute ArrayBuffer data;

  float getDepthInMeters(float x, float y);
};

The data attribute contains depth buffer information in raw format, suitable for uploading to a WebGL texture if needed. The data is stored in row-major format, without padding, with each entry corresponding to the distance from the sensor's near plane to the users' environment, in unspecified units. The size and type of each data entry is determined by depthDataFormat. The values can be converted from unspecified units to meters by multiplying them by rawValueToMeters. The normDepthBufferFromNormView can be used to transform from normalized view coordinates into the depth buffer's coordinate system. When accessed, the algorithm to access the depth buffer MUST be run.

Note: Applications SHOULD NOT attempt to change the contents of the data array, as this can lead to incorrect results being returned by the getDepthInMeters(x, y) method.

The getDepthInMeters(x, y) method can be used to obtain the depth at coordinates. When invoked, the algorithm to access the depth buffer MUST be run.

When the getDepthInMeters(x, y) method is invoked on an XRCPUDepthInformation depthInformation with x, y, the user agent MUST obtain depth at coordinates by running the following steps:

Let view be depthInformation's view, frame be view's frame, and session be frame's session.
If x is greater than 1.0 or less than 0.0, throw a RangeError and abort these steps.
If y is greater than 1.0 or less than 0.0, throw a RangeError and abort these steps.
Let normalizedViewCoordinates be a vector representing a 3-dimensional point in space, with the x coordinate set to x, the y coordinate set to y, the z coordinate set to 0.0, and the w coordinate set to 1.0.
Let normalizedDepthCoordinates be the result of premultiplying the normalizedViewCoordinates vector from the left by depthInformation's normDepthBufferFromNormView.
Let depthCoordinates be the result of scaling normalizedDepthCoordinates, with the x coordinate multiplied by depthInformation's width and the y coordinate multiplied by depthInformation's height.
Let column be the value of depthCoordinates' x coordinate, truncated to an integer and clamped to the [0, width-1] integer range.
Let row be the value of depthCoordinates' y coordinate, truncated to an integer and clamped to the [0, height-1] integer range.
Let index be equal to row multiplied by width, plus column.
Let byteIndex be equal to index multiplied by the size of the depth data format.
Let rawDepth be the value found at index byteIndex in data, interpreted as a number according to session's depthDataFormat.
Let rawValueToMeters be equal to depthInformation's rawValueToMeters.
Return rawDepth multiplied by rawValueToMeters.
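The lookup steps above can be mirrored in application code, for example when many samples are needed and the typed array is reused between calls. This is a minimal sketch that assumes a 16 bit data format and operates on a plain object shaped like XRCPUDepthInformation; the function name depthInMetersAt is an assumption made for illustration. The matrix of an XRRigidTransform is a 16-element array in column-major order.

```javascript
// Sketch of the lookup steps for a 16 bit data format ("luminance-alpha" or
// "unsigned-short"). `info` is an object shaped like XRCPUDepthInformation.
function depthInMetersAt(info, x, y) {
  if (x < 0 || x > 1 || y < 0 || y > 1) {
    throw new RangeError("Coordinates must be within [0, 1]");
  }
  const m = info.normDepthBufferFromNormView.matrix; // column-major 4x4
  // Transform the point (x, y, 0, 1) into normalized depth buffer coordinates.
  const u = m[0] * x + m[4] * y + m[12];
  const v = m[1] * x + m[5] * y + m[13];
  // Scale to absolute coordinates, truncate, and clamp to the buffer bounds.
  const clamp = (value, max) => Math.min(Math.max(Math.trunc(value), 0), max);
  const column = clamp(u * info.width, info.width - 1);
  const row = clamp(v * info.height, info.height - 1);
  // A typed array is indexed by entry, so the entry index is used directly
  // instead of a byte index.
  const index = row * info.width + column;
  const rawDepth = new Uint16Array(info.data)[index];
  return rawDepth * info.rawValueToMeters;
}

// Synthetic 2x2 buffer with an identity transform, storing millimeters.
const fakeInfo = {
  width: 2,
  height: 2,
  data: new Uint16Array([1000, 2000, 3000, 4000]).buffer,
  rawValueToMeters: 0.001,
  normDepthBufferFromNormView: {
    matrix: new Float32Array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]),
  },
};
console.log(depthInMetersAt(fakeInfo, 0.25, 0.75)); // bottom-left entry
```

For a "float32" session the same steps apply with a Float32Array in place of the Uint16Array.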

partial interface XRFrame {
  XRCPUDepthInformation? getDepthInformation(XRView view);
};

The getDepthInformation(view) method, when invoked on an XRFrame, signals that the application wants to obtain CPU depth information relevant for the frame.

When the getDepthInformation(view) method is invoked on an XRFrame frame with an XRView view, the user agent MUST obtain CPU depth information by running the following steps:

Let session be frame's session.
If the depth-sensing feature descriptor is not contained in the session's XR device's list of enabled features for session's mode, throw a NotSupportedError and abort these steps.
If frame's active boolean is false, throw an InvalidStateError and abort these steps.
If frame's animationFrame boolean is false, throw an InvalidStateError and abort these steps.
If frame does not match view's frame, throw an InvalidStateError and abort these steps.
If the session's depthUsage is not "cpu-optimized", throw an InvalidStateError and abort these steps.
Let depthInformation be the result of creating a CPU depth information instance given frame and view.
Return depthInformation.

In order to create a CPU depth information instance given an XRFrame frame and an XRView view, the user agent MUST run the following steps:

Let result be a new instance of XRCPUDepthInformation.
Let time be frame's time.
Let session be frame's session.
Let device be the session's XR device.
If depthActive is false, return null and abort these steps.
Let nativeDepthInformation be the result of querying device for the depth information valid as of time, for the specified view, taking into account session's depthType, depthUsage, and depthDataFormat.
If nativeDepthInformation is null, return null and abort these steps.
If the depth buffer present in nativeDepthInformation meets the user agent's criteria to block access to the depth data, return null and abort these steps.
If the depth buffer present in nativeDepthInformation meets the user agent's criteria to limit the amount of information available in the depth buffer, adjust the depth buffer accordingly.
Initialize result's width to the width of the depth buffer returned in nativeDepthInformation.
Initialize result's height to the height of the depth buffer returned in nativeDepthInformation.
Initialize result's normDepthBufferFromNormView to a new XRRigidTransform, based on nativeDepthInformation's depth coordinates transformation matrix.
Initialize result's data to the raw depth buffer returned in nativeDepthInformation.
Initialize result's view to view.
Initialize result's transform to the pose of the sensor at time, in the view's reference space.
Return result.

The following code demonstrates how depth data can be obtained within an XRFrameRequestCallback. It is assumed that the session has depth sensing enabled, with usage set to "cpu-optimized" and data format set to "luminance-alpha":

const session = ...;          // Session created with depth sensing enabled.
const referenceSpace = ...;   // Reference space created from the session.

function requestAnimationFrameCallback(t, frame) {
  session.requestAnimationFrame(requestAnimationFrameCallback);

  const pose = frame.getViewerPose(referenceSpace);
  if (pose) {
    for (const view of pose.views) {
      const depthInformation = frame.getDepthInformation(view);
      if (depthInformation) {
        useCpuDepthInformation(view, depthInformation);
      }
    }
  }
}

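Once the XRCPUDepthInformation is obtained, it can be used to discover the distance from the view plane to the user's environment (see the § 4 Interpreting the results section for details). The code below demonstrates obtaining the depth at normalized view coordinates (0.25, 0.75):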

function useCpuDepthInformation(view, depthInformation) {
  const depthInMeters = depthInformation.getDepthInMeters(0.25, 0.75);
  console.log("Depth at normalized view coordinates (0.25, 0.75) is:",
    depthInMeters);
}

3.3. XRWebGLDepthInformation

[Exposed=Window]
interface XRWebGLDepthInformation : XRDepthInformation {
  [SameObject] readonly attribute WebGLTexture texture;

  readonly attribute XRTextureType textureType;
  readonly attribute unsigned long? imageIndex;
};

The texture attribute contains depth buffer information as an opaque texture. Each texel corresponds to the distance from the sensor's near plane to the users' environment, in unspecified units. The size and type of each data entry is determined by depthDataFormat. The values can be converted from unspecified units to meters by multiplying them by rawValueToMeters. The normDepthBufferFromNormView can be used to transform from normalized view coordinates into the depth buffer's coordinate system. When accessed, the algorithm to access the depth buffer of XRDepthInformation MUST be run.

The textureType attribute describes whether the texture is of type TEXTURE_2D or TEXTURE_2D_ARRAY.

The imageIndex attribute returns the offset into the texture array. It MUST be defined when textureType is equal to TEXTURE_2D_ARRAY and MUST be left undefined (null) when it is TEXTURE_2D.

partial interface XRWebGLBinding {
  XRWebGLDepthInformation? getDepthInformation(XRView view);
};

The getDepthInformation(view) method, when invoked on an XRWebGLBinding, signals that the application wants to obtain WebGL depth information relevant for the frame.

When the getDepthInformation(view) method is invoked on an XRWebGLBinding binding with an XRView view, the user agent MUST obtain WebGL depth information by running the following steps:

Let session be binding's session.
Let frame be view's frame.
If session does not match frame's session, throw an InvalidStateError and abort these steps.
If the depth-sensing feature descriptor is not contained in the session's XR device's list of enabled features for session's mode, throw a NotSupportedError and abort these steps.
If the session's depthUsage is not "gpu-optimized", throw an InvalidStateError and abort these steps.
If frame's active boolean is false, throw an InvalidStateError and abort these steps.
If frame's animationFrame boolean is false, throw an InvalidStateError and abort these steps.
Let depthInformation be the result of creating a WebGL depth information instance given frame and view.
Return depthInformation.

In order to create a WebGL depth information instance given an XRFrame frame and an XRView view, the user agent MUST run the following steps:

Let result be a new instance of XRWebGLDepthInformation.
Initialize time as follows:
   If the XRSession was created with matchDepthView set to true:
      Let time be frame's time.
   Otherwise:
      Let time be the time at which the device captured the depth information.
Let session be frame's session.
Let device be the session's XR device.
If depthActive is false, return null and abort these steps.
Let nativeDepthInformation be the result of querying the device's native depth sensing for the depth information valid as of time, for the specified view, taking into account session's depthType, depthUsage, and depthDataFormat.
If nativeDepthInformation is null, return null and abort these steps.
If the depth buffer present in nativeDepthInformation meets the user agent's criteria to block access to the depth data, return null and abort these steps.
If the depth buffer present in nativeDepthInformation meets the user agent's criteria to limit the amount of information available in the depth buffer, adjust the depth buffer accordingly.
Initialize result's width to the width of the depth buffer returned in nativeDepthInformation.
Initialize result's height to the height of the depth buffer returned in nativeDepthInformation.
Initialize result's normDepthBufferFromNormView to a new XRRigidTransform, based on nativeDepthInformation's depth coordinates transformation matrix.
Initialize result's texture to an opaque texture containing the depth buffer returned in nativeDepthInformation.
Initialize result's view to view.
Initialize result's transform to the pose of the sensor at time, in the view's reference space.
Initialize result's textureType as follows:
   If result's texture was created with a textureType of texture-array:
      Initialize result's textureType to "texture-array".
   Otherwise:
      Initialize result's textureType to "texture".
Initialize result's imageIndex as follows:
   If textureType is texture:
      Initialize result's imageIndex to null.
   Otherwise, if view's eye is "right":
      Initialize result's imageIndex to 1.
   Otherwise:
      Initialize result's imageIndex to 0.
Return result.

The following code demonstrates how depth data can be obtained within an XRFrameRequestCallback. It is assumed that the session has depth sensing enabled, with usage set to "gpu-optimized" and data format set to "luminance-alpha":

const session = ...;          // Session created with depth sensing enabled.
const referenceSpace = ...;   // Reference space created from the session.
const glBinding = ...;        // XRWebGLBinding created from the session.

function requestAnimationFrameCallback(t, frame) {
  session.requestAnimationFrame(requestAnimationFrameCallback);

  const pose = frame.getViewerPose(referenceSpace);
  if (pose) {
    for (const view of pose.views) {
      const depthInformation = glBinding.getDepthInformation(view);
      if (depthInformation) {
        useGpuDepthInformation(view, depthInformation);
      }
    }
  }
}

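Once the XRWebGLDepthInformation is obtained, it can be used to discover the distance from the view plane to the user's environment (see the § 4 Interpreting the results section for details). The code below demonstrates how the data can be passed to a shader: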

const gl = ...;             // GL context to use.
const shaderProgram = ...;  // Linked WebGLProgram.
const programInfo = {
  uniformLocations: {
    depthTexture: gl.getUniformLocation(shaderProgram, 'uDepthTexture'),
    uvTransform: gl.getUniformLocation(shaderProgram, 'uUvTransform'),
    rawValueToMeters: gl.getUniformLocation(shaderProgram, 'uRawValueToMeters'),
  }
};

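function useGpuDepthInformation(view, depthInformation) {
  // ...

  // Select the texture unit first, then bind the depth texture to it.
  gl.activeTexture(gl.TEXTURE0);
  gl.bindTexture(gl.TEXTURE_2D, depthInformation.texture);
  gl.uniform1i(programInfo.uniformLocations.depthTexture, 0);

  // Transform from normalized view coordinates to depth buffer coordinates.
  gl.uniformMatrix4fv(
    programInfo.uniformLocations.uvTransform, false,
    depthInformation.normDepthBufferFromNormView.matrix);

  // Scale factor that converts raw depth values to meters.
  gl.uniform1f(
    programInfo.uniformLocations.rawValueToMeters,
    depthInformation.rawValueToMeters);

  // ...
}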

The fragment shader that makes use of the depth buffer can be, for example:

precision mediump float;

uniform sampler2D uDepthTexture;
uniform mat4 uUvTransform;
uniform float uRawValueToMeters;

varying vec2 vTexCoord;

float DepthGetMeters(in sampler2D depth_texture, in vec2 depth_uv) {
  // Depth is packed into the luminance and alpha components of its texture.
  // The texture is a normalized format, storing millimeters.
  vec2 packedDepth = texture2D(depth_texture, depth_uv).ra;
  return dot(packedDepth, vec2(255.0, 256.0 * 255.0)) * uRawValueToMeters;
}

void main(void) {
  vec2 texCoord = (uUvTransform * vec4(vTexCoord.xy, 0, 1)).xy;

  float depthInMeters = DepthGetMeters(uDepthTexture, texCoord);

  gl_FragColor = ...;
}
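The arithmetic inside DepthGetMeters can be checked on the CPU. The sketch below assumes, as the weights in the shader imply, that the low byte of each 16 bit value is sampled from the luminance channel and the high byte from the alpha channel, each normalized to the [0, 1] range. The function name unpackLuminanceAlpha is hypothetical.

```javascript
// Mirrors dot(packedDepth, vec2(255.0, 256.0 * 255.0)) from the shader:
// each channel is a byte normalized to [0, 1], so scaling by 255 recovers
// the byte, and the alpha byte is shifted up by 8 bits (times 256).
function unpackLuminanceAlpha(luminance, alpha) {
  return luminance * 255.0 + alpha * 256.0 * 255.0;
}

const rawValue = 1500;                          // e.g. 1500 millimeters
const luminanceSample = (rawValue & 0xff) / 255; // low byte, normalized
const alphaSample = (rawValue >> 8) / 255;       // high byte, normalized
console.log(Math.round(unpackLuminanceAlpha(luminanceSample, alphaSample))); // 1500
```

Multiplying the recovered raw value by rawValueToMeters then yields the depth in meters, as in the shader.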

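4. Interpreting the results

The user agent MUST return a depth value of 0 for pixels that are determined to have invalid depth data, or for which the depth data cannot otherwise be determined.

The values stored in data and texture represent the distance from the camera plane to the real-world geometry (as understood by the XR system). In the example below, the depth value at point a = (x, y) corresponds to the distance of point A from the camera plane. Specifically, the depth value does not represent the length of the aA vector.

[Figure: Depth API data explanation]

The above image corresponds to the following code: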

// depthInfo is of type XRCPUDepthInformation:
const depthInMeters = depthInfo.getDepthInMeters(x, y);
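Since the depth value is measured perpendicular to the camera plane, an application that needs the straight-line distance along a viewing ray has to correct for the angle of that ray. The following is a sketch of that geometry only, under stated assumptions: the ray direction is given in view space with the camera looking along the negative Z axis, the distance is measured from the camera origin, and any offset of the near plane is ignored. The function name rayLengthFromDepth is hypothetical.

```javascript
// Converts a plane-relative depth into the distance along a viewing ray.
// `direction` is a view-space vector [x, y, z] pointing from the camera
// through the pixel; the camera is assumed to look along negative Z.
function rayLengthFromDepth(depthInMeters, direction) {
  const [dx, dy, dz] = direction;
  const length = Math.hypot(dx, dy, dz);
  const cosine = Math.abs(dz) / length; // angle between ray and view axis
  return depthInMeters / cosine;
}

console.log(rayLengthFromDepth(2, [0, 0, -1])); // straight ahead: same as depth
console.log(rayLengthFromDepth(2, [1, 0, -1])); // 45 degrees off-axis: longer
```

For a ray through the center of the view the two quantities coincide, and the difference grows toward the edges of the field of view.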

5. Native device concepts

5.1. Native depth sensing

The depth sensing specification assumes that the native device on top of which the depth sensing API is implemented provides a way to query the device's native depth sensing capabilities. The device is said to support querying its native depth sensing capabilities if it exposes a way of obtaining depth buffer data. The depth buffer data MUST contain the buffer dimensions, information about the units used for the values stored in the buffer, and a depth coordinates transformation matrix that performs a coordinate system change from normalized view coordinates to normalized depth buffer coordinates. This transform should leave the z coordinate of the transformed 3D vector unaffected. The device MUST also provide some mechanism to expose the projection matrix and the transform of the sensor.

The device can support the depth sensing type in 2 ways. If the device simply returns the estimated depth values with minimal post-processing, it is said to support the "raw" depth type. If the device or runtime can apply additional processing to "smooth" out noise in this data (e.g. into larger regions of the same depth value), it is said to support the "smooth" depth type.

Note: "Raw" depth data often comes with an accompanying confidence value. When returning such data to the page, the UA MAY choose to treat depth data with a low confidence value as invalid depth data.

The device can support depth sensing usage in 2 ways. If the device is primarily capable of returning the depth data through CPU-accessible memory, it is said to support the "cpu-optimized" usage. If the device is primarily capable of returning the depth data through GPU-accessible memory, it is said to support the "gpu-optimized" usage.

Note: The user agent can choose to support both usage modes (e.g. when the device is capable of providing both CPU- and GPU-accessible data, or by performing the transfer between CPU- and GPU-accessible data manually).

The device can support the depth sensing data format, given a depth sensing usage and type, in the following ways. If, for the given depth sensing usage and type, the device is able to return depth data as a buffer containing 16 bit unsigned integers, it is said to support the "luminance-alpha" and "unsigned-short" data formats. If, for the given depth sensing usage and type, the device is able to return depth data as a buffer containing 32 bit floating point values, it is said to support the "float32" data format.

A depth sensing configuration is represented by a combination of one XRDepthType, one XRDepthUsage, and one XRDepthDataFormat.

The device is said to support a depth sensing configuration if it supports the depth sensing type, the depth sensing usage, and the depth sensing data format given in the specified configuration.

Note: Support for the depth sensing API is not limited to hardware classified as AR-capable, although it is expected that the feature will be more common on such devices. VR devices that contain appropriate sensors or use other techniques to provide a depth buffer should also be able to provide the data necessary to implement the depth sensing API.

For each of depthTypeRequest, usagePreference, and dataFormatPreference, the device MUST have a preferred native depth sensing capability that MUST be used if the corresponding array is empty. The type, usage, and format SHOULD reflect the ones that are most efficient on the device, though they may depend on each other.

The device is said to have a depth sensing active state, which is a boolean indicating whether the depth sensing capabilities are actively running. This state MUST start as true. When this state is false, the user agent SHOULD take steps to mitigate the performance impact of this feature being enabled.

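6. Privacy & Security Considerations

The depth sensing API provides additional information about the users' environment to websites, in the form of a depth buffer. Given a depth buffer with sufficiently high resolution and sufficiently high precision, websites could potentially learn more detailed information than users are comfortable with. Depending on the underlying technology used, the depth data may be created based on camera images and IMU sensors.

In order to mitigate privacy risks to users, user agents should seek user consent prior to enabling the depth sensing API on a session. In addition, as depth sensing technologies and hardware improve, user agents should consider limiting the amount of information exposed through the API, or blocking access to the data returned from the API if it is not feasible to introduce such limitations. To limit the amount of information, user agents could, for example, reduce the resolution of the resulting depth buffer, or reduce the precision of the values present in the depth buffer (for example by performing quantization). User agents that decide to limit the amount of data in such a way will still be considered as implementing this specification.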
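The two mitigations named above, reducing resolution and reducing precision, can be illustrated on a 16 bit depth buffer. This is only one possible approach, sketched under stated assumptions: the function name limitDepthBuffer, the nearest-sample downscaling, and the factor and step parameters are illustrative choices, not something the specification prescribes.

```javascript
// Illustrative only: keeps every `factor`-th sample in each direction and
// rounds each kept value down to a multiple of `step` (quantization).
function limitDepthBuffer(values, width, height, factor, step) {
  const outWidth = Math.ceil(width / factor);
  const outHeight = Math.ceil(height / factor);
  const out = new Uint16Array(outWidth * outHeight);
  for (let row = 0; row < outHeight; row++) {
    for (let column = 0; column < outWidth; column++) {
      const source = values[row * factor * width + column * factor];
      // Rounding down never exceeds the source, so it cannot overflow 16 bits.
      out[row * outWidth + column] = Math.floor(source / step) * step;
    }
  }
  return { values: out, width: outWidth, height: outHeight };
}

// 4x2 buffer reduced to 2x1, with values quantized to multiples of 100.
const original = new Uint16Array([1234, 1300, 2555, 2600, 3000, 3100, 3200, 3999]);
const limited = limitDepthBuffer(original, 4, 2, 2, 100);
console.log(limited.width, limited.height, Array.from(limited.values));
```

A buffer reduced this way would also need its width and height reported consistently, so that indexing through normDepthBufferFromNormView keeps working.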

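If the user agent is capable of providing depth buffers that are detailed enough to become equivalent to the information provided by the device's cameras, it must first obtain user consent that is equivalent to the consent needed to obtain camera access.

Changes

Changes from the First Public Working Draft 31 August 2021

7. Acknowledgements

The following individuals have contributed to the design of the WebXR Depth Sensing specification: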
