I think this is the best answer. The R1 is directly connected to the device's sensors and can write out state to the shared main memory. Processing the sensor state in 12ms leaves roughly 4ms to consume the world state and draw a video frame, enough to hit a solid 60fps.
Any lag between the position of your head/body and what the eye display shows is going to mess with your proprioception. The worse that lag is, the more likely you are to get motion sickness.
60Hz is great while looking at a screen, but may not be enough to keep a good illusion going while you move around a space. HoloLens has a similar setup: it renders triangles at whatever rate it can, but updates their position on the display at 240Hz (that is, even if the next frame of an animation is lagging, its position in space will be adjusted as you move).
imo the illusion was rock solid, which is extremely challenging given that the display was transparent, so it had to keep up with the real backdrop moving. Vision Pro and all passthrough devices get to fake it, but at the cost of proprioception, as you said.
The HoloLens was amazing and underrated among the wider internet. Like a lot of Microsoft stuff, it's regarded as a joke until Apple copies it, then suddenly everyone takes it seriously.
Nope, still a joke. Like many MS products individual pieces of tech may be incredible but the overall experience is severely lacking. So while you might say "this one aspect of the display technology is incredible!" everyone who's actually used it will reply, "yeah and if it didn't feel like looking through a postage stamp then it would have been great!"
Those 12ms affect the latency, not the framerate. The thing will definitely not render at just 60Hz as that's too low for VR, the standard is usually 90Hz or 120Hz.
If you divide 1 second by 60, you get about 16.7ms. So to hit 60Hz, you need to complete all processing and draw the frame within that 16.7ms. For 120Hz, like you're claiming, all processing needs to be completed in half the time, about 8.3ms. And yet, Apple says the R1 takes 12ms to process input from the sensors? You can draw your own conclusions.
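To make the budget arithmetic concrete (the 12ms figure is Apple's published number; everything else here is just division):

```python
# Frame-time budgets at common display refresh rates, minus the
# R1's stated 12 ms sensor-processing time.
SENSOR_MS = 12.0

for hz in (60, 90, 120):
    frame_budget_ms = 1000.0 / hz
    remaining_ms = frame_budget_ms - SENSOR_MS
    print(f"{hz} Hz: {frame_budget_ms:.1f} ms/frame, "
          f"{remaining_ms:.1f} ms left after sensor processing")
```

At 90Hz and 120Hz the remaining budget goes negative, which is exactly why this argument only works if the 12ms has to fit inside a single frame.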
You forget that the processing doesn't have to finish within the same frame. Latency is not throughput.
Not even the most expensive high-end gaming setups can finish the entire input-to-screen processing within just one frame, and yet they can easily render some games at 500Hz or more.
Nothing about end-to-end latency of the R1 tells you anything about how pipelined it might be. It very well could have multiple frames in-flight at the same time in different stages of processing.
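A toy illustration of the point that latency is not throughput: in a 3-stage pipeline (sense, process, display) where each stage takes one 8ms tick, each frame's end-to-end latency is 24ms, yet once the pipeline is full a frame completes every 8ms. The stage breakdown and timings here are made up for illustration, not the R1's actual architecture.

```python
# Three pipeline slots: [sense, process, display]. Each tick, every
# in-flight frame advances one stage and a new frame enters.
TICK_MS = 8
stages = [None, None, None]

completed = []
for tick in range(10):
    emitted = stages[-1]            # frame leaving the display stage
    stages = [tick] + stages[:-1]   # everything shifts; new frame enters
    if emitted is not None:
        completed.append((tick, emitted, (tick - emitted) * TICK_MS))

for finish_tick, start_tick, latency_ms in completed:
    print(f"frame entered at tick {start_tick}, finished at tick "
          f"{finish_tick}, latency {latency_ms} ms")
```

After the 3-tick warm-up, output is one frame per tick (125fps-equivalent) even though no individual frame ever took less than 24ms start to finish.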
To provide a comfortable experience, though, the frame pipeline can't be very deep. The older the frame state relative to the wearer's current proprioception, the more likely they are to experience discomfort or outright motion sickness.
That's why I assume the R1 tries to provide the render pipeline with "current" positional state. If the drawing then finishes in the remaining 4ms (for 60fps), the display only lags the wearer's perception by 16ms, which is less likely to cause discomfort.
This could be mitigated further if the objects in the VR scene are tagged with motion vectors. If the R1 state update doesn't land in time, the renderer can extrapolate the "current" position by applying those motion vectors to objects in the scene.
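A minimal sketch of that fallback, assuming each object carries a last-known position and a velocity-style motion vector (the names and structure here are illustrative, not any actual Apple API):

```python
from dataclasses import dataclass

@dataclass
class SceneObject:
    pos: tuple  # last known position (x, y, z), meters
    vel: tuple  # motion vector (vx, vy, vz), meters/second

def extrapolate(obj: SceneObject, dt_s: float) -> tuple:
    """Predict the object's position dt_s seconds past its last update."""
    return tuple(p + v * dt_s for p, v in zip(obj.pos, obj.vel))

# A hypothetical object moving 0.5 m/s along x; the state update
# missed one ~16 ms frame, so advance it by 0.016 s.
ball = SceneObject(pos=(0.0, 1.5, -2.0), vel=(0.5, 0.0, 0.0))
print(extrapolate(ball, 0.016))
```

This is the same basic idea as the late-stage reprojection/spacewarp techniques shipping headsets use, just applied per-object instead of per-frame.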
> and can write out state to the shared main memory
I think it would actually make more sense for the M1 to treat the R1 as a display that it writes final composited frames to; the R1 then integrates the output from the M1 into the rest of the scene it's rendering from the other sensors. I.e., the output of the M1 is essentially another camera input to the R1 (well, camera plus multi-channel audio).