Second Update
Mateo de Mayo
28 February 2021
Summary
While the previous update gave me a bird’s-eye view of some of the topics I wanted to learn more about, these two weeks have been about taking a closer look at the source code of Monado while writing a keyboard and mouse driver for it.
Complementing the OpenXR Specification
I started reading the Monado source code without any particular objective in mind, but now I’m very glad I did so at this stage. Reading an OpenXR implementation nicely complements reading the specification, as I had gaps between understanding why some OpenXR ideas are needed and how they are applied in VR. Reading the code behind these concepts showed me the how, which in turn made the specification’s what make a lot more sense on a re-read.
An example of this is XrSpace. Reading the code you will find an xrt_space_relation type which has a definition along the lines of:
struct xrt_space_relation {
    struct vec3 position;
    struct quat orientation;
    struct vec3 linear_velocity;
    struct vec3 angular_velocity;
};
There is also a so-called xrt_space_graph, a container of a sequence of xrt_space_relations, which lets you “chain” spaces together. With this context alone it becomes much clearer that the concept of an XrSpace is trying to mimic the idea of an inertial frame of reference (though with rotations), and that such frames can be chained together as well. And while this idea can be extracted from the Spaces chapter of the specification, it can get a bit obscured by the API details (even though they are quite important for completeness and for using the spec as a reference once you are working on your own project).
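To make the chaining idea concrete, here is a minimal sketch in C. It does not use Monado’s actual types or functions: vec3, quat, pose, pose_compose and chain_resolve are hypothetical names made up for this illustration, and velocities are left out. It only shows how a pose expressed in one space can be re-expressed in the next space out, one step at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal types for illustration, not Monado's definitions. */
struct vec3 { float x, y, z; };
struct quat { float x, y, z, w; };
struct pose { struct vec3 position; struct quat orientation; };

static struct vec3 vec3_cross(struct vec3 a, struct vec3 b)
{
	struct vec3 r = {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
	return r;
}

/* Hamilton product a * b: rotate by b first, then by a. */
static struct quat quat_mul(struct quat a, struct quat b)
{
	struct quat r = {
	    a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y,
	    a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x,
	    a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w,
	    a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z,
	};
	return r;
}

/* Rotate v by the unit quaternion q. */
static struct vec3 quat_rotate(struct quat q, struct vec3 v)
{
	struct vec3 u = {q.x, q.y, q.z};
	struct vec3 t = vec3_cross(u, v);
	t.x *= 2.0f;
	t.y *= 2.0f;
	t.z *= 2.0f;
	struct vec3 c = vec3_cross(u, t);
	struct vec3 r = {v.x + q.w * t.x + c.x, v.y + q.w * t.y + c.y, v.z + q.w * t.z + c.z};
	return r;
}

/* Re-express a pose given relative to `parent` in the space `parent` lives in. */
static struct pose pose_compose(struct pose parent, struct pose child)
{
	struct vec3 p = quat_rotate(parent.orientation, child.position);
	struct pose r;
	r.position.x = parent.position.x + p.x;
	r.position.y = parent.position.y + p.y;
	r.position.z = parent.position.z + p.z;
	r.orientation = quat_mul(parent.orientation, child.orientation);
	return r;
}

/*
 * steps[0] is the innermost pose; every later step is the pose of the
 * previous step's space in the next space out.
 */
static struct pose chain_resolve(const struct pose *steps, size_t count)
{
	struct pose result = {{0, 0, 0}, {0, 0, 0, 1}}; /* identity pose */
	assert(steps != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		result = pose_compose(steps[i], result);
	}
	return result;
}
```

For example, a controller one meter along X in the space of an HMD that itself sits one meter along X in the world and is turned 90 degrees around Z ends up at roughly (1, 1, 0) in world coordinates.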
Monado Source
To get a picture of the size of Monado, its src/ directory as of today has about 200 KLOC (with src/external/ having about 80 KLOC). That size means it is not too overwhelming to get a good understanding of most of its parts, and that you may even read it from the start of one of its targets’ main functions and be able to follow along.
I started by reading the probe option of the cli target and writing a driver that prints to the console. A good start for writing one is reading target_lists.c and the dummy driver. In the process, you will find how a form of inheritance is implemented through some structs’ use of a base member. An example of this is how almost all drivers implement a custom struct that derives from xrt_device.
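The base member pattern can be sketched in a few lines of C. The names below (device, hello_device, hello_device_create) and the simplified get_tracked_pose signature are hypothetical and only imitate the shape of the idea; they are not Monado’s real xrt_device interface.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical "base class": a plain struct of data and function pointers. */
struct device {
	const char *name;
	void (*get_tracked_pose)(struct device *dev, float out_position[3]);
	void (*destroy)(struct device *dev);
};

/* "Derived" struct: the base must be the first member. */
struct hello_device {
	struct device base;
	float height; /* driver-specific state */
};

/* Downcast, valid because `base` sits at offset 0 of hello_device. */
static struct hello_device *hello_device(struct device *dev)
{
	assert(dev != NULL);
	return (struct hello_device *)dev;
}

static void hello_get_tracked_pose(struct device *dev, float out_position[3])
{
	struct hello_device *hd = hello_device(dev);
	out_position[0] = 0.0f;
	out_position[1] = hd->height;
	out_position[2] = 0.0f;
}

static void hello_destroy(struct device *dev)
{
	free(hello_device(dev));
}

/* Callers only ever see the base pointer. */
static struct device *hello_device_create(float height)
{
	struct hello_device *hd = calloc(1, sizeof(*hd));
	if (hd == NULL) {
		return NULL;
	}
	hd->base.name = "Hello device";
	hd->base.get_tracked_pose = hello_get_tracked_pose;
	hd->base.destroy = hello_destroy;
	hd->height = height;
	return &hd->base;
}
```

Code that holds a struct device pointer can call dev->get_tracked_pose(dev, position) without knowing which driver is behind it, which is what makes the base member behave like a base class.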
It was useful to have some background in the OpenXR specification, as many functions exposed in Monado’s base structs are directly related to OpenXR ones, like xrt_device->get_tracked_pose with xrLocateSpace and xrt_device->get_view_pose with xrLocateViews.
As Monado can work both embedded into an app or as a service hosting multiple OpenXR apps, there is an IPC protocol implemented. Down one of the call-stack rabbit holes, I found references to functions that seemingly did not exist in the codebase and that I later found only existed in the build/ directory. It was most interesting to find out that this IPC protocol seems to be described in a JSON file and that a Python script is responsible for writing the C files that manage the protocol. I’ve not seen this kind of metaprogramming in other C projects and I think it is a clever idea considering how frequently an internal IPC protocol may change. I do wonder, though, how this compares to other solutions using things like gRPC, protobuf, Thrift, etc.
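As a rough idea of the kind of boilerplate such a generator saves you from writing by hand, here is a sketch of what generated-style C for a single call could look like. Everything in it is hypothetical: the call device_get_battery, the ipc_* names, and the message layout are invented for illustration and are not taken from Monado, and the transport is replaced by a direct function call on an in-memory buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One command id per call in the (hypothetical) protocol description. */
enum { IPC_DEVICE_GET_BATTERY = 1 };

/* One request and one reply struct per call. */
struct ipc_device_get_battery_msg { uint32_t cmd; uint32_t device_id; };
struct ipc_device_get_battery_reply { int32_t result; float charge; };

/* Hand-written server logic that the generated dispatch calls into. */
static int32_t handle_device_get_battery(uint32_t device_id, float *out_charge)
{
	if (device_id != 0) {
		return -1; /* unknown device */
	}
	*out_charge = 0.75f;
	return 0;
}

/* Generated-style server side: decode the command id, call the handler, encode the reply. */
static size_t ipc_dispatch(const uint8_t *in, size_t in_size, uint8_t *out, size_t out_size)
{
	uint32_t cmd;
	if (in_size < sizeof(cmd)) {
		return 0;
	}
	memcpy(&cmd, in, sizeof(cmd));
	switch (cmd) {
	case IPC_DEVICE_GET_BATTERY: {
		struct ipc_device_get_battery_msg msg;
		struct ipc_device_get_battery_reply reply = {0};
		if (in_size < sizeof(msg) || out_size < sizeof(reply)) {
			return 0;
		}
		memcpy(&msg, in, sizeof(msg));
		reply.result = handle_device_get_battery(msg.device_id, &reply.charge);
		memcpy(out, &reply, sizeof(reply));
		return sizeof(reply);
	}
	default: return 0; /* unknown command */
	}
}

/* Generated-style client stub: pack arguments, "send", unpack the reply. */
static int32_t ipc_call_device_get_battery(uint32_t device_id, float *out_charge)
{
	struct ipc_device_get_battery_msg msg = {IPC_DEVICE_GET_BATTERY, device_id};
	struct ipc_device_get_battery_reply reply;
	uint8_t in[sizeof(msg)];
	uint8_t out[sizeof(reply)];
	assert(out_charge != NULL);
	memcpy(in, &msg, sizeof(msg));
	if (ipc_dispatch(in, sizeof(in), out, sizeof(out)) != sizeof(reply)) {
		return -1;
	}
	memcpy(&reply, out, sizeof(reply));
	*out_charge = reply.charge;
	return reply.result;
}
```

Every new call needs the same four pieces (id, structs, dispatch case, client stub), which is why generating them from one description is attractive when the protocol changes often.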
Qwerty Driver
I tried to expand my “Hello World” driver into something that did a bit more. As there seemed to be no easy way to emulate HMDs or controllers with keyboard and mouse, I thought it could be a good idea to write a driver named qwerty which does that. I forked the repo and started working on it. Here is a video of the driver in action some days ago:
Some more improvements have been implemented since the video was made: the controllers are now “parented” to the HMD, their poses can be reset, and the movement speed of each device can be changed at runtime with the mouse wheel.
As my original goal was to have an excuse to read more of Monado, the code was at first not properly documented, and my commits scattered XXX and TODO comments over the entire source code as notes for my future self. However, once I shared it in the Monado Discord I saw that some people might want the feature implemented. It would also be pretty nice to have this driver complement physical devices. I will try to eventually get it merged after I solve some problems. In fact, a solution to one of the first problems was merged today, and it was my first contribution to Monado!
There were other problems that took some time to solve when first implementing the driver:
- How to even grab inputs from the keyboard and mouse. Two windows that Monado can open are the debug UI, which uses SDL and ImGui, and the compositor window, which uses Vulkan and XCB. I ended up choosing the debug UI because I thought it would be easier to work with. And indeed, SDL events are very straightforward. However, I don’t like the fact that I’m building on procedures prefixed with oxr_sdl2_hack_* (note the hack in there).
- Figuring out Meson dependencies.
- Things that were not yet implemented in Monado and that I wasn’t sure were my fault. The Discord server was very helpful for that.
- Dealing directly with quaternions for once. I usually used them through game engine abstractions, and while Monado has some of those abstractions, I had a knowledge debt with myself in that regard. Long story short, I really liked this book. It presents the information in a standard definition-theorem-proof way for math concepts related to graphics. I will take the time to read more of its chapters in the future. Another nice resource, though less “to the point”, is the set of amazing 3Blue1Brown videos and their explorable explanation (see also a list of other explorable explanations if you don’t mind losing some hours).
SLAM Introductory Papers
I did some introductory reading on SLAM. In particular, I started with this paper, which seemed like a good overview of SLAM, in the hope of understanding some of it. Before diving into the main topics, the paper recommends that non-expert readers take a look at these tutorials, and so I ended up reading the first one. Unfortunately, while the problem setup uses basic probability and statistics concepts, the main section, “Solutions to the SLAM Problem”, went way over my head, as I have no background in Kalman filters or the Rao-Blackwell theorem. Even so, I got a better understanding of the problem SLAM faces, and of how probabilistic approaches seem to be the standard (though I’m also excited to see how well new techniques using deep learning perform at SLAM).
What’s Next
The next two weeks are my last before starting classes again, so I’ll probably use some of this time to finish non-XR stuff I’ve been putting off. Having said that, I would like to move forward with the Qwerty driver and get it merged.
Besides that, I still have a lot to cover, and might pick something related to one of these:
- SLAM: There are two routes I want to explore. One is the theoretical one, and learning about Kalman filters would probably be a good start for that. The other would be to find some premade library and try to implement something with standard datasets.
- Graphics Theory: The math book mentioned above seems to have many great chapters. I’m very interested in better understanding 3D transformations, so the Transforms chapter would make a nice read.
- Graphics APIs: I will forget about Vulkan for now, as everywhere I look it is recommended to start with OpenGL, so it may be interesting to follow something like https://learnopengl.com/.
- VR: At first I did not consider VR as something separate from graphics, but the more I read Monado and XR-related articles, the more I understand that the combination of problems facing this medium is quite unique, so it may be good to read something that focuses on those. I initially found this book, but might look for other options as well if I decide to go down this path.