I've learned a lot in the last week about Kalman filters, the Direct Linear Transformation, the Perspective-n-Point problem, and more. The big breakthrough was realizing that you can treat the lighthouse and its sensors as just a really high-resolution camera that only sees a few points. And as a bonus, you don't have to deal with the usual computer vision problem of identifying "important" features of an image, since each sensor is already a known point on the tracked object.
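To make the "camera" idea concrete, here is a minimal sketch of how a pair of sweep angles could be turned into a point on a virtual image plane. It assumes both angles are measured in radians from the base station's forward axis and that the image plane sits at unit distance; the function names are made up for illustration and don't come from any library.

```python
import math

def point_to_sweep_angles(x, y, z):
    """Angles at which the horizontal and vertical sweeps would hit a sensor
    at (x, y, z), with z pointing away from the base station (assumed frame)."""
    return math.atan2(x, z), math.atan2(y, z)

def sweep_angles_to_image_point(horizontal_rad, vertical_rad):
    """Map the two sweep angles to coordinates on a virtual image plane at
    unit distance, the same (x/z, y/z) a pinhole camera would report."""
    return math.tan(horizontal_rad), math.tan(vertical_rad)

# A sensor 2 m in front of the base station, 1 m to the side, 0.5 m up.
h, v = point_to_sweep_angles(1.0, 0.5, 2.0)
u, w = sweep_angles_to_image_point(h, v)
print(u, w)
```

Once the angles are in this form, the sensor readings look just like pixel coordinates from an ideal pinhole camera, which is what the standard pose algorithms expect.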
Perspective-n-Point (PnP) is the problem of solving for a camera's position and orientation given n points whose 3D positions are known and whose 2D projections the camera is able to see. This is effectively the problem we need to solve. (Technically, we want to know the position of the object relative to the camera, but that's a trivial difference.) There's been lots of research into the problem and a number of algorithms exist. The algorithms seem to fall into two categories: iterative approaches that improve on an initial approximation, and algorithms that solve for the pose all at once. One of the most efficient strategies is an algorithm called EPnP, or Efficient PnP, which solves the problem in O(n) time. Once you have a good pose, if you can assume that the tracked object only moved a little between observations, it can be appropriate to use the previously calculated pose as the initial guess for an iterative algorithm to get the new pose.
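To show what "improving on an initial approximation" is measured against, here is a small sketch of the reprojection error that iterative PnP approaches typically try to drive toward zero. This is not EPnP or any particular solver, just the cost function, and it assumes an ideal pinhole camera with unit focal length; the names `project` and `reprojection_error` are hypothetical.

```python
def project(point, rotation, translation):
    """Transform a 3D model point into the camera frame, then project it
    onto the unit-distance image plane."""
    cam = [
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]
    return cam[0] / cam[2], cam[1] / cam[2]

def reprojection_error(model_points, observed, rotation, translation):
    """Sum of squared distances between where the candidate pose says each
    point should appear and where it was actually observed."""
    total = 0.0
    for point, (u, v) in zip(model_points, observed):
        pu, pv = project(point, rotation, translation)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total

# Four sensors on a 10 cm square, seen from 2 m away with no rotation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
sensors = [(-0.05, -0.05, 0.0), (0.05, -0.05, 0.0),
           (0.05, 0.05, 0.0), (-0.05, 0.05, 0.0)]
true_translation = [0.0, 0.0, 2.0]
observed = [project(p, identity, true_translation) for p in sensors]

print(reprojection_error(sensors, observed, identity, true_translation))
print(reprojection_error(sensors, observed, identity, [0.1, 0.0, 2.0]))
```

An iterative solver would start from a guess such as the previous pose and adjust the rotation and translation until this error stops shrinking.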
One concern is that the implementations of EPnP (and I suspect the other algorithms as well) work on floating-point values, not integers. The Cortex-M3 in the Arduino Due does not have a floating-point unit, and by at least one crude measure, floating-point operations take ~40 times longer than integer operations. I'm doubtful that these algorithms would lend themselves to an integer variant of the solution.
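For a sense of what an integer variant would involve, here is a rough sketch of fixed-point arithmetic, where every value is stored as an integer scaled by a power of two. The Q16.16 format (16 fractional bits) is an arbitrary choice for illustration, and Python is used only to show the idea; on the microcontroller this would be integer code with wider intermediates.

```python
FRACTION_BITS = 16          # assumed Q16.16 format, chosen for illustration
ONE = 1 << FRACTION_BITS    # the fixed-point representation of 1.0

def to_fixed(value):
    """Convert a float to its scaled integer representation."""
    return int(round(value * ONE))

def to_float(fixed):
    """Convert a scaled integer back to a float (for inspection only)."""
    return fixed / ONE

def fixed_mul(a, b):
    """Multiply two fixed-point values; the raw product carries twice the
    fractional bits, so shift the extra ones back out."""
    return (a * b) >> FRACTION_BITS

def fixed_div(a, b):
    """Divide two fixed-point values; pre-shift the numerator so the
    quotient keeps its fractional bits. Python's // rounds toward negative
    infinity, unlike C's truncating division."""
    return (a << FRACTION_BITS) // b

print(to_float(fixed_mul(to_fixed(1.5), to_fixed(2.25))))
print(to_float(fixed_div(to_fixed(1.0), to_fixed(4.0))))
```

Every multiply and divide needs this kind of rescaling, and the range and precision are fixed up front, which is part of why porting a numerically delicate algorithm to integers is not straightforward.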
And, just to throw a wrench into all of the above goodness, it's worth noting that the Lighthouse technology isn't quite the same as a camera. That's because the two sweeps of the lasers (to detect the horizontal and vertical angles) do not occur at exactly the same time. In fact, they're ~8 ms apart. While a simple algorithm may ignore this, an algorithm targeting maximum precision would need to take it into account. For high precision, integrating the values from an inertial measurement unit (IMU) would also be a good idea, as the Vive controllers and headset do. To integrate all of these different measurement updates into a single pose, Kalman filtering appears to be the way to go.
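As a toy illustration of how a Kalman filter folds separate measurements into one estimate, here is a one-dimensional sketch. It tracks a single scalar (say, one coordinate of the position) with a "stays where it was" motion model; a real pose filter would carry a full state vector and covariance matrix, and all of the variance numbers below are made up.

```python
def kalman_predict(estimate, variance, process_variance):
    """Time step: the estimate stays put, but uncertainty grows."""
    return estimate, variance + process_variance

def kalman_update(estimate, variance, measurement, measurement_variance):
    """Blend in a measurement, weighted by how much it is trusted
    relative to the current estimate."""
    gain = variance / (variance + measurement_variance)
    new_estimate = estimate + gain * (measurement - estimate)
    new_variance = (1.0 - gain) * variance
    return new_estimate, new_variance

# Start with a vague guess, then fold in two readings of differing quality.
estimate, variance = 0.0, 1.0
estimate, variance = kalman_update(estimate, variance, 1.0, 1.0)    # noisy reading
estimate, variance = kalman_predict(estimate, variance, 0.01)       # time passes
estimate, variance = kalman_update(estimate, variance, 1.2, 0.05)   # precise reading
print(estimate, variance)
```

Because each update is applied whenever its measurement arrives, readings taken at different times, like the two sweeps or an IMU sample, can be handled one at a time rather than being forced into a single simultaneous snapshot.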