Project Update 3
01/21/2023 at 01:07
What's New
1. Mixed Reality FPV Display
2. Track Builder
3. Updated GUI displays
I am super excited to get this update out, because this is the first time I have a proper mixed reality FPV display. If you remember from last time, I had 2 separate feeds for the game view and the FPV livestream - but it was pretty difficult to fly this way and led to a lot of crashes (just watch the end of the last video lol). So for this update, I spent most of my time trying to unify those two video feeds by overlaying the virtual objects onto the livestream. You should definitely watch the demo video to see the full effect - I have to say it looks pretty cool in my extremely biased opinion :)
Another big new feature in this update is a Track Builder tool that makes it easy to design custom race tracks in your room. The main idea is to use the drone and its position estimation system to place virtual 3D markers in your space, which you can then use as a reference to place your gates and other virtual objects. This made it much easier for me to place gates in the environment and especially helped me make sure I wasn't placing them in physically unreachable spaces (like partially through a wall/bed/table/etc). There's a minimal sketch of the idea just below.
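To make the idea a bit more concrete, here's a minimal sketch of the marker-placement part, assuming the drone's estimated pose is already being published as a geometry_msgs/PoseStamped on a placeholder topic (this is illustrative, not my exact implementation):

```python
# Sketch of the Track Builder marker-placement idea: cache the drone's
# latest estimated pose and record it as a marker on each keypress.
# The topic name is a placeholder for illustration.
import rospy
from geometry_msgs.msg import PoseStamped

latest_pose = None

def pose_callback(msg):
    global latest_pose
    latest_pose = msg

def main():
    rospy.init_node("track_builder_markers")
    rospy.Subscriber("/crazyflie/pose", PoseStamped, pose_callback)

    markers = []
    while not rospy.is_shutdown():
        input("Fly the drone to the next gate location and press Enter...")
        if latest_pose is not None:
            p = latest_pose.pose.position
            markers.append((p.x, p.y, p.z))
            print("Marker %d placed at (%.2f, %.2f, %.2f)"
                  % (len(markers), p.x, p.y, p.z))

if __name__ == "__main__":
    main()
```

The recorded marker positions can then be loaded into Unity as reference points for placing gates.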
Apart from that, I also added a few more GUI displays to show some relevant information like battery charge and remaining runtime. These are primarily there to help with some of the range anxiety feelings I wrote about in my previous update post.
Challenges
1. Getting the video feed into Unity
1. Also fixing the memory leak I introduced in my implementation
2. Overlaying the game objects onto the FPV feed
3. Figuring out how to build interesting race tracks
The first big challenge I faced this time was getting the video feed into Unity and rendering it. If you remember, this is also where I got stuck last time (see my previous update for the details). The quick tl;dr is that Unity doesn't support the MJPG video format (which is how the video frames were being encoded by my video receiver). So instead of using Unity's WebCamTexture class to get the image data in, I decided to do the simplest thing I could think of, which was to decode the video frames outside Unity and then pipe them in over the ROS network. Now, if I were trying to build an optimized low-latency, real-time system this would not be a great choice, because I'm potentially introducing a whole slew of non-deterministic buffering delays and such...but for now my goal was just to get something workable. I'm also ignoring the fact that ROS1 doesn't support shared memory unless you're using nodelets (which I'm not), which means I'm potentially using a ton of unnecessary RAM since every image gets decoded, stored in memory, then copied again into a buffer to be sent over the ROS network (and potentially copied back out when it's received, if you're not careful). But, for the sake of getting something working, I decided to take on a large amount of technical debt just to see if it would even work.
My implementation was to set up a very simple video server node that connects to the webcam device, reads each frame and sends it over the ROS network as a (compressed) image message. In Unity, I then set up a listener that watches for incoming image messages and renders them on a simple plane texture.
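To make that concrete, here's a minimal sketch of what the video server side looks like, assuming OpenCV handles the MJPEG decode and the frames go out as standard sensor_msgs CompressedImage messages (the topic name and device index are placeholders, not my exact setup):

```python
# Minimal sketch of the video server node: grab frames from the receiver's
# webcam device, JPEG-compress them, and publish them over the ROS network.
# Topic name and device index are placeholders.
import cv2
import rospy
from sensor_msgs.msg import CompressedImage

def main():
    rospy.init_node("fpv_video_server")
    pub = rospy.Publisher("/fpv/image/compressed", CompressedImage, queue_size=1)
    cap = cv2.VideoCapture(0)  # the analog receiver shows up as a webcam device

    rate = rospy.Rate(30)
    while not rospy.is_shutdown():
        ok, frame = cap.read()  # OpenCV decodes the MJPEG frame to a BGR array
        if not ok:
            continue
        msg = CompressedImage()
        msg.header.stamp = rospy.Time.now()
        msg.format = "jpeg"
        msg.data = cv2.imencode(".jpg", frame)[1].tobytes()  # re-encode for transport
        pub.publish(msg)
        rate.sleep()

    cap.release()

if __name__ == "__main__":
    main()
```

The decode/re-encode/copy chain here is exactly the technical debt I mentioned above, but it keeps the Unity side simple.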
So how bad was this implementation? Well, it actually worked decently well, which surprised me. I was able to access and render the image data in Unity, and even though I didn't do any actual latency testing this time, it didn't seem to be introducing a significant amount of lag either. Unfortunately, what I did introduce was a memory leak. It took me a while to notice it, but the game was consuming an increasing amount of memory over time, leading to my computer freezing if I played long enough. Luckily, I was able to track it down to the way I was rendering the images: I was creating a new Unity material every frame, which would never get destroyed and release its memory. It was easy enough to fix by pre-initializing a single material and just updating it each frame instead of creating a new one each time. So I'm chalking that up to my inexperience with Unity, but hey, at least I figured it out. And my computer hasn't frozen since (also, memory usage is constant over time now) ;)
Now that I had the image data in Unity, my next challenge was figuring out how to overlay the virtual game objects onto it in a way that looked visually realistic. The image data coming into Unity was 2D, whereas everything else in the game was 3D (the drone, the camera, gates, etc...). I wanted the image data to be rendered behind everything else in the environment, so I needed a way to bring everything else into the foreground. I brainstormed a few approaches, several of which required directly manipulating pixel values, either in Unity (through a shader) or by passing data out to a ROS node first. These were both pretty high-effort approaches, but after thinking about it (and getting some input from my cousin William, who's way better at Unity than I am), we realized a simpler approach: render the image on a flat 2D plane fixed in front of the game camera, far enough back that the environment objects appear in front of the plane. In the real world this would be totally unrealistic (basically the equivalent of walking around with an enormous TV screen rigidly fixed to your head like 30 ft away), but it worked pretty well and ended up being the implementation I went with. The result is what's shown in the demo video above.
What's Next
So with this update, I finally have a working proof-of-concept demonstrating a mixed reality drone racing game using both Unity and ROS. This is maybe the first really meaningful milestone I've hit so far, but this is just the beginning. There are a lot of directions I can take this, but I think the main things I want to focus on are:
1. Improving the position tracking
2. Improving the flight modes
3. 3D printing safety guards for the drone and camera
4. Open Source
One thing that is quite clear from watching the gameplay footage in the demo video is that the virtual objects do not appear completely fixed in place like they should. For example, the gates sometimes move and shift around even when the drone is not moving, or don't always get closer as you move towards them. This is directly due to error in the drone's position estimation, which is computed by the drone's flight controller using measurements from a single lighthouse base station and the onboard IMU. The position estimate is primarily driven by the base station measurements, but these require the drone to have a direct line of sight to the lighthouse and to be within range of the lighthouse's laser sweeps. When those conditions aren't met, the drone only has its IMU readings to rely on, which are both noisy and subject to large amounts of drift over time. I have a couple approaches in mind that I'd like to explore for my next update. The simplest is to just set up a second base station (which is actually the recommended number of base stations to use). Depending on how well that works, a second approach would be to use visual odometry to estimate motion from the camera feed and augment the drone's position estimate. Personally, this second approach sounds more fun, but we'll see how much a second base station improves things.
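For the curious, here's a purely exploratory sketch of the kind of frame-to-frame motion estimate I have in mind for the visual odometry route, using OpenCV's ORB features and essential-matrix recovery. Nothing here is integrated with the drone's estimator yet, and the camera matrix is a placeholder that would need to come from a real calibration of the FPV camera:

```python
# Exploratory sketch of frame-to-frame visual odometry with OpenCV.
# Not integrated with the drone's estimator; K is a placeholder camera matrix.
import cv2
import numpy as np

K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])  # assumed 640x480 intrinsics, illustration only

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

def relative_motion(prev_frame, curr_frame):
    """Estimate rotation and (unit-scale) translation between two frames."""
    kp1, des1 = orb.detectAndCompute(cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY), None)
    kp2, des2 = orb.detectAndCompute(cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY), None)
    if des1 is None or des2 is None:
        return None
    matches = matcher.match(des1, des2)
    if len(matches) < 8:
        return None
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    if E is None:
        return None
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)
    return R, t  # monocular, so translation is only known up to scale
```

The up-to-scale translation is the catch: a single camera can't recover absolute distances on its own, so this would only ever be a correction layered on top of the lighthouse/IMU estimate.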
Beyond the position estimation, I also want to add more flight modes. For those of you familiar with FPV drones, right now this drone only operates in attitude/position hold mode, meaning it tries to level itself out in the absence of controller inputs. This makes it easier for me not to crash as I continue development (and also because I'm not a great FPV pilot), but it is a very conservative and limiting flight mode. Most experienced FPV pilots switch to what's called rate/acro mode quite early, and basically every FPV pilot flies in acro mode during a race or even when freestyling. So I definitely want to support this flight mode, because I want FPV pilots to be comfortable flying this drone too.
A related improvement is to design and 3D print a camera mount and some safety guards for the drone to provide some protection during crashes. This should be pretty straightforward to do but will give me a lot more peace of mind when testing out new flight modes.
I also want to open source the project - this has been on my to-do list for a while, but a lot of the work has been done in Unity, and I'm not a huge fan of the way Unity works with version control. But I'll figure something out and share it in my next update.
So that's it for now, thanks for reading and definitely let me know if you find any of this interesting or have questions/suggestions!
Project Update 2
12/20/2022 at 13:57
What's New
Picking up from last time, my goal was to get the game to a playable state. If you remember, I had no flight controls and no camera feed, so for this update my focus was on getting those 2 components implemented.
Adding the flight controls was relatively straightforward. To make things easy, I wanted to use a generic PS3-type controller with some basic inputs set up for forward/lateral/vertical velocities, yaw rate, takeoff/landing and motor kill/reset. Luckily, Unity makes it really easy to support gamepad controllers, so I just set up a few callbacks listening for inputs, which I then mapped to velocity commands and published over ROS. I then updated the crazyflie ROS node to subscribe to these command messages and execute them on the drone using the python cflib library. This works pretty seamlessly and feels responsive enough to fly with (after some tuning).
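For a rough idea of the drone-side of that pipeline, here's a sketch that assumes the velocity commands arrive as geometry_msgs/Twist on a placeholder topic and are forwarded to the Crazyflie via cflib's commander (the radio URI and topic name are illustrative, not my exact setup):

```python
# Sketch of the drone-side command node: subscribe to velocity commands from
# the game and forward them to the Crazyflie with cflib.
# The URI and topic name are placeholders.
import math

import rospy
from geometry_msgs.msg import Twist

import cflib.crtp
from cflib.crazyflie import Crazyflie
from cflib.crazyflie.syncCrazyflie import SyncCrazyflie

URI = "radio://0/80/2M/E7E7E7E7E7"  # example radio address

def main():
    rospy.init_node("crazyflie_command_node")
    cflib.crtp.init_drivers()

    with SyncCrazyflie(URI, cf=Crazyflie()) as scf:
        cf = scf.cf

        def cmd_callback(msg):
            # Forward/lateral/vertical velocities in m/s, yaw rate converted
            # from rad/s to deg/s. The Crazyflie expects setpoints at a steady
            # rate, which the game provides by publishing continuously.
            cf.commander.send_velocity_world_setpoint(
                msg.linear.x, msg.linear.y, msg.linear.z,
                math.degrees(msg.angular.z))

        rospy.Subscriber("/crazyflie/cmd_vel", Twist, cmd_callback)
        rospy.spin()
        cf.commander.send_stop_setpoint()  # cut the motors on shutdown

if __name__ == "__main__":
    main()
```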
By contrast, adding the FPV video feed was a bit more involved and came with a bunch of questions:
- How to send video from the drone
- How to receive video from the drone
- How to render this video in the game
To record video, I decided to use the smallest, cheapest off-the-shelf camera I could find, which ended up being this WolfWhoop WT03 FPV camera + transmitter on Amazon. It weighs about 5 grams, transmits at 25 mW (its lowest power setting), runs off a 3-5V input and draws around 500mA. It seemed like a good option because it was pretty light and low enough on power consumption that it can be powered by the crazyflie's onboard lipo. Additionally, being an analog video transmitter meant it should be relatively low latency.
To receive the video, I needed an analog video receiver. I found the Skydroid 5.8G OTG Receiver on Amazon for around $30, which can receive an analog video stream and output it as a standard webcam feed on my linux pc. The webcam feed produced by this receiver is a sequence of 640x480 frames encoded as MJPEG (basically a sequence of individually JPEG-compressed frames, without any temporal/multi-frame compression).
To render this video feed in the game, I was looking for a quick solution, and my main approach was to try to capture and render the video feed entirely in Unity using the WebCamTexture class. I ran into a fair bit of trouble with this approach, so for this demo I chose to just render the video feed outside of Unity using VLC player. I'm not very happy about this solution, but it worked well enough to give me a feel for the game's playability.
Challenges and Concerns I had
There were a bunch of challenges in getting to this point:
- Sourcing the right components (camera, receiver, etc...)
- Adding the camera to the drone
- Getting the video feed into Unity
- Understanding the camera's impact on battery life
- Understanding the latency of the camera feed
- Concerns around playability
Let's go through them:
Sourcing components - this was not very complicated, but it did require doing a bit of homework to make sure that the camera met the power/weight constraints for the drone and that the receiver would be compatible with it (as well as with my pc). This all seemed to work out though.
Adding the camera to the drone - this was again not super complicated but required some homework (I am also less confident in my soldering/hardware skills than in my software skills). I used the crazyflie's prototyping deck to solder in leads to the power supply and used a JST pin header to connect to the camera. To physically attach the camera to the drone, I just used electrical tape as a quick-and-dirty solution (but it's on my to-do list to build a 3D printed mount).
Getting the video feed into Unity - I feel like this was much more complicated than I was expecting (or than it should have been). As I mentioned in the sections above, I was primarily trying to use Unity to directly capture and render the video feed, since I felt that was the quickest way to get it into Unity. I wanted the stream rendered in Unity because eventually I want to overlay Unity's virtual objects onto the video feed. I knew Unity definitely had support for capturing webcam streams, and I was able to get it working using one of my other web cameras. However, after a fair amount of debugging and testing, I finally realized that the webcam interface for the Skydroid OTG receiver only outputs MJPEG-encoded streams (which Unity does not support), whereas most modern webcams use H.264/H.265 encoding (which Unity does support). To me this meant capturing the webcam feed directly in Unity was pretty much ruled out, which is why I settled for just rendering the video with VLC for now. Figuring this out is definitely one of my big goals for the next update though.
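If you want to check what format a capture device is actually handing out before going down the same debugging path, here's roughly how that check looks with OpenCV (the device index is a placeholder; on Linux, `v4l2-ctl --list-formats-ext` gives a similar answer):

```python
# Quick check of what pixel format a capture device reports,
# e.g. to confirm the receiver really is handing out MJPG frames.
# The device index is a placeholder.
import cv2

cap = cv2.VideoCapture(0)
fourcc = int(cap.get(cv2.CAP_PROP_FOURCC))
codec = "".join(chr((fourcc >> (8 * i)) & 0xFF) for i in range(4))
print("Reported FOURCC:", codec)  # expect something like 'MJPG' or 'YUYV'
cap.release()
```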
Understanding the video feed latency - camera latency is a big concern and is one of the main limiting factors on playability. From my initial flight tests, I could tell that there was a noticeable but not unreasonable amount of latency (the camera felt sufficiently responsive), but I wanted to try to quantify this a bit. As a reference point, I know that most FPV drone racers try to keep their camera latency under 40ms, and that anything over ~200ms becomes distracting and eventually unacceptable. I did some quick tests inspired by this blog post, where I basically pointed the camera back at a display of the video feed, put a stopwatch next to the display and took screenshots so I could see the real time and the time shown in the video feed simultaneously. Here's an example of a captured screenshot. I sampled this a few times and saw latency in the 100-130ms range. In the future, I may try to improve this, but for now I'll just keep an eye on this number.
Understanding the camera's impact on battery life - another big concern of mine is the drone's battery life once the camera feed has been added. I noticed myself getting anxious about how much battery life was remaining while flying (sort of like range anxiety). To help understand and quantify this for myself, I wrote a couple scripts to log the voltage and motor speeds during charging and discharging tests (you can see an example of the results in the following spreadsheet). Compared to the crazyflie's nominal 7 minute battery life, with the fpv camera added I was seeing around 3 minutes of continuous flight time (with reasonably aggressive motor inputs). This helped me set my expectations in terms of flight time, but I think I also want to add some sort of battery level indicator to the game to help with this.
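The logging itself is pretty simple with cflib. Here's a stripped-down sketch of the kind of script I'm describing, assuming the standard pm.vbat log variable (the radio URI and output filename are placeholders):

```python
# Stripped-down sketch of a battery logging script: record battery voltage
# over time to a CSV using cflib's logging framework.
# The URI and filename are placeholders.
import time

import cflib.crtp
from cflib.crazyflie import Crazyflie
from cflib.crazyflie.log import LogConfig
from cflib.crazyflie.syncCrazyflie import SyncCrazyflie
from cflib.crazyflie.syncLogger import SyncLogger

URI = "radio://0/80/2M/E7E7E7E7E7"  # example radio address

def main():
    cflib.crtp.init_drivers()
    log_conf = LogConfig(name="battery", period_in_ms=100)
    log_conf.add_variable("pm.vbat", "float")  # battery voltage in volts

    with SyncCrazyflie(URI, cf=Crazyflie()) as scf, \
         SyncLogger(scf, log_conf) as logger, \
         open("battery_log.csv", "w") as f:
        f.write("time_s,vbat\n")
        start = time.time()
        for timestamp, data, logconf in logger:
            f.write("%.2f,%.3f\n" % (time.time() - start, data["pm.vbat"]))

if __name__ == "__main__":
    main()
```

Running this during a hover test and plotting the CSV is enough to see the discharge curve and estimate usable flight time.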
Lastly, playability concerns - my initial test flights brought up a few concerns around responsiveness (see latency above) and battery life anxiety (see battery life above). However, the biggest issue I faced in terms of playability was the fact that I had 2 separate screens I needed to look at (the rendered game view and the fpv view). In fact, this was the reason I crashed in the flight test demo video above: I was too focused on the game view and not paying any attention to the fpv view. To me this just emphasized the importance of integrating these 2 views together and made it clear that all the important information should be accessible by focusing in one place.
What's Next
For next steps, my main focus is on unifying the fpv and rendered game views and removing the need for 2 separate displays. Ideally, I would like to overlay the rendered game objects onto the fpv feed to create a mixed reality display, but I'm not sure how easy that will be without trying. I'd also like to make some additional quality-of-life/playability improvements, including 3D printing a camera mount + bumper guards for the drone and further tuning the input controls. Lastly, I'd like to publish the code for others to use. This might take me some time, but for the next update hopefully I'll have code to share, a parts list and build instructions so you can try this thing out for yourself!
Update 1
08/01/2022 at 13:34
How does it work / What I've done so far
I put together a quick demo video (linked at the top of the post) just to document the current state of my prototype.
I'm very early in the process, and honestly, I've kind of cheated a bunch just to get something up and running and feel out the concept. Most of what I've done has just been connecting pieces together using off-the-shelf hardware/software. Right now, the prototype basically just proves out the concept of rendering the realtime position of a drone inside of a Unity game and getting all the "piping" set up to get data into the right place. Currently, the information flow is all one-directional from the drone to the PC.
On the hardware side, I'm using Bitcraze's crazyflie drone with its lighthouse positioning deck and SteamVR base stations for estimating the drone's 3D position. State estimation is pretty hard, but thanks to all the hard work done by the crazyflie open source community, this just kind of works out of the box and in realtime (i.e. one of the big reasons why it kind of feels like cheating lol). Communication between the crazyflie and the PC is done using the crazyflie radio dongle.
On the software side, I'm using ROS to handle all the intermediate messaging and obviously Unity for the user interface, game logic and visualization.
Challenges I've run into so far
Getting the state estimate data from the crazyflie into Unity was somewhat interesting to figure out. Basically, the crazyflie computes its 6DoF pose (position and orientation) onboard, then transmits this telemetry over radio to the PC. On the PC, I wrote a simple ROS publisher node that listens for these messages and then publishes them onto a ROS network. To get the data into Unity, I'm using Unity's ROS-TCP-Connector package (and ROS-TCP-Endpoint) which essentially just forwards the messages from the ROS network into Unity. Inside Unity, I wrote a simple script tied to a gameobject representing the drone that takes the data, transforms it into Unity's coordinate frame and uses it to set the gameobject's position. Overall, it's just a lot of forwarding of information (with some annoying coordinate frame transforms along the way).
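As an illustration of those "annoying coordinate frame transforms", here's roughly what the position part of the conversion looks like, under the usual assumption that the ROS side is a right-handed, Z-up frame and Unity is left-handed with Y up (the orientation needs a matching quaternion conversion, omitted here for brevity):

```python
# Rough illustration of converting a position from the ROS convention
# (right-handed: x forward, y left, z up) into Unity's convention
# (left-handed: x right, y up, z forward).
def ros_to_unity_position(x, y, z):
    unity_x = -y  # ROS "left" becomes Unity "right" (sign flip)
    unity_y = z   # ROS "up" maps to Unity's y axis
    unity_z = x   # ROS "forward" maps to Unity's z axis
    return unity_x, unity_y, unity_z

# Example: a drone 1 m forward, 0.5 m left and 2 m up in ROS coordinates
# ends up at (-0.5, 2.0, 1.0) in Unity.
print(ros_to_unity_position(1.0, 0.5, 2.0))
```

The same mapping is what the Unity-side script effectively applies before setting the drone gameobject's position.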
Another important piece of the puzzle (as far as rendering the drone inside a 3D virtual replica of my room goes) was building the room model and calibrating it to my actual room. I can go into more detail for sure, but at a high level I basically just picked a point in my room to be the origin in both the physical and virtual room, put the crazyflie there (aligned with the axes I picked for the origin) and used the crazyflie cfclient tool to center the base station position estimates there. My process was pretty rough as a first pass, and it will very likely have to improve, especially as I move in the mixed reality direction and start rendering virtual objects on a live camera feed.
What's next?
Tactically, the next few steps are to add the FPV view into the game (streaming video data from the drone and rendering it in Unity), which involves more data forwarding (and calibration). In addition, I need to add input controls so you can actually fly the drone. The bigger goals in store are around building out proper gameplay, integrating autonomy (and figuring out where it makes sense), and maybe exploring what VR functionality might look like as opposed to just using a flat display on a PC monitor.
Thanks for reading through this whole update! If you made it this far, I would really love to hear any feedback or questions on this or anything else. It would help me figure out what some additional next steps should be, and I'd be super interested to learn if there are other cool directions I could take this project!