I started wondering how much this was going to cost me, and decided to try to build it with what I already have - at least for a first prototype. If funding should somehow magically appear, I can consider the ideal components. Admittedly, my junk pile may be larger than most, but I think it's a good exercise to keep the costs down at first. In this log, I'm going to consider the following:
- Projector selection
- Camera selection
- Computer selection
- Thermal imager selection
- Vision Libraries
- Voice recognition
- Speech synthesis
- Instrumentation Interface
- Licensing
I think you could probably put together a minimal system for $200 if you had to buy everything.
I have some initial ideas about the computer vision algorithms, but I'll write those up in a separate log.
Projector Selection
The projector is really the limiting factor in this system. High-resolution projectors are expensive.
I have two projectors that I can try. One is the standard office type with incandescent bulb and a 1024x768 resolution. It's very old and represents a previous technology. More interesting is one I bought recently for about $80.
You can check it out at Amazon here. The resolution is only 800x480, which seems to be very common in inexpensive LED projectors. There are many variations on these on the market - and almost all of them say "1080p", which simply means they'll accept a 1080p signal and down-sample to 800x480 for display. The price of this class of projectors seems to range from around $55 on the low end up to maybe $120, with no solid way to tell what you're getting as far as I can tell. They're all over-hyped and up-spec'd, but at least they seem to work.
What limitations does the 800x480 resolution imply? If you want 1mm pixels on the bench, then the bench area is limited to 80x48 cm. This seems like a generous working area. If, instead, you want a higher resolution of 0.5mm pixels, you now get a 40x24cm bench-top. This is probably as small as you want to go, but the extra resolution might be useful for the you-pick-and-place mode.
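The arithmetic above is simple enough to put in a helper for comparing other projectors later. This is just a sketch; the function name and defaults are mine, not part of any library.

```python
def bench_size_cm(res_px=(800, 480), pixel_mm=1.0):
    """Return the (width, height) in cm covered by a projector with the
    given native resolution when each pixel is pixel_mm wide on the bench."""
    width_px, height_px = res_px
    return (width_px * pixel_mm / 10.0, height_px * pixel_mm / 10.0)


for pitch_mm in (1.0, 0.5):
    w_cm, h_cm = bench_size_cm((800, 480), pitch_mm)
    print(f"{pitch_mm} mm pixels: {w_cm:g} x {h_cm:g} cm")
```

Passing a different resolution tuple gives the equivalent bench area for a higher-resolution projector at the same pixel pitch.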
For other applications, this resolution is less than ideal. In one mode, I envision projecting the display from an oscilloscope on the bench-top. This is easier than it might sound with modern instruments. For example, the #Driverless Rigol DS1054Z screen capture over LAN project shows how you can capture screenshots from the Rigol DS1054Z scope through the LAN port. These could easily be captured and displayed on the desktop, probably with a decent update rate. Unfortunately, the DS1054Z has an 800x480 screen, which would use the entire bench-top with this projector. You might reduce the size of the image - maybe 1/2 scale would still be readable, or capture the waveforms instead of a screenshot and draw them in a smaller area. I will have to experiment with this a bit.
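As a rough sketch of what the screenshot capture might look like from Python: SCPI instruments return binary data as an IEEE 488.2 definite-length block (a '#', one digit giving the number of length digits, the length, then the payload). The code below parses that framing. The TCP port (5555) and the `:DISP:DATA?` query are what I understand the DS1000Z series to use, but treat both as assumptions and check them against the Rigol programming guide. The function names are my own.

```python
import socket


def read_block(read_exact):
    """Read one IEEE 488.2 definite-length block using read_exact(n),
    a callable that returns up to n bytes from the instrument."""
    if read_exact(1) != b"#":
        raise ValueError("missing '#' block marker")
    ndigits = int(read_exact(1))
    if ndigits == 0:
        raise ValueError("indefinite-length blocks are not handled here")
    length = int(read_exact(ndigits))
    payload = read_exact(length)
    if len(payload) != length:
        raise ValueError("truncated block")
    return payload


def fetch_screenshot(host, port=5555, timeout_s=10.0):
    """Ask the scope for its screen image and return the raw image bytes.
    Port and query string are assumptions, see the note above."""
    with socket.create_connection((host, port), timeout=timeout_s) as sock:

        def read_exact(n):
            # recv() may return short reads, so loop until we have n bytes.
            buf = bytearray()
            while len(buf) < n:
                chunk = sock.recv(n - len(buf))
                if not chunk:
                    break
                buf.extend(chunk)
            return bytes(buf)

        sock.sendall(b":DISP:DATA?\n")
        return read_block(read_exact)
```

Something like `open("scope.bmp", "wb").write(fetch_screenshot("192.168.1.50"))` would then save one frame; the address is a placeholder. Scaling the image to half size for projection could happen after decoding.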
There is an issue with any consumer projector, though. They are designed for larger images, so they won't focus up close and produce images with 1mm (or smaller) pixels. I can think of two solutions - either open the projector and modify it for closer focus, or add an external lens. I've modified the focus range on camera lenses before, and I really don't enjoy it, so the external lens it is.
At one point, I bought a cheap ($8) set of "close-up" filters for a 50mm SLR lens like these:
They let you do macro photography with your normal lens. They're not color corrected or even anti-reflection coated, so the image quality is less than spectacular, but they let you focus on close objects. Since projectors are just cameras in reverse, the lenses will let the projector focus a closer, smaller image. I found the lens set in a box of camera junk this morning, and tried them in front of the projector lens. They work great! With the +4 lens, you can focus all the way down to an image around 20cm wide. The +1 or +2 lenses will probably be most appropriate for the prototype, depending on how high the projector is.
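To get a feel for which close-up lens suits a given projector height, a thin-lens estimate is probably good enough. A lens rated at +D diopters has a focal length of 1/D meters, and lens powers approximately add when the close-up lens sits right against the projector lens. The sketch below makes exactly those simplifying assumptions; real projector optics will deviate, so treat the numbers as a starting point for experiments.

```python
import math


def focus_distance_m(diopters, projector_focus_m=math.inf):
    """Approximate distance at which the image comes into focus after adding
    a close-up lens of the given strength. projector_focus_m is the distance
    the bare projector is focused at (infinity by default).
    Thin-lens approximation: 1/s_new = 1/s_old + D."""
    base_power = 0.0 if math.isinf(projector_focus_m) else 1.0 / projector_focus_m
    total_power = base_power + diopters
    if total_power == 0:
        return math.inf
    return 1.0 / total_power


for d in (1, 2, 4):
    print(f"+{d} lens, projector focused at infinity: {focus_distance_m(d):.2f} m")
```

Racking the projector's own focus toward its near limit shortens these distances further, which is why the second parameter is there.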
So, the total cost for projector and lenses: about $90 ($80 for the projector plus $8 for the lens set). I assume whatever I build or print to hold the lens in place is free.
Higher resolution projectors cost more. It looks like 720p versions can be had for $200, while 1080p costs $500 or so. I'll keep my eye out for bargains.
Camera Selection
Cameras are cheap these days, but there are still a few issues. Webcams are the easiest way to go, but they're less than ideal for this system for a few reasons. First, very few (if any) webcams have interchangeable lenses these days. I'd ideally like a camera that used CS-mount lenses like those found on security "box cameras," which are inexpensively available in many focal lengths. Instead, I will have to settle for whatever focal length comes on the webcam, and adjust the camera distance instead.
A second issue with modern webcams is auto-focus. At first, this sounds like a good thing. But, from a computer vision point of view, changing the focus of the lens changes the camera model. In other words, if the camera is geometrically calibrated at one focus setting and then changes its focus, the calibration may no longer be valid. I am going to avoid auto-focus cameras for now. This unfortunately limits the resolution, because higher resolution cameras more easily show focus errors, so manufacturers are more likely to include auto-focus on higher-resolution cameras.
I happened to have two Logitech C270 cameras here for stereo vision experiments. They're $20 each, and support 1280x720 resolution at 30FPS using MJPEG compression. One of them will do for now.
A final problem with webcams is the lack of decent mounting options. All cameras should have a 1/4"-20 standard tripod thread. Period. When I first bought these cameras, I designed this adapter that clips on to them and allows you to mount them properly:
Computer Selection
Although getting the vision algorithms to run in real-time may be a challenge, I've decided to start with a Raspberry Pi 2. It's tempting to go with the 3, but I already have the 2 here, and the 3 isn't really that much faster. Let's call it $50 with the SD card, power supply, case, and all the other crap you need to add to make it usable. I can always fall back to using a desktop, but considering what I was able to do on desktops ten years ago, the Pi will probably do.
Thermal Imager Selection
The FLIR Leptons are awesome, but could easily double the cost of this system. Instead, I'm going to start with a Panasonic Grid-EYE sensor. They run about $20, then you need a PCB to put it on. Let's call it $30.
The resolution is very low (8x8), but it should give a rough indication of the temperature distribution, and could trigger a warning if dangerous temperatures were detected.
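The warning logic could be as simple as scanning the 8x8 frame for anything over a limit. This sketch assumes the frame has already been read from the sensor and converted to degrees Celsius; the 60 degree default is a number I picked for illustration, not a recommendation.

```python
def hot_pixels(frame_c, limit_c=60.0):
    """Return (row, col, temp_c) for every pixel at or above limit_c.
    frame_c is an 8x8 list of lists of temperatures in degrees Celsius."""
    return [
        (r, c, temp)
        for r, row in enumerate(frame_c)
        for c, temp in enumerate(row)
        if temp >= limit_c
    ]


# Example: a room-temperature bench with one hot component.
frame = [[25.0] * 8 for _ in range(8)]
frame[2][5] = 85.0
for r, c, temp in hot_pixels(frame):
    print(f"Warning: {temp:.1f} C at grid cell ({r}, {c})")
```

Since the grid position maps to a patch of the bench, the same list could later drive a projected highlight over the hot spot.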
I don't have one of these, so I'll have to shell out some cash to get one.
At some point, I can revisit the Lepton decision.
Vision Libraries
I'm going to use OpenCV as the base for the vision code. It has its quirks, but it's a decent platform to build on. Image capture is there already, as well as a lot of the low-level building blocks I'll want to use. Robust estimation of planar homographies is a single function call, for instance. No more screwing around with SVD.
Voice Recognition
I'm envisioning this system as a "heads-down" display. You shouldn't have to look away from what you are working on, except perhaps for a tool change. To facilitate this, I'm going to add voice recognition. It's 2018, and you can talk to computers, so there's no reason I shouldn't be able to add this fairly easily. I did a brief survey of the higher-level toolkits available, and the big selling points seem to be how they emulate Alexa or Siri. That's not interesting to me. The state of AI today is that limited-domain applications perform better. I found the CMUSphinx library for recognition, which a few of the higher-level packages use. I'm going to give that a try.
An example interaction might look like:
"Betty, show the oscilloscope"
"Oscilloscope displayed"
"Betty, decode RS-232"
"Showing decoded RS-232 stream"
"Betty, make me a sandwich"
"Bleep blop. You are a sandwich"
I prefer to speak with female computers. You could switch it to Bob mode if you like.
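Whatever the recognizer ends up being, the layer on top of it can stay very small for a limited-domain vocabulary. The sketch below takes already-recognized text (the speech recognition step itself is not shown) and maps it to a spoken reply, using the example interaction above. The command table, the fallback reply, and the function name are placeholders of mine.

```python
import re

# Recognized phrase (normalized) -> spoken reply.
COMMANDS = {
    "show the oscilloscope": "Oscilloscope displayed",
    "decode rs-232": "Showing decoded RS-232 stream",
    "make me a sandwich": "Bleep blop. You are a sandwich",
}


def respond(utterance, name="betty"):
    """Return the reply for an utterance, or None if it was not addressed
    to the assistant (that is, it did not start with its name)."""
    # Lowercase and drop punctuation, keeping hyphens for names like RS-232.
    words = re.sub(r"[^a-z0-9\- ]", " ", utterance.lower()).split()
    if not words or words[0] != name.lower():
        return None
    command = " ".join(words[1:])
    return COMMANDS.get(command, "Sorry, I don't know that one")


print(respond("Betty, show the oscilloscope"))
print(respond("Bob, decode RS-232", name="Bob"))
```

Requiring the name as the first word keeps ordinary bench muttering from triggering commands, and switching to Bob mode is just a different `name` argument.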
Speech Synthesis
Betty needs to talk as well as listen. I'm going to start with the CMU flite synthesis package, mostly because I've used it before. It's not great, but it's easy. I'll also be evaluating others as I find them. The demos for MaryTTS sound very nice, but I haven't tried programming with it yet.
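The easiest way I know to drive flite from a script is through its command-line tool, where `-t` supplies the text and `-o` optionally writes a WAV file instead of playing audio. This sketch assumes the `flite` binary is installed and on the PATH; the helper names are mine.

```python
import shutil
import subprocess


def flite_command(text, wav_path=None):
    """Build the flite command line for speaking text, optionally to a file."""
    cmd = ["flite", "-t", text]
    if wav_path is not None:
        cmd += ["-o", wav_path]
    return cmd


def say(text):
    """Speak text with flite. Returns False if flite is not installed."""
    if shutil.which("flite") is None:
        return False
    subprocess.run(flite_command(text), check=True)
    return True
```

Calling `say("Oscilloscope displayed")` blocks until the audio finishes, which is probably fine for short confirmations.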
Instrumentation Interface
Many instruments come with communication ports these days. Lower-end devices may have RS-232, USB, or LAN connectivity, while others may sport GPIB. For some devices, like the Rigol oscilloscopes and spectrum analyzers, the API over the LAN is very straightforward and well-documented, so considering the popularity of these devices, it might make sense to support them directly.
To support the widest variety of devices, it might make sense to use the Sigrok library.
Licensing
I release everything I do under a permissive license these days. This is at odds with some of the licenses of the libraries I may use, so I won't be distributing binaries. You want it, you build it.
Next Up
Planar homographies for fun and profit!