-
Final Project
04/19/2014 at 00:28 • 5 comments
Finished a few small things like adding red LEDs that light up when the alarm goes off. Take a look:
The project is basically done at this point. I'll fill in a few more details on the project page this weekend and then consider it complete.
-
Shiny!
04/15/2014 at 05:38 • 0 comments
Finished the enclosure by covering it in metallic paper. Take a look at the results here:
At a craft store I found 12"x12" sheets of metallic paper on thick cardstock that ended up working great. The thickness of the material helped smooth out imperfections in the cardboard. I attached the paper with Super 77 spray glue, which gives a smooth, permanent finish. The only difficulty was working with the glue--you get one shot to line things up and that's it! Cutting the curved shapes and edges was also a little challenging; a very sharp knife is required. The edges aren't perfect, but I'm still very happy with how things turned out.
-
Enclosure Progress
04/14/2014 at 03:23 • 0 comments
Here's an update on the progress I've made building an enclosure for the device:
The enclosure is made out of a couple of oatmeal containers attached to a frame of foam board. The front is a piece of mat board (roughly the same thickness as the oatmeal containers) bent and glued to the frame in a teardrop shape. I plan to finish the enclosure by covering everything in metallic tape, which should give a nice metallic finish without a lot of prep work.
Once the enclosure is done I'll mount the hardware inside and the project will be finished! :)
-
Speech Adaptation
04/07/2014 at 03:26 • 0 comments
In this update I've worked on improving the speech recognition by adapting PocketSphinx's acoustic model to my microphone and voice. I've also added a switch to put the device into a quiet mode where it doesn't sound an alarm or print a ticket when profanity is detected. Take a look at this video:
For the speech model adaptation I followed these steps from the PocketSphinx website: http://cmusphinx.sourceforge.net/wiki/tutorialadapt
With the adapted model the profanity detection is a little better. Some words still aren't recognized very well--for example it still doesn't recognize 'fuck' very often (it sometimes mistakes 'fix' for 'fuck'), but strangely it recognizes 'fucker' very well. That said, I'm pretty happy with where the speech recognition and keyword spotting are right now.

I also added a switch attached to the Raspberry Pi GPIO which puts the software into a quiet mode. This is useful for testing the recognition without wasting printer paper or blaring the audio.

One issue I'm still trying to figure out is why ALSA sometimes cuts off playback of the alarm audio. The full alarm should say "You are fined one credit for violation of the verbal morality statutes.", but you can sometimes hear it cut off "statutes" at the very end. I've tried adding ALSA calls to wait until the playback buffer is drained, and I've even padded the audio file with a second of silence at the end, but playback still gets cut off randomly. I plan to look into this a little more, but if I can't resolve it I don't think it's a big deal.
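For reference, here's a rough sketch of the kind of ALSA playback code involved (simplified, not the actual project code): snd_pcm_drain() is the call that's supposed to block until everything queued has actually played.

```cpp
// Simplified ALSA playback sketch (not the project's actual code). Plays a
// buffer of 16-bit mono samples and then drains the device, so the function
// shouldn't return until the last sample has been played. Build with -lasound.
#include <alsa/asoundlib.h>
#include <cstdint>
#include <vector>

bool play_samples(const std::vector<int16_t>& samples, unsigned int rate) {
    snd_pcm_t* pcm = nullptr;
    if (snd_pcm_open(&pcm, "default", SND_PCM_STREAM_PLAYBACK, 0) < 0)
        return false;
    // 16-bit little-endian, interleaved, 1 channel, allow resampling, 500 ms latency.
    if (snd_pcm_set_params(pcm, SND_PCM_FORMAT_S16_LE,
                           SND_PCM_ACCESS_RW_INTERLEAVED,
                           1, rate, 1, 500000) < 0) {
        snd_pcm_close(pcm);
        return false;
    }
    snd_pcm_sframes_t written = snd_pcm_writei(pcm, samples.data(), samples.size());
    if (written < 0)
        snd_pcm_recover(pcm, written, 0);  // try to recover from an underrun
    snd_pcm_drain(pcm);                    // block until the playback buffer is empty
    snd_pcm_close(pcm);
    return true;
}
```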
Finally, as a next step I hope to make progress on building the enclosure for the device. Originally I didn't think there would be time for it, but based on how much progress I've made I think I can get something together that resembles the real device. My current plan is to use a cardboard tube (like one from an oatmeal container), foam board, and bent cardboard to build the enclosure. Metal tape stuck to the outside should be a cheap and easy way to get a metallic finish. I'm not going for perfect film accuracy--just something that's recognizable as the real thing.
-
Printer
03/30/2014 at 02:14 • 0 comments
Added support for the thermal printer and swapped in a small amplifier & speaker. Take a look at the video for more information:
In the process of integrating the printer I also ported the Adafruit thermal printer Arduino library to POSIX/Linux, if anyone is curious.
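The port is mostly a matter of swapping the Arduino Serial.write()/delay() calls for POSIX equivalents. Here's a rough sketch of the serial setup involved (simplified, not the ported library itself; the /dev/ttyAMA0 device path and the 19200 baud rate are assumptions--adjust for your wiring):

```cpp
// Simplified sketch of opening and talking to a serial thermal printer on Linux.
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>
#include <cstring>

int open_printer(const char* path) {
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0) return -1;

    termios tio{};
    tcgetattr(fd, &tio);
    cfmakeraw(&tio);                // raw 8N1, no echo or special character handling
    cfsetispeed(&tio, B19200);      // these printers typically default to 19200 baud
    cfsetospeed(&tio, B19200);
    tio.c_cflag |= (CLOCAL | CREAD);
    tcsetattr(fd, TCSANOW, &tio);
    return fd;
}

int main() {
    int fd = open_printer("/dev/ttyAMA0");        // Pi UART; could also be a USB adapter
    if (fd < 0) return 1;
    const unsigned char reset[] = {0x1B, 0x40};   // ESC @ : reset command used by the Arduino library
    write(fd, reset, sizeof(reset));
    usleep(50000);                                // give the printer a moment to settle
    const char* line = "VERBAL MORALITY STATUTE VIOLATION\n\n\n";
    write(fd, line, std::strlen(line));
    close(fd);
    return 0;
}
```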
The only gotcha in integrating the printer is that the audio needs to play and the printer needs to print a ticket at the same time. Each task needs periodic updates from the main program--the audio buffer needs to be kept full of samples, and the printer needs to be told what to print next. Since the Pi is rather limited in CPU resources (one core at 700MHz), I went with non-blocking I/O in a tight loop instead of something like multi-threading or multiple processes. So far things work well, and thanks to some nice C++11 features like lambdas the code isn't too ugly.
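To give an idea of the pattern, here's a simplified sketch (not the project code--the device wrappers are stand-ins): each update() call does one small slice of work and returns immediately, and the main loop just keeps polling until nothing has work left.

```cpp
// Minimal sketch of a single-threaded, non-blocking update loop.
#include <functional>
#include <iostream>
#include <vector>

// Stand-in for the real ALSA playback and thermal printer wrappers.
struct FakeDevice {
    const char* name;
    int chunks_left;            // pretend work: chunks of audio / lines of text
    bool update() {             // non-blocking: handle one chunk and return
        if (chunks_left <= 0) return false;
        std::cout << name << ": handled one chunk\n";
        --chunks_left;
        return chunks_left > 0; // true while there's still work to do
    }
};

int main() {
    FakeDevice alarm{"audio", 5};
    FakeDevice printer{"printer", 3};

    // C++11 lambdas keep the loop generic over whatever needs servicing.
    std::vector<std::function<bool()>> tasks = {
        [&]() { return alarm.update(); },
        [&]() { return printer.update(); }
    };

    bool busy = true;
    while (busy) {
        busy = false;
        for (auto& task : tasks)
            if (task()) busy = true;
    }
    return 0;
}
```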
Next step will be to work on the speech recognition. I'm going to investigate adapting the speech model to my microphone and voice to see if it improves the accuracy. Right now the false positive rate isn't too bad, but short swear words like 'fuck' or 'damn' are easy for it to misinterpret because they sound like ordinary words.
I also plan on adding a switch to put it into a 'quiet' mode where it might flash a light when a swear is detected but otherwise not sound the alarm.
Also starting to think a little more about getting everything into a case that looks like the real prop. Luckily the prop isn't that complex--it's really just a cylinder with a curved box on one end. Looking at stuff around the house, an oatmeal container is just about the perfect-size cylinder to fit the printer. Thin cardboard wouldn't be too difficult to bend into the box shape, and aluminum tape covering the whole thing would give it a metallic look. More investigation into that later.
-
Raspberry Pi Support
03/27/2014 at 02:56 • 0 comments
Ported everything over to the Raspberry Pi and it works great. Check out the video:
The only gotcha was that Raspbian ships with USB audio configured off, but after a small config tweak it worked just like on the PC. Very happy to see there aren't any performance problems; it seems to handle the processing in real time without issue on the 700MHz Pi.
Next step is to integrate the thermal printer to print violations, and switch to a small amplifier & speaker. On the software side I need to look at tweaking PocketSphinx to get better keyword spotting accuracy.
If time allows I might even start thinking about trying to get everything into an enclosure that mimics the look of the movie prop. Something made of cardboard covered in aluminum tape would probably be simple enough to capture the look of the prop.
-
Coding Progress
03/26/2014 at 08:55 • 0 comments
In the past couple of days I've sorted out how to use ALSA and now have basic audio output working. The basic skeleton of the app is in place too. Right now it just listens on a mic, runs PocketSphinx's keyword spotting, and plays a sound when a keyword is detected.
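Here's a rough sketch of what that skeleton boils down to (heavily simplified, not the actual code from the repo linked below). It assumes a trunk/5prealpha-era PocketSphinx build that has the keyword spotting ("kws") search; the model paths, keyword file, and alarm stub are all placeholders.

```cpp
// Simplified listen -> spot keyword -> play alarm loop.
// Note: older PocketSphinx releases (0.8 and earlier) pass an utterance id
// to ps_start_utt()/ps_get_hyp(); this sketch uses the trunk-era signatures.
#include <alsa/asoundlib.h>
#include <pocketsphinx.h>
#include <cstdint>
#include <cstdio>

// Stub for the real alarm playback; the actual code queues audio via ALSA.
static void play_alarm() { std::printf("ALARM: verbal morality violation\n"); }

int main() {
    // Configure the decoder for keyword spotting from a list of swear words.
    cmd_ln_t* config = cmd_ln_init(NULL, ps_args(), TRUE,
        "-hmm",  "model/en-us",        // acoustic model directory (placeholder)
        "-dict", "model/cmudict.dict", // pronunciation dictionary (placeholder)
        "-kws",  "keywords.txt",       // one keyphrase per line (placeholder)
        NULL);
    ps_decoder_t* ps = ps_init(config);
    if (!ps) return 1;

    // Open the microphone: 16 kHz, 16-bit, mono, which is what the model expects.
    snd_pcm_t* mic = nullptr;
    if (snd_pcm_open(&mic, "default", SND_PCM_STREAM_CAPTURE, 0) < 0) return 1;
    snd_pcm_set_params(mic, SND_PCM_FORMAT_S16_LE, SND_PCM_ACCESS_RW_INTERLEAVED,
                       1, 16000, 1, 500000);

    int16_t buffer[1024];
    ps_start_utt(ps);
    for (;;) {
        snd_pcm_sframes_t n = snd_pcm_readi(mic, buffer, 1024);
        if (n < 0) { snd_pcm_recover(mic, n, 0); continue; }
        ps_process_raw(ps, buffer, n, FALSE, FALSE);

        // In kws mode the hypothesis is non-NULL once a keyphrase is spotted.
        if (ps_get_hyp(ps, nullptr) != nullptr) {
            play_alarm();
            ps_end_utt(ps);    // reset and keep listening
            ps_start_utt(ps);
        }
    }
}
```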
If you're curious you can find the code on GitHub here: https://github.com/tdicola/DemoManMonitor
It's still very much in development and not really ready for anyone to consume. I've tried to make the components somewhat loosely coupled, so it wouldn't be difficult to add support for other audio sources, sinks, or even speech recognition engines in the future.
For next steps I plan to get the code working on the Raspberry Pi to sort out any issues or performance problems there as early as possible. Will also order a small thermal printer and other small things to get started on the hardware soon.
Once a complete hardware & software prototype is working I want to come back to improve the speech rec/keyword spotting accuracy. Right now with no special training or adaptation for my voice it can pick up some words very well (like 'bullshit') but others it totally misses (I have yet to see it pick up 'fuck' correctly from my speech).
-
Early Prototype
03/21/2014 at 07:47 • 0 comments
Hacked together a quick and dirty prototype using PocketSphinx and ALSA. You can see a quick video of it here (it goes without saying there will be profanity in the videos I post):
For a first effort I'm pretty happy. There are a ton of options for tuning and training the speech recognition, so hopefully I can increase the accuracy.
While putting together the prototype I hit a few issues and dead ends: the PS3 Eye camera's mic not playing well with PulseAudio, ALSA's Python bindings not working at all for some reason, and GStreamer looking way too complex to be worth using. In the end I'm going to keep it simple and just use C++ with ALSA and PocketSphinx. Hoping to clean up the code into something presentable, put it on GitHub, and keep iterating on it.
-
Promising Lead
03/19/2014 at 08:24 • 0 comments
Found a great working example of keyword spotting with the in-development version of PocketSphinx: http://syl22-00.github.io/pocketsphinx.js/live-demo-kws.html
Some of the words don't work well, but others like 'OK Google' seem to work very well.
This demo is from a JavaScript port of PocketSphinx and is unfortunately limited to searching for only one keyword at a time. However, digging into PocketSphinx's source a bit, it seems the normal C library can be given a file of keywords. More investigation is necessary (unfortunately the keyword spotting code is only in the Subversion trunk and not yet documented), but it's good to see a working demo and know what's possible.
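For reference, here's roughly what that looks like with the C API (a sketch based on poking at the trunk source--since this is undocumented, option names could still change). The '-kws' option points the decoder at a plain text file of keyphrases, one per line.

```cpp
// Sketch of configuring trunk PocketSphinx for multi-keyword spotting.
// Paths are placeholders; keywords.txt would contain one keyphrase per line,
// e.g.:
//   fuck
//   shit
//   damn
#include <pocketsphinx.h>

ps_decoder_t* make_kws_decoder() {
    cmd_ln_t* config = cmd_ln_init(NULL, ps_args(), TRUE,
        "-hmm",  "model/en-us",          // acoustic model directory (placeholder)
        "-dict", "model/cmudict.dict",   // pronunciation dictionary (placeholder)
        "-kws",  "keywords.txt",         // file listing the keyphrases to spot
        NULL);
    return config ? ps_init(config) : NULL;
}
```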
-
Project Start
03/18/2014 at 09:37 • 0 comments
Goals:
- Replicate functionality of the 'verbal morality statute monitor'/swear detector from Demolition Man.
- Detect when a swear word is uttered and sound a warning bell / flash lights / print out violation ticket.
- Replicating the look of the device is not a primary goal. Given the time constraint, my (lack of) knowledge of prop building, and other risks, it's not feasible to faithfully replicate the look of the device.
Current Plan:
- Software:
Use a continuous speech recognition library with keyword spotting to detect swear words. A great summary of options is here: http://raspberrypi.stackexchange.com/questions/10384/speech-processing-on-the-raspberry-pi
I briefly experimented with PocketSphinx and got somewhat unsatisfactory results because it is not optimized for keyword spotting out of the box. The biggest challenge and risk in this project will be getting a satisfactory keyword spotting algorithm to work.
Some things to follow up on here are:
- http://www.quora.com/Speech-Recognition/What-is-the-best-SDK-for-KeyWord-Spotting
- http://sourceforge.net/p/cmusphinx/discussion/sphinx4/thread/69cbc4eb/?limit=25
- Hardware Platform:
The Raspberry Pi and BeagleBone Black are both available and should have the power to do continuous speech recognition (based on googling around for speech recognition projects on each platform). Leaning towards the Model B Pi because it has multiple USB ports and audio out on board.
- Microphone:
The PS3 Eye camera's microphone. In my testing it picks up audio from a distance reasonably well. Getting it to work with Linux is mostly straightforward: http://renatocunha.com/blog/2012/04/playstation-eye-audio-linux/
- Audio Output:
Nothing fancy is needed here--just need to play a few audio samples like the buzzer and warning message. The audio output on the Pi should be sufficient when sent to a small amplified speaker.
- Printer:
Haven't thought much about this yet, but I expect a small receipt/thermal printer will be sufficient for printing violations. More info to check out later: http://learn.adafruit.com/internet-of-things-printer
Next steps:
- Install speech recognition libraries and seriously investigate which of them can do keyword spotting reasonably well.