Something I realized fairly early on while building KaraokeBot was that the "Karaoke Version" of a given song can deviate a lot from the original.
The most common alteration made is to the "key" of the song.
The companies putting out karaoke versions seem to think that lowering a song's pitch will render it more "singable" (by a human).
They might be right, but this meant that my synthesized vocals had to be shifted around to match what was available at my local Karaoke Bar.
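For anyone curious what "shifting around" means in practice, here's a minimal sketch. It assumes the vocal melody is stored as MIDI note numbers (my assumption for this example; the names `transpose` and `note_to_hz` are made up for illustration). Each semitone down lowers a note number by one, and the standard equal-temperament formula turns a note number into a frequency.

```python
def transpose(notes, semitones):
    """Shift every MIDI note number by `semitones`, clamped to the valid 0-127 range."""
    return [max(0, min(127, n + semitones)) for n in notes]


def note_to_hz(note):
    """Equal-temperament frequency of a MIDI note number (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Hypothetical example: a C major triad moved down two semitones.
original = [60, 64, 67]
lowered = transpose(original, -2)
```

So if the karaoke version sits two semitones below the original, every note in the vocal line moves down by two.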
Another common change is that of BPM (beats per minute; the song's tempo).
For example, Daft Punk's "Digital Love" comes in at around 124.662 BPM, but most Google search results (and Karaoke Bars) are happy to round up to 125.
Since my bot's vocals needed to match the corresponding karaoke versions' BPM precisely (lest the two disgracefully desync), I actually ended up making two trips to VENUS Karaoke in the International District: once to rip direct feed audio of a few of their songs (to later analyze in FL Studio and determine their exact BPMs), and then again to record video/audio of KaraokeBot doing its thing.
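To give a feel for why a fraction of a BPM matters, here's a rough sketch of the arithmetic. The function names and the four-minute duration are assumptions I picked for illustration, not measurements from the project. The idea: lay the vocals out beat-for-beat at one tempo, play the backing track at another, and see how far apart they end up.

```python
def seconds_per_beat(bpm):
    """Length of one beat in seconds."""
    return 60.0 / bpm


def drift_after(duration_s, vocal_bpm, backing_bpm):
    """Offset in seconds between vocals and backing track after `duration_s`
    of backing-track playback, if both start together.
    A positive result means the vocals are running late."""
    beats = duration_s / seconds_per_beat(backing_bpm)
    return beats * (seconds_per_beat(vocal_bpm) - seconds_per_beat(backing_bpm))


# Hypothetical case: vocals timed at 124.662 BPM over a 125 BPM backing track,
# checked after four minutes (240 s).
drift = drift_after(240, 124.662, 125.0)
```

Under those assumptions the vocals trail the backing track by roughly 0.65 seconds at the four-minute mark, which is more than a whole beat and very audible.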
In the end, I think it was worth it :-)
(PS: here's a little snippet of KaraokeBot's Digital Love cover)
There are a few things one must take into consideration when building an animatronic karaoke-singing robot.
Perhaps the most important consideration is that of timing.
If your 'bot has sloppy animations, it'll look less lifelike and the illusion will be lost.
Since I was already using a Digital Audio Workstation (DAW) to program speech, I figured I might as well use it to program animations too. A separate, soundless MIDI channel was created for animation, and I was able to slide "hits" (captured as MIDI NOTE_ON messages) and "releases" (captured as MIDI NOTE_OFF messages) around in real time to create believable facial animations.
A few tips for anyone looking to create a MIDI-controlled robot:
Have separate servos to control? Consider writing code that listens to separate MIDI channels (you have 16 of them to work with).
MIDI note messages aren't just on and off; they also carry pitch (the note number) and velocity (the "pressure" or "strength" of a note). You might consider using these to vary, for example, how wide your robot's eyes and mouth open.
With the servo I used, it took about half a second to rotate from the "mouth fully closed" to the "mouth fully open" position. If note offs/ons were placed in too rapid succession, the mouth wouldn't have time to fully open and close, and would just sort of shudder in place. This meant that slower-paced vocals worked better, while faster-paced syllables just had to be cut a little short.
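Here's a minimal sketch of how those three tips might fit together. This is not the code KaraokeBot actually runs; the names (`ServoCommand`, `handle_message`, and friends), the 0 to 90 degree range, and the constant-speed servo model are all assumptions for illustration. The only things taken from the MIDI spec are the status bytes (0x90 for note on, 0x80 for note off, channel in the low four bits) and the convention that a note on with velocity 0 counts as a note off.

```python
from dataclasses import dataclass

NOTE_OFF = 0x80  # MIDI status nibble for "note off"
NOTE_ON = 0x90   # MIDI status nibble for "note on"


@dataclass
class ServoCommand:
    servo: int    # which servo to move; taken from the MIDI channel (0-15)
    angle: float  # target angle in degrees


def velocity_to_angle(velocity, closed_deg=0.0, open_deg=90.0):
    """Map MIDI velocity (0-127) linearly onto an assumed servo range."""
    velocity = max(0, min(127, velocity))
    return closed_deg + (open_deg - closed_deg) * velocity / 127.0


def handle_message(status, note, velocity, closed_deg=0.0):
    """Turn one raw 3-byte MIDI note message into a servo command.
    Returns None for message types this sketch ignores."""
    kind = status & 0xF0     # upper nibble: message type
    channel = status & 0x0F  # lower nibble: channel, one per servo
    if kind == NOTE_ON and velocity > 0:
        return ServoCommand(channel, velocity_to_angle(velocity))
    if kind == NOTE_OFF or (kind == NOTE_ON and velocity == 0):
        return ServoCommand(channel, closed_deg)
    return None


def reachable_fraction(hold_s, full_travel_s=0.5):
    """Fraction of full travel a servo covers in `hold_s` seconds,
    assuming constant speed (a simplification) and the roughly half-second
    full sweep mentioned above."""
    return max(0.0, min(1.0, hold_s / full_travel_s))
```

With something like this, a hard "hit" on channel 2 opens servo 2 wide, a soft one opens it a crack, and `reachable_fraction` gives a quick way to spot notes that are too short for the mouth to get anywhere before the release arrives.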
In my video, I tried to make it look like I printed KaraokeBot's body perfectly on my first try.
...Good joke, huh?
In reality, the shell featured in the video was my third attempt.
The first shell had a face hole that was just slightly too small, and while the second shell seemed like a good fit, it split horizontally when I knocked it off my desk and onto the floor (lesson learned - a little infill can go a long way).
Although it's a wildly specific part that I don't see many others finding useful, here's a download link for the final KaraokeBot shell in .STL format.
I've also included the now-infamous "yellow piece" that broke at the 1 minute 32 second mark in the video.
You might consider printing one as a reminder of the importance of perseverance.
Or maybe it's about the fragility of the human spirit....