In my first project log, I mentioned that the examples provided with the Vosk library were very easy to work with and worked right from the get-go. Using it for this project was a no-brainer. After all, if it's not broken, don't fix it.
To the code provided by the Vosk devs, I've added simple POST request handling along with a PyGame-based display. The "frontend" shows the assistant "thinking", and once the language model has responded to the POST request, it displays the response.
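As a rough illustration of that request flow (not the project's actual code: the endpoint URL, payload keys, and function name below are all made up), it could look something like this:

```python
import requests

# Hypothetical endpoint and payload keys -- placeholders, not the project's real API.
LLM_ENDPOINT = "http://localhost:8000/api/chat"

def ask_language_model(transcript: str, timeout: float = 30.0) -> str:
    """Send the recognized speech to the language model and return its reply."""
    response = requests.post(
        LLM_ENDPOINT,
        json={"prompt": transcript},
        timeout=timeout,
    )
    response.raise_for_status()
    # Assumes the model's reply comes back under a "response" key.
    return response.json().get("response", "")
```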
For the time being, and as a proof of concept, the assistant has only two states: "thinkingFace" and "responseFace". The "thinkingFace" moves the eyes side to side, which, at least in my mind, mimics someone trying to figure things out, while the "responseFace" displays the text the language model responded with.
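To give an idea of what those two states boil down to, here is a minimal PyGame sketch. The window size, colours, and geometry are made up (the real layout differs), and a key press stands in for the POST request returning:

```python
import pygame

pygame.init()
screen = pygame.display.set_mode((480, 320))
font = pygame.font.Font(None, 36)
clock = pygame.time.Clock()

state = "thinkingFace"          # switches to "responseFace" once the reply arrives
response_text = "Hello there!"  # placeholder reply from the language model
eye_offset, direction = 0, 2

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        elif event.type == pygame.KEYDOWN:
            state = "responseFace"   # stand-in for "the POST request returned"

    screen.fill((0, 0, 0))
    if state == "thinkingFace":
        # Eyes sliding side to side to suggest "figuring things out".
        eye_offset += direction
        if abs(eye_offset) > 40:
            direction = -direction
        for x in (160, 320):
            pygame.draw.circle(screen, (255, 255, 255), (x + eye_offset, 140), 20)
    else:
        # responseFace: simply render the model's reply as text.
        screen.blit(font.render(response_text, True, (255, 255, 255)), (40, 150))

    pygame.display.flip()
    clock.tick(30)

pygame.quit()
```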
One more kink that needs ironing out is flipping the display orientation by 180 degrees. As it turns out, it's not so simple: none of the guides I found online worked with my display.
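I haven't solved this yet, but one possible software-side fallback, sketched below purely as an idea rather than what the project actually does, would be to draw everything onto an off-screen surface and rotate it 180 degrees with pygame.transform before blitting it to the real display:

```python
import pygame

# Not the fix used in the project (the orientation issue is still open) -- just one
# possible software-side workaround: draw to an off-screen surface, rotate it
# 180 degrees, and blit the result to the actual display.
pygame.init()
screen = pygame.display.set_mode((480, 320))
canvas = pygame.Surface(screen.get_size())
clock = pygame.time.Clock()

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    canvas.fill((0, 0, 0))
    pygame.draw.circle(canvas, (255, 255, 255), (160, 140), 20)  # draw as usual

    flipped = pygame.transform.rotate(canvas, 180)  # same size, upside down
    screen.blit(flipped, (0, 0))
    pygame.display.flip()
    clock.tick(30)

pygame.quit()
```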
Project files are available in their respective section if you want to give it a try.