I trained a very small CNN (30k parameters, 4 layers) on the dataset and achieved higher accuracy (96%). It also works better in the wild, even without the Haar cascade preprocessing.
Through TFLite for Microcontrollers, I was able to get it to run on-device. With 8-bit quantization, it runs at ~18 fps, which is pretty good.
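For readers unfamiliar with 8-bit quantization, here is a minimal sketch of the affine scheme that int8 TFLite models are based on, where a real value is approximated as scale * (q - zero_point). The function names and the example ranges are illustrative assumptions, not taken from this project; the actual converter picks per-tensor (or per-channel) parameters from calibration data.

```python
def quant_params(rmin, rmax, qmin=-128, qmax=127):
    """Pick a scale and zero point that map [rmin, rmax] onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        # Degenerate range (all zeros): any scale works, use 1.0.
        return 1.0, 0
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a real value to the nearest int8 code, clamping out-of-range input."""
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Recover the approximate real value from an int8 code."""
    return scale * (q - zero_point)


if __name__ == "__main__":
    # Hypothetical activation range, chosen only for illustration.
    scale, zero_point = quant_params(-1.0, 3.0)
    for x in (-1.0, 0.0, 0.5, 3.0):
        q = quantize(x, scale, zero_point)
        print(x, q, dequantize(q, scale, zero_point))
```

Inside the representable range the round-trip error is at most half a quantization step (scale / 2). A small network can tolerate that, and the integer arithmetic is what makes inference fast on a microcontroller.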
The whole train-deploy pipeline is a bit of a mess though, since I used PyTorch:
PyTorch (Lightning) -> ONNX -> TF -> TFLite -> TFLite (quantized).
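A rough sketch of what such a conversion chain can look like is shown below. It is not the exact script used here: it assumes the onnx-tf package for the ONNX -> TF step, the file names and the helper names artifact_paths and convert are made up for illustration, and it folds the last two arrows into a single converter call that emits the int8 model directly.

```python
from pathlib import Path


def artifact_paths(workdir):
    """Return one output path per stage (file names are illustrative)."""
    root = Path(workdir)
    return {
        "onnx": root / "model.onnx",
        "saved_model": root / "saved_model",
        "tflite": root / "model_int8.tflite",
    }


def convert(model, example_input, representative_batches, workdir):
    """Export a PyTorch module to a fully int8-quantized TFLite file."""
    # Heavy imports stay local so this file can be imported without them.
    import onnx
    import tensorflow as tf
    import torch
    from onnx_tf.backend import prepare  # assumption: onnx-tf does ONNX -> TF

    paths = artifact_paths(workdir)
    paths["onnx"].parent.mkdir(parents=True, exist_ok=True)

    # PyTorch (Lightning modules are ordinary nn.Modules) -> ONNX
    model.eval()
    torch.onnx.export(model, example_input, str(paths["onnx"]))

    # ONNX -> TensorFlow SavedModel
    tf_rep = prepare(onnx.load(str(paths["onnx"])))
    tf_rep.export_graph(str(paths["saved_model"]))

    # SavedModel -> TFLite with full-integer quantization
    converter = tf.lite.TFLiteConverter.from_saved_model(str(paths["saved_model"]))
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # Calibration data: float32 arrays shaped like the model input.
    converter.representative_dataset = lambda: ([b] for b in representative_batches)
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    paths["tflite"].write_bytes(converter.convert())
    return paths["tflite"]
```

The representative batches only need to be a few hundred typical inputs; the converter uses them to choose activation ranges. Note that a model exported this way may keep PyTorch's channels-first layout, so the calibration arrays have to match whatever input shape the SavedModel ends up with.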
Will add another demo soon.