The next step was training it on animorphic lion video. The tool for scaling the training images to 640x640 & labeling them is truckcam/label.py. The tool for running tflite_model_maker is truckcam/model_maker2.py.
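Scaling arbitrary frames to the 640x640 input usually means a letterbox resize: shrink to fit, then pad to a square. A minimal sketch of just the geometry (whether label.py letterboxes or stretches is an assumption, & the function name is made up):

```python
def letterbox_640(w, h, size=640):
    """Scale a w x h frame to fit inside a size x size square,
    preserving aspect ratio, & return the scale factor plus the
    padding needed to center the result."""
    scale = min(size / w, size / h)
    new_w = round(w * scale)
    new_h = round(h * scale)
    pad_x = (size - new_w) // 2
    pad_y = (size - new_h) // 2
    return scale, pad_x, pad_y

# A 1280x720 frame scales by 0.5 to 640x360, with 140 pixels of
# padding on top & bottom.
print(letterbox_640(1280, 720))
```

The same scale & padding numbers have to be applied to the label boxes, or the training annotations drift off the objects.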
Efficientdet_lite4 is a lot slower to train than efficientdet_lite0. On the lion kingdom's GTX970 3GB, 300 epochs with 1000 images is a 60 hour job. 100 epochs is a 20 hour job.
There's 1 hit describing how to pause & resume the training by accessing low level functions. The idea is to save a checkpoint for each epoch & load the last checkpoint during startup. It also shows how to save a .tflite file in FP16 format.
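The resume bookkeeping is the part that's independent of tflite_model_maker: scan the checkpoint directory for the highest epoch number, restore that checkpoint, & train only the remaining epochs. A stdlib-only sketch of that piece (the ckpt-N naming & the restore call into the model's underlying Keras object are assumptions, not the hit's exact code):

```python
import os
import re

def find_latest_checkpoint(ckpt_dir):
    """Return (prefix, epoch) for the highest-numbered ckpt-N
    checkpoint in ckpt_dir, or (None, 0) if training hasn't started.
    TF checkpoints are loaded by prefix, so ckpt-12.index &
    ckpt-12.data-* both count as epoch 12."""
    best_epoch = 0
    pattern = re.compile(r"ckpt-(\d+)")
    for name in os.listdir(ckpt_dir):
        m = pattern.match(name)
        if m and int(m.group(1)) > best_epoch:
            best_epoch = int(m.group(1))
    if best_epoch == 0:
        return None, 0
    return os.path.join(ckpt_dir, "ckpt-%d" % best_epoch), best_epoch

# Usage sketch (assumption -- the real loading goes through the
# low level Keras model inside the tflite_model_maker object):
#   prefix, done = find_latest_checkpoint("checkpoints")
#   if prefix:
#       model.model.load_weights(prefix)
#   remaining_epochs = TOTAL_EPOCHS - done
```

With 12 minutes per epoch on the GTX970, being able to kill & resume a 60 hour job is the difference between usable & not.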
Based on the training speed, efficientdet-lite may be the only model lions can afford. Having said that, the current upgradable GPU arrived in July 2017, when driver support for the Quadro FX 4400 ended. It was around $130.
Anything sufficient for training a bigger model would be at least $500, & it would become the only GPU. The GTX970 would be retired & rep counting would go back to a wireless system, since the Jetson Nano is not useful for rep counting.
3 days later, the 300 epochs with 1000 training images finished.
Converting efficientlion-lite4.tflite to TensorRT
The trick with this is that inspector.py takes the last checkpoint files rather than the generated .tflite file. Sadly, inspector.py failed: it's written for an AutoML derivative of the efficientdet model.
Searching for .tflite conversion came up with 1 hit:
https://github.com/zhenhuaw-me/tflite2onnx
It doesn't support FP16 input. Most animals are converting INT8 models to TensorRT.
Another hit worked. This one went directly from .tflite to .onnx:
https://github.com/onnx/tensorflow-onnx
OPENBLAS_CORETYPE=CORTEXA57 python3 -m tf2onnx.convert --opset 16 --tflite efficientlion-lite4.tflite --output efficientlion-lite4.onnx
/usr/src/tensorrt/bin/trtexec --workspace=1024 --onnx=efficientlion-lite4.onnx --saveEngine=efficientlion-lite4.engine
This failed with: Invalid Node - Reshape_2 Attribute not found: allowzero
Another hit said to try different opsets. Opsets 12-13 threw:
This version of TensorRT only supports input K as an initializer.
Another hit said to fold the constants:
OPENBLAS_CORETYPE=CORTEXA57 polygraphy surgeon sanitize efficientlion-lite4.onnx --fold-constants --output efficientlion-lite4.onnx2
That gave the same error.
Opsets 14-18 gave Invalid Node - Reshape_2 Attribute not found: allowzero
A new onnx graphsurgeon script was made in the truckcam directory.
OPENBLAS_CORETYPE=CORTEXA57 python3 fixonnx.py efficientlion-lite4.onnx efficientlion-lite4.onnx2
/usr/src/tensorrt/bin/trtexec --workspace=1024 --onnx=efficientlion-lite4.onnx2 --saveEngine=efficientlion-lite4.engine
Making onnx graphsurgeon insert the missing allowzero attribute made it fail with
This version of TensorRT only supports input K as an initializer.
So opset 13 wasn't already inserting allowzero; the attribute only exists in opset 14 & up, so opset 13 just got past the Reshape check & failed later at the same place. Sadly, the input K bug afflicts many animals & seems insurmountable. It comes from the TopK operator: TopK is supposed to take a dynamic K argument, but TensorRT only implemented the K argument as a constant.