The next step was training it on animorphic lion video. The tool for scaling the training images to 640x640 & labeling them is truckcam/label.py. The tool for running tflite_model_maker is truckcam/model_maker2.py.
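Scaling arbitrary frames to the 640x640 input usually means a letterbox resize: shrink to fit, then pad to a square. A minimal sketch of just the geometry (whether label.py letterboxes or stretches is an assumption, & the function name is made up):

```python
def letterbox_640(w, h, size=640):
    """Scale a w x h frame to fit inside a size x size square,
    preserving aspect ratio, & return the scale factor plus the
    padding needed to center the result."""
    scale = min(size / w, size / h)
    new_w = round(w * scale)
    new_h = round(h * scale)
    pad_x = (size - new_w) // 2
    pad_y = (size - new_h) // 2
    return scale, pad_x, pad_y

# A 1280x720 frame scales by 0.5 to 640x360, with 140 pixels of
# padding on top & bottom.
print(letterbox_640(1280, 720))
```

The same scale & padding numbers have to be applied to the label boxes, or the training annotations drift off the objects.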
Efficientdet_lite4 is a lot slower to train than efficientdet_lite0. On the lion kingdom's GTX970 3GB, 300 epochs with 1000 images is a 60 hour job. 100 epochs is a 20 hour job.
There's 1 hit describing how to pause & resume the training by accessing low level functions. The idea is to save a checkpoint for each epoch & load the last checkpoint during startup. It also shows how to save a .tflite file in FP16 format.
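The resume bookkeeping is the part that's independent of tflite_model_maker: scan the checkpoint directory for the highest epoch number, restore that checkpoint, & train only the remaining epochs. A stdlib-only sketch of that piece (the ckpt-N naming & the restore call into the model's underlying Keras object are assumptions, not the hit's exact code):

```python
import os
import re

def find_latest_checkpoint(ckpt_dir):
    """Return (prefix, epoch) for the highest-numbered ckpt-N
    checkpoint in ckpt_dir, or (None, 0) if training hasn't started.
    TF checkpoints are loaded by prefix, so ckpt-12.index &
    ckpt-12.data-* both count as epoch 12."""
    best_epoch = 0
    pattern = re.compile(r"ckpt-(\d+)")
    for name in os.listdir(ckpt_dir):
        m = pattern.match(name)
        if m and int(m.group(1)) > best_epoch:
            best_epoch = int(m.group(1))
    if best_epoch == 0:
        return None, 0
    return os.path.join(ckpt_dir, "ckpt-%d" % best_epoch), best_epoch

# Usage sketch (assumption -- the real loading goes through the
# low level Keras model inside the tflite_model_maker object):
#   prefix, done = find_latest_checkpoint("checkpoints")
#   if prefix:
#       model.model.load_weights(prefix)
#   remaining_epochs = TOTAL_EPOCHS - done
```

With 12 minutes per epoch on the GTX970, being able to kill & resume a 60 hour job is the difference between usable & not.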
Based on the training speed, efficientdet-lite may be the only model lions can afford. Having said that, the current upgradable GPU arrived in July 2017, when driver support for the Quadro FX 4400 ended. It was around $130.
Anything sufficient for training a bigger model would be at least $500, & it would become the only GPU. The GTX970 would be retired & rep counting would go back to a wireless system, since the Jetson Nano is not useful for rep counting.
3 days later, the 300 epochs with 1000 training images finished.
Converting efficientlion-lite4.tflite to TensorRT
The trick with this is that inspector.py takes the last checkpoint files rather than the generated .tflite file. Sadly, inspector.py failed: it's written for an AutoML derivative of the efficientdet model.
Searching for .tflite conversion came up with 1 hit:
https://github.com/zhenhuaw-me/tflite2onnx
It doesn't support FP16 input. Most animals are converting INT8 models to TensorRT.
Another hit worked. This one went directly from .tflite to .onnx:
https://github.com/onnx/tensorflow-onnx
OPENBLAS_CORETYPE=CORTEXA57 python3 -m tf2onnx.convert --opset 16 --tflite efficientlion-lite4.tflite --output efficientlion-lite4.onnx
/usr/src/tensorrt/bin/trtexec --workspace=1024 --onnx=efficientlion-lite4.onnx --saveEngine=efficientlion-lite4.engine
This failed with: Invalid Node - Reshape_2 Attribute not found: allowzero
Another hit said to try different opsets. Opsets 12-13 threw:
This version of TensorRT only supports input K as an initializer.
Another hit said to fold the constants:
OPENBLAS_CORETYPE=CORTEXA57 polygraphy surgeon sanitize efficientlion-lite4.onnx --fold-constants --output efficientlion-lite4.onnx2
That gave the same error.
Opsets 14-18 gave Invalid Node - Reshape_2 Attribute not found: allowzero
A new onnx graphsurgeon script was made in the truckcam directory.
OPENBLAS_CORETYPE=CORTEXA57 python3 fixonnx.py efficientlion-lite4.onnx efficientlion-lite4.onnx2
/usr/src/tensorrt/bin/trtexec --workspace=1024 --onnx=efficientlion-lite4.onnx2 --saveEngine=efficientlion-lite4.engine
Making onnx graphsurgeon insert the missing allowzero attribute made it fail with
This version of TensorRT only supports input K as an initializer.
So opset 13 wasn't already inserting allowzero; the attribute only exists in opset 14 & up, so opset 13 just got past the Reshape check & failed later at the same place. Sadly, the input K bug afflicts many animals & seems insurmountable. It comes from the TopK operator: TopK is supposed to take a dynamic K argument, but TensorRT only implemented the K argument as a constant.