Log #17: Latency

A project log for lalelu_drums

Gesture controlled percussion based on markerless human-pose estimation with an AI network from a live video.

Lars Friedrich, 04/07/2025 at 19:52

In this log entry, I present a measurement of the latency of the lalelu_drums system. The measured latency covers the full pipeline: the optical input to the camera, the AI pose estimation with the MoveNet network, the sending of a MIDI message via the serial port of the backend, and the translation of this MIDI message into an audio signal by a Roland JV-1080 sound generator.

The measurement scheme is as follows:
A 1 Hz square wave signal is used to drive two white LEDs in an alternating fashion. Each LED is used to illuminate a printed picture of a person. The two pictures are presented to the camera of lalelu_drums and the pose estimation system is used to track the coordinate of the nose of the person. The system is programmed to output a MIDI "NOTE ON" message whenever the nose moves from the left half of the camera frame to the right half. The MIDI message is transferred to a sound generator. An oscilloscope is used to determine the delay between the switching of the LEDs and the appearance of the audio signal at the output of the sound generator.
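
The trigger rule described above could be sketched roughly as follows. This is a minimal illustration and not the actual lalelu_drums code: the class name `CrossingTrigger`, the normalized x coordinate, the note number 38 and the velocity 100 are all assumptions made for the example. Only the layout of the MIDI "NOTE ON" message (status byte `0x90` plus channel, followed by note and velocity) comes from the MIDI standard.

```python
from typing import Optional

NOTE_ON = 0x90  # MIDI status byte for NOTE ON (lower 4 bits select the channel)


def note_on(note: int, velocity: int = 100, channel: int = 0) -> bytes:
    """Build a 3-byte MIDI NOTE ON message."""
    return bytes([NOTE_ON | (channel & 0x0F), note & 0x7F, velocity & 0x7F])


class CrossingTrigger:
    """Hypothetical helper: fires when the nose moves from the left half
    of the frame to the right half."""

    def __init__(self, frame_width: float = 1.0, note: int = 38):
        self.mid = frame_width / 2.0           # boundary between the two halves
        self.note = note                       # assumed note number for the example
        self.was_left: Optional[bool] = None   # unknown before the first frame

    def update(self, nose_x: float) -> Optional[bytes]:
        """Feed one nose x coordinate per frame; returns a MIDI message
        on a left-to-right crossing, otherwise None."""
        is_left = nose_x < self.mid
        fire = self.was_left is True and not is_left
        self.was_left = is_left
        return note_on(self.note) if fire else None
```

In such a sketch, `update()` would be called once per pose estimate, and any returned bytes would be written to the serial port that feeds the sound generator. A crossing from right to left produces no message, which matches the one message per LED cycle that the measurement relies on.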

The video shows the course of the measurement. It can be seen that the average latency is approximately 20 ms. The variation of ±5 ms in the measured latency is expected given the camera's exposure time of 10 ms.
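
A spread of about 5 ms either side of the mean can be reproduced with a toy timing model. The sketch below rests on assumptions not stated in the log: exposures are taken back to back, the LED switch falls at an arbitrary point within a 10 ms exposure window, the change is picked up at the end of that window, and everything after that adds a fixed delay. The value `FIXED_LATENCY_MS = 15.0` is hypothetical, chosen only so that the mean matches the measured 20 ms.

```python
EXPOSURE_MS = 10.0        # exposure time stated in the log
FIXED_LATENCY_MS = 15.0   # hypothetical constant delay of the rest of the pipeline


def latency_ms(switch_offset_ms: float) -> float:
    """Latency for an LED switch that happens switch_offset_ms after
    the start of an exposure window (toy model, see assumptions above)."""
    wait_for_exposure_end = EXPOSURE_MS - switch_offset_ms
    return wait_for_exposure_end + FIXED_LATENCY_MS


# Sample the switch time evenly across one exposure window (bin midpoints).
offsets = [(i + 0.5) * 0.5 for i in range(20)]
latencies = [latency_ms(o) for o in offsets]

mean_latency = sum(latencies) / len(latencies)
spread = (min(latencies), max(latencies))
print(f"mean {mean_latency:.2f} ms, range {spread[0]:.2f} to {spread[1]:.2f} ms")
```

Under these assumptions the latency is spread evenly over a window as wide as the exposure time, so a 10 ms exposure corresponds to roughly ±5 ms around the mean.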

This measurement does not cover the application of puredata to generate the audio output (see Log #15: puredata). I plan to investigate the latency of the puredata setup in a later log entry.

Added 2025-06-29: Here is a video of the theater setup:
