• Bring-up

    Ayu · 11 hours ago

    2025-10-24

    A prototype device built on a small breadboard, with a Raspberry Pi Pico development board, a round display, and an audio amplifier module connected to a small speaker.

    One pin apparently had a manufacturing defect (?) that left it open right at the IC package, so pins had to be manually shorted and remapped in an odd way in firmware. Once that was sorted out, driving the display over DMA'ed SPI and the audio amplifier over I²S (via PIO) was mostly smooth sailing. The core problem, as we predicted, lies in efficient video decoding and, in turn, the choice of video codec.

    In the following steps, we will use the animation video for a well-known track, Umiyuri Kaiteitan (ウミユリ海底譚, Tale of the Deep-sea Lily; composed by n-buna, video by Awashima). This animation contains a lot of moving, blurred backgrounds and objects, which makes it an ideal sample for quickly profiling codec approaches. (Anecdote: on streaming websites where weekly Vocaloid compilations are released, excerpts from this animation are often taken by the audience as an indicator of video quality.) We preprocess it by scaling down to 240×240 and masking content outside the central circular region (the display viewport). Frame rate is kept at 24 fps.
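    As an illustration of the masking step only, here is a minimal pure-Python sketch. It assumes a frame is a list of rows of `(r, g, b)` tuples; the function names and the black fill colour are my own choices for illustration, not taken from the actual preprocessing pipeline. Scaling is left to whatever image tool is at hand.

```python
def circular_mask(size):
    """Return a size x size grid of booleans: True inside the inscribed circle."""
    c = (size - 1) / 2        # centre of the pixel grid
    r2 = (size / 2) ** 2      # squared radius of the inscribed circle
    return [[(x - c) ** 2 + (y - c) ** 2 <= r2 for x in range(size)]
            for y in range(size)]


def apply_mask(frame, mask, fill=(0, 0, 0)):
    """Replace every pixel outside the mask with a constant fill colour."""
    return [[px if keep else fill for px, keep in zip(row, mask_row)]
            for row, mask_row in zip(frame, mask)]


SIZE = 240
mask = circular_mask(SIZE)

# Hypothetical all-white test frame standing in for a decoded video frame.
frame = [[(255, 255, 255)] * SIZE for _ in range(SIZE)]
masked = apply_mask(frame, mask)

kept = sum(sum(row) for row in mask)
print(f"{kept / SIZE ** 2:.1%} of pixels fall inside the viewport")
```

    Roughly π/4 of the square survives the mask. The corners become a constant colour, which a run-length-aware codec can presumably store very cheaply.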

    The first experiment is with the QOI format. QOI is a very simple lossless image codec with a decent compression ratio, comparable to that of PNG. Applying it to our video frame by frame, we get a lossless video encoded at ~15 Mbps. Downscaling further by half (to 120×120) yields a much more acceptable 4 to 5 Mbps:

    Plot of bitrate fluctuations over the entire video's duration. Peaks at 6 Mbps and dips at 0.3 Mbps, averaging to around 4 Mbps.
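    To put these figures in context, the uncompressed bitrate can be worked out from the frame size, assuming 24-bit RGB and 1 Mbps = 10⁶ bit/s (both assumptions of this sketch, as is the helper name):

```python
def raw_bitrate_mbps(width, height, fps, bits_per_pixel=24):
    """Uncompressed video bitrate in megabits per second (1 Mbps = 10**6 bit/s)."""
    return width * height * bits_per_pixel * fps / 1e6


full = raw_bitrate_mbps(240, 240, 24)  # full-scale frames
half = raw_bitrate_mbps(120, 120, 24)  # half-scale frames

print(f"raw 240x240 @ 24 fps: {full:.1f} Mbps")
print(f"raw 120x120 @ 24 fps: {half:.1f} Mbps")
print(f"QOI at ~15 Mbps is about {15 / full:.0%} of raw")
print(f"QOI at ~4 Mbps is about {4 / half:.0%} of raw")
```

    So frame-by-frame QOI lands at a little under half of the raw bitrate at either scale (about 45% and 48% with the rounded figures above).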

    At the original scale (240×240), decoding is computationally heavy and only manages 12 fps, but at half scale it is fast enough to run comfortably within the RP2040's nominal 133 MHz system clock. Combining that with QOA-encoded audio, decoded with my earlier implementation uQOA, we get a first working prototype. Here is a recording of the result:

    We must admit that this is less than ideal. The downscaled video is blurry and still takes a lot of storage (at the ~4 Mbps average, a two-minute video takes roughly 60 MB), which adds cost and complexity in storage and lengthens the wait during user uploads.

    A straightforward idea is to optimize or modify QOI. QOI encodes each RGB pixel using one of several compact opcodes (a small difference from the previous pixel, an index into a table of recently seen pixels, or a full literal), with a dedicated opcode for runs of identical pixels. Profiling shows that much of the time is spent in the 64-element hash table that serves as the dictionary of recently seen pixels, but this is largely a tradeoff between space and time (and we aim to optimize both). Modifying QOI to work in YUV420 would require more extensive work, and the outcome (performance in time and space) is not easy to predict.
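    For reference, the chunk-decoding loop described in the public QOI specification can be sketched as follows. This is a simplified illustration, not the firmware's decoder: it handles RGB only (alpha fixed at 255), skips the file header and end marker, and the function names are mine.

```python
QOI_OP_RGB = 0xFE   # full literal: 3 bytes of R, G, B follow
QOI_OP_RGBA = 0xFF  # literal with alpha; not handled in this RGB-only sketch


def qoi_hash(r, g, b, a=255):
    """Index position in the 64-entry table of recently seen pixels."""
    return (r * 3 + g * 5 + b * 7 + a * 11) % 64


def decode_chunks(data, n_pixels):
    """Decode a headerless QOI chunk stream into a list of (r, g, b) tuples."""
    index = [(0, 0, 0)] * 64
    r = g = b = 0               # the previous pixel starts out as black
    out = []
    i = 0
    while len(out) < n_pixels:
        byte = data[i]
        i += 1
        if byte == QOI_OP_RGB:
            r, g, b = data[i], data[i + 1], data[i + 2]
            i += 3
        elif byte == QOI_OP_RGBA:
            raise ValueError("RGBA chunks are not supported in this sketch")
        else:
            tag, low = byte >> 6, byte & 0x3F
            if tag == 0:        # INDEX: reuse a recently seen pixel
                r, g, b = index[low]
            elif tag == 1:      # DIFF: 2-bit deltas, each biased by 2
                r = (r + ((low >> 4) & 3) - 2) & 0xFF
                g = (g + ((low >> 2) & 3) - 2) & 0xFF
                b = (b + (low & 3) - 2) & 0xFF
            elif tag == 2:      # LUMA: 6-bit green delta, R/B relative to it
                second = data[i]
                i += 1
                dg = low - 32
                r = (r + dg + (second >> 4) - 8) & 0xFF
                g = (g + dg) & 0xFF
                b = (b + dg + (second & 0x0F) - 8) & 0xFF
            else:               # RUN: repeat previous pixel, length biased by -1
                out.extend([(r, g, b)] * (low + 1))
                continue
        index[qoi_hash(r, g, b)] = (r, g, b)  # table write on every non-run pixel
        out.append((r, g, b))
    return out[:n_pixels]


# Hand-built stream: literal, run of 3, DIFF, INDEX, LUMA.
stream = bytes([0xFE, 10, 20, 30, 0xC2, 0x79, 0x09, 0xA5, 0x6B])
pixels = decode_chunks(stream, 7)
print(pixels)
```

    In this sketch every pixel outside a run costs one hash computation and one table write, which is at least consistent with the hash table showing up prominently in profiles.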

    A low-hanging-fruit alternative is MJPEG, which achieves 1 to 2 Mbps and, as a rough estimate for now, should be on par with QOI in decoding speed (while also being more flexible and tunable). But if we are already decoding JPEG, why not go for MPEG? Here again, I will be retracing a trodden path.

  • Inception

    Ayu · 11/27/2025 at 09:16

    This started as a birthday gift for a close friend. My envisioned outcome would be a circular little trinket that could play video — similar to a button badge or a bag charm — self-contained, battery-powered, and rechargeable over USB. It should at least support video lasting a few minutes (enough for a music video or a short animation), ideally uploadable through USB at a reasonable speed.

    It seems that someone must have done this before. Indeed, this has been implemented multiple times with ESP32-series microcontrollers (including a kit on Adafruit) as well as the more lightweight RP2040 (Ben's 2023 Supercon badge hack and a follow-up revision). However, the ESP32 does not excel at power-efficient operation under sustained load, while Ben's MPEG-1 approach had to make compromises in appearance (either go greyscale or use smaller screens). RP2040's official Popcorn demo plays QVGA smoothly, but compression is rudimentary at ~20 Mbps (~40% compression ratio compared to raw 24-bit RGB). Similar commercial products are listed online with a decent battery life of 10+ hours, but they only support seconds-long animations and are priced at CNY 100 (USD 14) or more.

    None of the existing solutions quite matched what I wanted: several minutes of smooth, full-colour video, running for hours on a small battery. How greedy I am >_< And I am atoning for it by suffering my own prophecy, confining myself to the workbench trying to coalesce with the almighty numen of computation, enmeshed in endless rises and falls of the aetheric force...


    After another round of searching for microcontrollers, I decided to retrace the path of the RP2040. A fast system clock combined with its versatile PIO block makes it a perfect fit for smooth video playback, standing out in its class for complexity, power, and price.

    I already have RP2040 development boards and a spare 1.28" 240×240 display at hand. Audio is less of a concern; an I²S-interfaced MAX98357 module covers all needs.

    A round LCD display in a little plastic box.

    Given practice from previous projects, this setup is well within my comfort zone. Still, unknowns remain: how smooth can playback get? The only way to find out is to forge a manifestation.