How to generate a pizza image

To generate an image with the correct number of pizza slices, a two-stage (or "twice-baked") workflow is used.

Stage 1: An image is generated from an artificially generated segmentation map, using a relatively strong ControlNet guide strength. The result tends to be correctly positioned but less realistic.

Stage 2: A depth map of the stage 1 image and the same segmentation map are used to guide image generation, but with a relatively weak ControlNet guide strength. This allows us to obtain a correctly positioned and highly realistic image.
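As a rough illustration of what an "artificially generated segmentation map" could look like, here is a minimal sketch in plain Python that labels each pixel of a square grid with the slice it belongs to. The function name `slice_segmentation_map` and the `radius_frac` parameter are hypothetical and not taken from the project; the project's actual map format and colors are not described here, so this only shows the geometry.

```python
import math


def slice_segmentation_map(size, n_slices, radius_frac=0.45):
    """Return a size x size grid of integer labels.

    0 marks the background; 1..n_slices mark the pizza slices.
    Hypothetical helper for illustration, not the project's code.
    """
    if n_slices < 1:
        raise ValueError("n_slices must be at least 1")
    center = (size - 1) / 2.0
    radius = size * radius_frac
    grid = []
    for y in range(size):
        row = []
        for x in range(size):
            dx, dy = x - center, y - center
            if math.hypot(dx, dy) > radius:
                row.append(0)  # outside the pizza disc
                continue
            # Angle of the pixel around the center, wrapped into [0, 2*pi].
            angle = math.atan2(dy, dx) % (2 * math.pi)
            index = int(angle / (2 * math.pi) * n_slices)
            # Clamp guards against angle rounding to exactly 2*pi.
            row.append(1 + min(index, n_slices - 1))
        grid.append(row)
    return grid


if __name__ == "__main__":
    grid = slice_segmentation_map(64, 6)
    labels = sorted({label for row in grid for label in row})
    print(labels)
```

To use a map like this as a ControlNet input, the integer labels would still have to be converted to whatever color image the chosen segmentation ControlNet expects; that mapping is left out because it depends on the model.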

A similar approach can be used to generate images of other "circular objects." An LLM generates a prompt for a Stable Diffusion model. This prompt, along with a segmentation map, is then used as input to generate the image.


On the hardware side, it's just a kiosk web app running on NixOS on a Raspberry Pi Zero 2 W.

Running `cmatrix` with the `foot` terminal emulator.

Hi Hackaday!

TODO

  • Create a case/enclosure for the Raspberry Pi.
  • Generate more appetizing (or at least not-weird) pizza images (using Flux or SDXL).