This platform explores the Espressif ESP32 architecture for real-time audio processing, applied to sound synthesis. The hardware design allows real-time, low-latency audio DSP algorithms to be implemented on the ESP32. We created this project to benchmark whether the ESP32 is a suitable platform for this application, and we are surprised at what the ESP32 is capable of.
Check out our sample streaming application firmware "CTAG Strämpler", which plays back and modifies sounds from SD card in real time, just as one would with a sampling synthesizer.
We also include an internet-of-things aspect: Strämpler can access the freesound.org sound database through a REST API. One can download sounds from freesound, then play and tweak them.
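To illustrate what a request to freesound might look like, here is a minimal sketch in portable C that builds a text-search URL. It is not taken from the Strämpler firmware. The endpoint path `/apiv2/search/text/` and the `query` and `token` parameters are assumptions based on Freesound's public APIv2 and should be checked against the current API documentation. The helper names `url_encode` and `freesound_search_url` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Percent-encode src into dst. Returns the encoded length,
 * or -1 if dst is too small. Unreserved characters pass through. */
static int url_encode(const char *src, char *dst, size_t dst_size)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    if (dst_size == 0) return -1;
    for (; *src; ++src) {
        unsigned char c = (unsigned char)*src;
        if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            if (n + 1 >= dst_size) return -1;
            dst[n++] = (char)c;
        } else {
            if (n + 3 >= dst_size) return -1;
            dst[n++] = '%';
            dst[n++] = hex[c >> 4];
            dst[n++] = hex[c & 0x0F];
        }
    }
    dst[n] = '\0';
    return (int)n;
}

/* Build a text-search URL. The endpoint and parameter names are
 * assumptions (see the paragraph above), not the firmware's code.
 * Returns the URL length, or -1 if a buffer is too small. */
static int freesound_search_url(const char *query, const char *token,
                                char *out, size_t out_size)
{
    char q[128];
    int n;

    assert(query != NULL && token != NULL && out != NULL);
    if (url_encode(query, q, sizeof q) < 0) return -1;
    n = snprintf(out, out_size,
                 "https://freesound.org/apiv2/search/text/?query=%s&token=%s",
                 q, token);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}
```

On the ESP32, a URL like this would be handed to whatever HTTP client the firmware uses. The JSON response would then be parsed for the sound to download to the SD card.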
The ESP32 is maxed out: one core handles the UI, networking and buffer reloads from the SD card; the other core runs the real-time audio thread using the 32-bit FPU; and the ultra-low-power co-processor configures the external codec (WM8731) and samples control and modulation data from the external 12-bit ADC (MCP3208). We use the ESP32 WROVER module with 16MB flash and 4/8MB PSRAM (e.g. for an audio delay effect with 1.5ms stereo and ping-pong delay).
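As an illustration of the control-data path, the following C sketch shows how a single-ended MCP3208 conversion is commonly framed as a 3-byte SPI transfer and how the 12-bit result is extracted. This is not the project's co-processor code, which is not shown here. The function names are hypothetical, and the scaling to the range 0..1 is an assumption about how control values might be consumed.

```c
#include <assert.h>
#include <stdint.h>

/* Fill tx[] with the command for a single-ended conversion on
 * channel 0..7. Bit layout of the 3-byte frame:
 *   tx[0]: 0 0 0 0 0 START SGL D2
 *   tx[1]: D1 D0 x x x x x x
 *   tx[2]: don't care (clocks out the low result byte) */
static void mcp3208_build_cmd(uint8_t channel, uint8_t tx[3])
{
    assert(channel < 8);
    tx[0] = (uint8_t)(0x06 | (channel >> 2));  /* start bit, single-ended, D2 */
    tx[1] = (uint8_t)((channel & 0x03) << 6);  /* D1 and D0 in the top bits */
    tx[2] = 0x00;
}

/* Extract the 12-bit sample from the bytes received during the same
 * transfer: the low nibble of rx[1] holds B11..B8, rx[2] holds B7..B0. */
static uint16_t mcp3208_parse(const uint8_t rx[3])
{
    return (uint16_t)(((rx[1] & 0x0F) << 8) | rx[2]);
}

/* Map a raw 12-bit sample to 0.0..1.0 (assumed scaling for a
 * modulation value; hypothetical helper). */
static float mcp3208_to_unit(uint16_t raw)
{
    return (float)raw / 4095.0f;
}
```

In use, `tx` would be sent over SPI while `rx` is captured in the same transaction, and the parsed value would be passed to the audio thread as a control or modulation parameter.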
We are surprised by how much one can squeeze out of the architecture.
The idea is to implement more applications and use the hardware as a generic platform for audio synthesis and audio effects algorithms. The form factor, analog I/O conditioning and power supply architecture are designed for use in Eurorack modular synthesizers.
The PCB is a 4-layer design made in KiCad; the smallest components are 0603 parts, TSSOPs and a QFN (the USB/serial interface for the ESP32). There are separate power supplies for the digital and the analog audio components to reduce noise in the audio signal.
Did you run into any issues with audio interference from the WiFi?