What is Kyttar?
Kyttar is a massively parallel asynchronous processor designed for real-time signal processing. It is a 2D grid of 120 identical processing cells with no clock, no flip-flops, and no central controller. Each cell has 32 words of memory, a small ALU, and communicates with its four neighbors through WRITE and JUMP instructions. Data flows through the array the way chemical signals flow through mycelium networks: locally, in parallel, without any central coordination.
The architecture uses 4-phase bundled-data handshaking with Muller C-elements, matched delay chains, and latches instead of flip-flops. Everything is self-timed. There is no clock tree, no clock domain crossing, and no clock tree synthesis (CTS) step. The cells coordinate entirely through local handshake signals.
Why?
The target application is software-defined radio. SDR workloads are inherently parallel: dozens or hundreds of channels need to be demodulated, filtered, equalized, and decoded simultaneously in real time. CPUs process these sequentially. FPGAs can do it in parallel but require hardware design expertise. Kyttar is designed to give you the parallelism of an FPGA with the programmability of a processor. You write signal processing algorithms in simple Q15 fixed-point assembly. No Verilog, no timing constraints, no clock domain crossings. The hardware handles all of that.
A single Kyttar chip can run multiple independent signal processing chains simultaneously on different groups of cells. Demodulators, filters, equalizers, vocoders, decoders, all running at wire speed.
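To give a feel for the Q15 fixed-point format mentioned above, here is a minimal Python sketch of Q15 conversion and multiplication. It follows the common Q15 convention (16-bit signed values scaled by 2^15, with rounding and saturation). The helper names `to_q15`, `q15_mul`, and `from_q15` are hypothetical, and whether Kyttar's ALU rounds or saturates this way is an assumption, not something stated here.

```python
# Generic Q15 arithmetic sketch. Rounding and saturation behavior are
# assumptions; the actual Kyttar ALU may differ.
Q15_ONE = 1 << 15        # scale factor: 1.0 would map to 32768
Q15_MAX = Q15_ONE - 1    # 0x7FFF, largest representable value (just under 1.0)
Q15_MIN = -Q15_ONE       # -32768, which represents -1.0


def to_q15(x: float) -> int:
    """Convert a float to Q15, saturating at the ends of the range."""
    return max(Q15_MIN, min(Q15_MAX, int(round(x * Q15_ONE))))


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values with round-to-nearest and saturation."""
    product = (a * b + (1 << 14)) >> 15
    return max(Q15_MIN, min(Q15_MAX, product))


def from_q15(v: int) -> float:
    """Convert a Q15 integer back to a float."""
    return v / Q15_ONE


if __name__ == "__main__":
    half = to_q15(0.5)
    print(half, q15_mul(half, half), from_q15(q15_mul(half, half)))
```

The one case that needs saturation in a Q15 multiply is -1.0 times -1.0, whose true result (+1.0) does not fit in the format.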
The Architecture
At the heart of every async pipeline stage is a Muller C-element, the foundational building block of asynchronous computing. A C-element outputs high when all inputs are high, low when all inputs are low, and holds its previous state when the inputs disagree:
| A | B | OUT | Meaning |
|---|---|-----|---------|
| 0 | 0 | 0 | Both low: output low |
| 0 | 1 | OUT | Disagree: hold |
| 1 | 0 | OUT | Disagree: hold |
| 1 | 1 | 1 | Both high: output high |
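The hold behavior in the truth table is easy to see in a small behavioral model. This is a Python sketch of a two-input C-element, not the gate-level or transistor-level cell used on the chip; the function names `c_element` and `run_c_element` are hypothetical.

```python
def c_element(a: int, b: int, prev: int) -> int:
    """Two-input Muller C-element: follow the inputs when they agree, otherwise hold."""
    if a == b:
        return a
    return prev


def run_c_element(inputs, initial=0):
    """Feed a sequence of (a, b) pairs through the element and return the output trace."""
    out = initial
    trace = []
    for a, b in inputs:
        out = c_element(a, b, out)
        trace.append(out)
    return trace


if __name__ == "__main__":
    # Output rises only once both inputs are high, and falls only once both are low.
    print(run_c_element([(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]))  # [0, 0, 1, 1, 0]
```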
C-elements control the latches. A request signal propagates forward through a matched delay element to the next stage, and an acknowledge signal comes back. The delay element guarantees the data is stable before the next latch captures it. It looks similar to a synchronous pipeline (sequential elements with combinational logic between them), except instead of a master clock, each stage coordinates only with its immediate neighbors.
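The request/acknowledge exchange described above can be sketched as a sequence of four events per data word. This is the textbook 4-phase (return-to-zero) ordering, written as sequential Python for readability. The signal names `req` and `ack`, the function `four_phase_transfer`, and the exact ordering are assumptions for illustration, not taken from the Kyttar design.

```python
def four_phase_transfer(words):
    """Move words from a sender stage to a receiver stage, logging each handshake phase."""
    req = ack = 0
    received = []
    events = []
    for word in words:
        data = word             # sender drives the data bundle first
        req = 1                 # phase 1: request rises (in hardware, after the matched delay)
        events.append(("req", req))
        received.append(data)   # receiver latch captures the now-stable data
        ack = 1                 # phase 2: acknowledge rises
        events.append(("ack", ack))
        req = 0                 # phase 3: request returns to zero
        events.append(("req", req))
        ack = 0                 # phase 4: acknowledge returns to zero
        events.append(("ack", ack))
    return received, events


if __name__ == "__main__":
    received, events = four_phase_transfer([3, 7])
    print(received)
    print(events[:4])
```

In real hardware these events are driven by concurrent circuits rather than a loop. The point of the sketch is only the ordering: data is valid before the request rises, and both wires return to zero before the next word is sent.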
The Chip
The first Kyttar chip is a 120-cell array fabricated on the SkyWater SKY130 130 nm process through ChipFoundry's ChipIgnite program. The die sits inside an OpenFrame pad ring with 44 GPIOs, power/ground pads, and ESD protection. Not having to deal with the pad ring and ESD is a HUGE win. You can easily sink months into sorting that mess out, so kudos to the ChipFoundry guys for supplying this. Total cost for 100 packaged chips: roughly $15,000. Obviously, this is a test shuttle, so the price per die is very high. On this process node, with volume, production prices will drop by a couple of orders of magnitude.
Status: Tapeout May 13th 2026. Packaged parts expected October 2026.
The Flow
The entire RTL-to-GDSII flow is open source:
- Yosys for synthesis
- OpenROAD for place and route
- Magic for custom layout, DRC, and parasitic extraction
- ngspice for SPICE simulation
- Netgen for LVS
- Icarus Verilog for functional simulation
- CVC for SDF-annotated gate-level simulation
- LibreLane for flow orchestration
- ChipFoundry's cf wrapper for project-level build management
No commercial EDA licenses or six-figure tool subscriptions. All you need is a Linux machine and the willingness to learn. Are these tools as good as the tools from the big three EDA vendors? No, but they do a pretty good job, especially at this price point. Also, everything is bundled for you in LibreLane, so you only have to deal with configuration files, not the bring-up of the entire toolchain, which by itself is a huge win for a startup.
Ham Radio Applications
I'm a licensed...
Chuck McClish