Welcome to Project Aether!
If you are reading this, you probably know the struggle of building a bare-metal cluster for OpenStack or Kubernetes at home. Stacking standard SBCs leaves you with a cable nightmare, zero Out-of-Band (OOB) management, and fragile power delivery. Project Aether isn't just another carrier board. The goal is to design a legitimate, modular supercomputer in a box that brings data-center reliability to the homelab.
But before routing a single trace in KiCad, we had to define the physical and power limits of the system. Here is the architectural DNA of the Compute Blade.
### 1. The Power and Thermal Reality: Why 5U?
We are designing a backplane that takes up to 10 hot-swappable blades. Each blade hosts 4 compute nodes (supporting CM4/CM5 or custom ARM/RISC-V modules). That is **40 independent nodes** in a single chassis.
If we calculate the max load of the compute modules, the network switches, and the BMCs, we are looking at a power budget hovering around **1000 Watts**. A 5U form factor is the sweet spot: it gives us the physical volume required to route massive 12V power delivery safely across the backplane, allows the use of standard 120mm/140mm fans (crucial for homelab acoustic comfort), and leaves mechanical room for future PCIe or OCuLink expansion modules.
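To make that budget concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (10 blades, 4 nodes per blade, about 1000 W, a 12V rail). The function name and the even split across blades are my own simplifications, and conversion losses are ignored, so treat the results as ballpark numbers rather than a design spec.

```python
# Figures taken from the text: 10 blades, 4 nodes per blade, ~1000 W, 12 V rail.
BLADES = 10
NODES_PER_BLADE = 4
CHASSIS_BUDGET_W = 1000.0
RAIL_VOLTAGE_V = 12.0


def power_budget(blades=BLADES, nodes_per_blade=NODES_PER_BLADE,
                 budget_w=CHASSIS_BUDGET_W, rail_v=RAIL_VOLTAGE_V):
    """Split the chassis budget evenly (an assumption) and derive 12 V currents."""
    nodes = blades * nodes_per_blade
    per_blade_w = budget_w / blades
    return {
        "nodes": nodes,
        "per_blade_w": per_blade_w,
        # Per-node share still includes that node's slice of the switch and BMC.
        "per_node_share_w": budget_w / nodes,
        "backplane_a": budget_w / rail_v,    # total current on the 12 V rail
        "per_blade_a": per_blade_w / rail_v,  # current through one blade slot
    }


if __name__ == "__main__":
    for key, value in power_budget().items():
        print(f"{key}: {value:.2f}")
```

The takeaway is the backplane current: roughly 83 A at 12V for a fully loaded chassis, or a bit over 8 A per blade slot. That is the kind of copper and connector sizing that needs real physical volume.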
### 2. The Brain: Pivoting to a RISC-V BMC
To achieve true "Design-to-Cost" efficiency, we made a radical choice for the Blade's Baseboard Management Controller (BMC). We bypassed standard ARM chips and selected the **WCH CH32V307VCT6** RISC-V MCU.
This chip is a hardware hacker's dream for carrier boards:
* **Native Ethernet PHY:** It integrates a 10/100M PHY, so we can wire TX/RX directly to the backplane for true OOB management.
* **8x Hardware UARTs:** We use 4 of these to wire directly into the serial consoles of all 4 compute nodes simultaneously. No more PIO state-machine hacks.
* **Staggered Spin-up:** It controls the SG Micro SGM2588 load switches to power up the 4 nodes sequentially, so the blade's roughly 100W load ramps up in steps instead of hitting our Richtek Hot-Swap controller as a single inrush spike.
* **Telemetry:** It reads a TI INA226 over I2C for fine-grained power metrics and TMP1075 sensors for thermal zones.
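The spin-up and telemetry duties in the list above boil down to a small amount of logic, sketched here in Python as a host-side model rather than actual BMC firmware. The register scaling comes from the sensor datasheets (INA226: 1.25 mV per bus-voltage LSB and 2.5 µV per shunt-voltage LSB; TMP1075: 12-bit result, left-justified, 0.0625 °C per LSB). Everything else is a placeholder assumption of mine: the function names, the 250 ms delay between nodes, and the 2 mΩ shunt value.

```python
def stagger_schedule(node_count=4, delay_ms=250):
    """Return (node_index, enable_time_ms) pairs so nodes power up one at a time.

    delay_ms is a hypothetical value; the real figure depends on the load
    switch soft-start and the hot-swap controller limits.
    """
    return [(node, node * delay_ms) for node in range(node_count)]


def to_signed16(raw):
    """Interpret a 16-bit register value as two's complement."""
    raw &= 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


def ina226_bus_volts(raw_bus):
    """INA226 bus voltage register: 1.25 mV per LSB."""
    return (raw_bus & 0xFFFF) * 1.25e-3


def ina226_current_amps(raw_shunt, shunt_ohms=0.002):
    """INA226 shunt voltage register: 2.5 uV per LSB, signed.

    shunt_ohms is an assumed resistor value, not one from the project.
    """
    return to_signed16(raw_shunt) * 2.5e-6 / shunt_ohms


def tmp1075_celsius(raw_temp):
    """TMP1075 temperature register: upper 12 bits, 0.0625 degC per LSB."""
    return (to_signed16(raw_temp) >> 4) * 0.0625


if __name__ == "__main__":
    for node, t_ms in stagger_schedule():
        print(f"t={t_ms:4d} ms: enable node {node}")
    print(ina226_bus_volts(9600), "V")
    print(ina226_current_amps(4000), "A")
    print(tmp1075_celsius(0x1900), "degC")
```

On the real MCU the schedule would drive the load-switch enable pins from a timer, and the raw values would come from I2C reads. The arithmetic stays the same.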
### 3. The Network Aggregation
Instead of routing 40 individual gigabit cables, each blade features an on-board **Realtek RTL8372N-CG** L2+ SDN Switch. It aggregates the 4 nodes and outputs a single 10GBASE-KR high-speed link straight through the edge connector to the backplane.
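A quick sanity check on that uplink, again as a hedged Python sketch. It assumes each node link runs at 1 Gb/s (inferred from the "gigabit cables" comparison above, not a confirmed spec) and that the uplink delivers a full 10 Gb/s. The function names are hypothetical.

```python
def uplink_utilisation(nodes=4, node_gbps=1.0, uplink_gbps=10.0):
    """Worst-case fraction of the blade uplink used if every node saturates its link."""
    return nodes * node_gbps / uplink_gbps


def chassis_links(blades=10, nodes_per_blade=4):
    """Compare per-node cabling against one backplane link per blade."""
    return {
        "per_node_cables": blades * nodes_per_blade,
        "backplane_links": blades,
    }


if __name__ == "__main__":
    print(f"uplink utilisation: {uplink_utilisation():.0%}")
    print(chassis_links())
```

Under those assumptions the uplink is never oversubscribed: four saturated nodes use 40% of it, which leaves headroom if faster node links show up later (pass a different `node_gbps` to check). The chassis also goes from 40 cables to 10 backplane links.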
### What's Next?
Let me know in the comments what you think of the CH32V307 choice!
Null Runner