In my quest to learn more about programmable logic, I picked up one of the very fine Nexys 2 boards from Digilent. After messing around for a bit with some simple CPU designs and text-mode output, I thought to myself: "I bet I could get 3D graphics out of this thing!"
So here we are.
First, let's look at the board itself:
- Xilinx Spartan 3E-1200 FPGA (1.2 million gate equivalent)
- Micron 16MB PSRAM
- Intel 16MB Flash
- 50 MHz base clock
- VGA output with 8bpp DAC
- USB 2.0 port
- Four 8-pin I/O ports for peripherals
- ... and a bunch of additional I/O, LEDs and buttons
A pretty decent board, especially for the very low price of $149 ($99 if you're a student or teacher). And while it works perfectly for CPU designs and I/O, it has some... interesting challenges for designing a 3D GPU.
FPGA size and speed are the first things you'll notice, yet these are not the most important issues. The VGA output is fixed at only 8bpp (3 bits red, 3 bits green and 2 bits blue per pixel), which limits output quality somewhat, but this is easily overcome with some basic dithering. Even the 16MB memory size is plenty; the original PlayStation had only 1MB of VRAM.
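To make the dithering idea a bit more concrete, here is a rough Python sketch of ordered (Bayer) dithering from 24-bit colour down to 3-3-2. This is only one common way to do "basic dithering" and not necessarily what QuickSilver will use; the function names, the 4x4 Bayer matrix and the RRRGGGBB bit packing are all assumptions made for illustration.

```python
# Ordered dithering sketch: 8 bits per channel in, one 3-3-2 byte out.
# Assumption: the output byte is packed as RRRGGGBB (the bit order is not
# specified in the text, only the 3/3/2 split).

BAYER_4X4 = (
    (0, 8, 2, 10),
    (12, 4, 14, 6),
    (3, 11, 1, 9),
    (15, 7, 13, 5),
)


def quantize(value, bits, threshold):
    """Reduce an 8-bit channel to `bits` bits.

    `threshold` is in (0, 1) and decides whether the fractional part
    of the scaled value rounds up or down at this pixel position.
    """
    levels = (1 << bits) - 1
    scaled = value * levels / 255.0      # range 0..levels
    base = int(scaled)
    frac = scaled - base
    if frac > threshold and base < levels:
        base += 1
    return base


def dither_rgb332(r, g, b, x, y):
    """Return the 8-bit pixel for colour (r, g, b) at screen position (x, y)."""
    threshold = (BAYER_4X4[y % 4][x % 4] + 0.5) / 16.0
    r3 = quantize(r, 3, threshold)
    g3 = quantize(g, 3, threshold)
    b2 = quantize(b, 2, threshold)
    return (r3 << 5) | (g3 << 2) | b2


if __name__ == "__main__":
    # A flat mid-grey turns into a repeating pattern of neighbouring levels.
    for y in range(4):
        print(" ".join(f"{dither_rgb332(128, 128, 128, x, y):02x}" for x in range(4)))
```

Because the threshold depends only on the pixel's position, this kind of dithering needs no state from neighbouring pixels, which makes it a natural fit for hardware.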
No, the biggest issue is memory bandwidth. For a continuous burst at 80MHz the Micron PSRAM has a peak read or write bandwidth of 160MB/s, which seems nice enough, but the 70ns access time throws a wrench into things for random access, limiting bandwidth to around 28.5MB/s. VGA runs at a pixel clock of (about) 25MHz, so at one byte per pixel, just reading or writing a single frame would already use up nearly all of our random-access bandwidth. And that's just one way: we need to do both reading and writing, we'll probably need to write many pixels multiple times, and then there are the textures, triangle data, z-buffers, etc.
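The arithmetic behind those numbers can be checked with a few lines of Python. The clock, access time and pixel clock come straight from the text; the 16-bit (2-byte) bus width is an assumption inferred from 160MB/s at 80MHz, and the function names are purely illustrative.

```python
# Back-of-the-envelope memory bandwidth check.
BUS_BYTES = 2             # assumed 16-bit data bus (160 MB/s / 80 MHz)
BURST_CLOCK_HZ = 80e6     # burst clock from the text
RANDOM_ACCESS_S = 70e-9   # random access time from the text
PIXEL_CLOCK_HZ = 25e6     # approximate VGA pixel clock
BYTES_PER_PIXEL = 1       # 8bpp output


def burst_bandwidth():
    """Peak bytes per second when streaming one word per clock."""
    return BUS_BYTES * BURST_CLOCK_HZ


def random_bandwidth():
    """Bytes per second when every word costs a full random access."""
    return BUS_BYTES / RANDOM_ACCESS_S


def scanout_bandwidth():
    """Bytes per second needed just to feed pixels at the pixel clock."""
    return PIXEL_CLOCK_HZ * BYTES_PER_PIXEL


if __name__ == "__main__":
    print(f"burst:   {burst_bandwidth() / 1e6:.1f} MB/s")
    print(f"random:  {random_bandwidth() / 1e6:.1f} MB/s")
    print(f"scanout: {scanout_bandwidth() / 1e6:.1f} MB/s")
    share = scanout_bandwidth() / random_bandwidth()
    print(f"scanout share of random-access bandwidth: {share:.0%}")
```

Under these assumptions, a single pass over the frame at the pixel clock eats close to 90% of the random-access budget before any rendering traffic is counted.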
In short, this means no framebuffer is possible with this memory unless we throw in some pretty hardcore caching. And lacking a framebuffer means throwing out all conventional rendering techniques...
So, does this mean my hopes of building a 3D GPU based on the Nexys 2 are doomed?
Not completely. If conventional rendering techniques don't work, we'll just have to use unconventional ones! Taking inspiration from the legendary "beam-racer", the Atari 2600, QuickSilver will render each scanline just before it is sent to the screen. This means we only need two scanlines' worth of buffering (one for writing and one for displaying), which can easily fit in the FPGA itself.
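As a rough software model of the two-scanline idea, the Python sketch below swaps a "render" buffer and a "display" buffer once per line. It is only an illustration, not the actual QuickSilver design: the class and function names, the `render_line` callback and the 640x480 resolution are assumptions, and in hardware the rendering and the scan-out would run at the same time rather than one after the other.

```python
# Ping-pong scanline buffering: one line is displayed while the next is rendered.
LINE_WIDTH = 640   # assumed visible pixels per scanline (640x480 VGA)


class ScanlinePingPong:
    """Two line buffers that trade roles on every scanline."""

    def __init__(self, width=LINE_WIDTH):
        self.buffers = [bytearray(width), bytearray(width)]
        self.display_index = 0

    @property
    def display(self):
        """Buffer currently being sent to the screen."""
        return self.buffers[self.display_index]

    @property
    def render(self):
        """Buffer the renderer is allowed to write into."""
        return self.buffers[self.display_index ^ 1]

    def swap(self):
        self.display_index ^= 1


def render_frame(render_line, height=480, width=LINE_WIDTH):
    """Yield each displayed scanline of one frame.

    `render_line(y, buf)` is a hypothetical callback that fills `buf`
    with the 8bpp pixels of scanline `y`.
    """
    pp = ScanlinePingPong(width)
    render_line(0, pp.render)              # prime line 0 before the frame starts
    for y in range(height):
        pp.swap()                          # line y becomes the display buffer
        if y + 1 < height:
            render_line(y + 1, pp.render)  # prepare the next line meanwhile
        yield bytes(pp.display)


if __name__ == "__main__":
    def flat_line(y, buf):
        buf[:] = bytes([y & 0xFF]) * len(buf)   # fill the line with its own index

    lines = list(render_frame(flat_line, height=4, width=8))
    print([line[0] for line in lines])
    print("buffer bytes:", sum(len(b) for b in ScanlinePingPong().buffers))
```

With 640 pixels per line at 8bpp, the two buffers together come to 1280 bytes, which gives a feel for how little on-chip storage this scheme needs compared to a full framebuffer.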
In the next post I'll talk a bit more about the basic architecture of QuickSilver, and how this scanline rendering will work in practice.