
1080p on an STM32 Microcontroller

Connecting an LCD monitor to STM32

This is the first iteration of this project. I'll post a detailed description in the next few days.

A monochrome VGA video output on an STM32L4 microcontroller, using DMA and SPI. On a low-performance processor, DMA is needed to reach the required bit rate.

This is a nearly no-component monochrome VGA output project. It also places a relatively low demand on the processor, which is achieved by offloading as many tasks as possible to various peripherals.

Utilized peripherals:
• Timers - for accurate timing and sync signal generation,
• DMA - for constant data flow,
• OCTOSPI - for fast serialization of the pixel data,
• Peripheral Interconnect Matrix - to lower the interrupt usage and for more accurate timing.

If you know a possible industrial use for this project or if you have any further ideas, let me know.
__________
More description coming soon. Stay tuned! ☠👈🏻

Video

1080p...

So far I have achieved half the resolution of 1080p. The video signal is 1080p widescreen; however, every pixel is doubled: each pixel in memory is rendered as 2x2 pixels on the screen. True 1080p is theoretically achievable. RAM is somewhat limited on the STM32L4s, but it is still possible to fit the frame buffer.
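To put numbers on the memory question, here is a minimal sketch that computes the size of a 1-bit-per-pixel frame buffer and maps a screen pixel to its doubled source pixel. It assumes a tightly packed buffer with no padding between lines; the names `framebuffer_bytes` and `fb_bit_index` are made up for illustration and are not taken from the project.

```c
#include <stdint.h>

/* Assumed geometry: 1080p signal, every memory pixel shown as 2x2 on screen. */
enum { SCREEN_W = 1920, SCREEN_H = 1080, SCALE = 2 };

/* Bytes needed for a monochrome (1 bit per pixel) frame buffer,
   assuming lines are packed back to back without padding. */
uint32_t framebuffer_bytes(uint32_t width, uint32_t height) {
    return (width * height + 7u) / 8u;
}

/* Index of the frame buffer bit that a given screen pixel is read from.
   Integer division by SCALE makes each memory pixel cover a 2x2 block. */
uint32_t fb_bit_index(uint32_t screen_x, uint32_t screen_y) {
    uint32_t fb_w = SCREEN_W / SCALE;
    return (screen_y / SCALE) * fb_w + (screen_x / SCALE);
}
```

Under these assumptions the half-resolution buffer (960x540) takes 64,800 bytes, while a full 1920x1080 buffer would take 259,200 bytes.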

The OCTOSPI is capable of outputting the necessary 148 Mb/s data rate at a 74 MHz system clock in double data rate mode. The DMA should also be able to deliver the data at the required rate of 1 transfer every 16 clock cycles.
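These figures can be checked with a little arithmetic. The sketch below assumes a single data line that shifts out 2 bits per clock in double data rate mode, and 32-bit DMA transfers; neither assumption is spelled out in the text, and the function names are hypothetical.

```c
#include <stdint.h>

/* Bits per second leaving one serial data line. */
uint32_t serial_bit_rate(uint32_t clock_hz, uint32_t bits_per_clock) {
    return clock_hz * bits_per_clock;
}

/* Clock cycles it takes to shift out one DMA transfer's worth of bits,
   i.e. how often the DMA has to deliver a new word. */
uint32_t cycles_per_transfer(uint32_t bits_per_transfer, uint32_t bits_per_clock) {
    return bits_per_transfer / bits_per_clock;
}
```

With a 74 MHz clock and 2 bits per clock this gives 148,000,000 bits per second, and a 32-bit word lasts 16 clock cycles, which matches the numbers quoted above. The nominal 1080p60 pixel clock is 148.5 MHz, so the quoted values appear to be rounded.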

So... coming soon! ☠👈🏻

Schematic

An absolutely minimalist design. Because all 3 color lines on the VGA connector are connected together, the impedance matching cannot be done well; hence, some shadowing will appear in the image.

This HW version is only a preliminary solution that I use to verify the viability of this project. A good solution has to have 3 independent drivers - all impedance matched individually to the 75Ω color lines. This can be achieved for example with a cheap logic gate IC from the 7400 family.

However, because there are only 2 colors, slight shadows don't really degrade the image quality.

Functional Block Diagram

Nearly all of the necessary functions are executed automatically by the integrated peripherals. The only software intervention needed is setting up the DMA CH2 for every line.

DMA CH1 is triggered by a hardware timer directly. It writes a single word to the OCTOSPI. This ensures a stable start timing for the OCTOSPI, thus, a stable pixel alignment.

At the same time, DMA CH2 is set up and started, so it immediately begins to transmit the pixel data to the OCTOSPI. The OCTOSPI data transfer is already running, so as long as the buffer doesn't run out, the pixel timing and alignment remain correct.
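The per-line software step could look roughly like the sketch below. It is a host-side illustration only: `dma_channel_sketch_t` is a hypothetical stand-in for a DMA channel and does not reflect the real STM32 register layout or any vendor API. It also assumes 32-bit transfers, a 960x540 frame buffer, and that each frame buffer line is sent for two consecutive scanlines.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed half-resolution frame buffer: 960 pixels = 30 words per line. */
enum { FB_WIDTH = 960, FB_HEIGHT = 540, WORDS_PER_LINE = FB_WIDTH / 32 };

/* Hypothetical model of a DMA channel; NOT the real STM32 registers. */
typedef struct {
    const uint32_t *source;  /* where the channel reads pixel words from */
    uint32_t word_count;     /* number of words to move for this line */
    int enabled;             /* 1 while the channel is armed */
} dma_channel_sketch_t;

/* Point the pixel channel (CH2 in the text) at the data for one scanline. */
void prepare_line(dma_channel_sketch_t *ch2, const uint32_t *framebuffer,
                  uint32_t scanline) {
    uint32_t fb_line = scanline / 2u;  /* vertical doubling: 2 scanlines per line */
    ch2->enabled = 0;                  /* reconfigure only while stopped */
    ch2->source = framebuffer + (size_t)fb_line * WORDS_PER_LINE;
    ch2->word_count = WORDS_PER_LINE;
    ch2->enabled = 1;                  /* re-arm for the upcoming line */
}
```

On real hardware the same three values would be written to the channel's address, count, and control registers, with the start itself left to the timer trigger.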

Discussions

drenehtsral wrote 10/01/2024 at 22:51

I did something similar with a PIC32MZ at one point (I had three different tries at it, one using SPI for monochrome, one using a lower resolution 8bpp (R2G2B2X2) framebuffer, and a third with an even lower resolution 16bpp (R5G6B5) framebuffer).

I learned an interesting and hard lesson: If there are any JTAG ports that overlap with your pixel outputs, be sure to disable them lest all hell break loose when you output the "right" sequence.

For the color framebuffer in the higher resolution mode I just set the clock source for both the DMA engine and the GPIO ports to the right multiple (2x IIRC) of the desired pixel clock and set the DMA master to have high priority. The GPIO ports weren't expected to operate over 50MHz but for an output destined just to be mashed into an R-2R DAC it seemed to work OK.

As usual available frame buffer memory is the real limiting factor.


Gabriel Cséfalvay wrote 10/02/2024 at 09:43

That's a good lesson with the JTAG, sounds like a real headache to debug.


retepv wrote 09/30/2024 at 20:44

Just something I was thinking of. I don't know if anything of this is even feasible, but wanted to mention it anyway.

It would be extremely tricky to do, but I think it would be possible to store the video image data as runlength-encoded data.

Pixels have an X and a Y component. Normally you would set or clear the corresponding bit in video memory and be done with it.

But if you store the data per line (Y-component) runlength encoded, setting a pixel would become something like this:

1. Fetch the runlength-encoded data for the Y line and decode it into a linear linebuffer.
2. Set the corresponding pixel in the linebuffer.
3. Runlength-encode the linebuffer again and write it back to memory.

Fetching the pixel data for output would be similar:

1. Fetch the runlength-encoded data for a specific line and decode it into a linear framebuffer.
2. Output the framebuffer to the screen.

You would basically trade CPU cycles for memory locations, but gain the higher resolution. So the graphics would be (quite a bit) slower, but you would gain the high resolution of 1080 actual pixels. I'm not sure if the CPU is fast enough for that.
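The decode, modify, re-encode cycle described above can be sketched as follows. This is only an illustration under stated assumptions: runs are stored as 16-bit lengths that alternate colours starting with black (0), the scratch line uses one byte per pixel for clarity rather than packed bits, and all function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand alternating run lengths (first run is colour 0) into one byte
   per pixel. Returns pixels written, or 0 if the runs exceed the width. */
size_t rle_decode(const uint16_t *runs, size_t n_runs,
                  uint8_t *line, size_t width) {
    size_t x = 0;
    uint8_t colour = 0;
    for (size_t i = 0; i < n_runs; i++) {
        if (runs[i] > width - x) return 0;
        for (uint16_t k = 0; k < runs[i]; k++) line[x++] = colour;
        colour = (uint8_t)!colour;
    }
    return x;
}

/* Compress a line of 0/1 pixels (width <= 65535) into alternating runs.
   Returns the run count, or 0 if more than max_runs would be needed. */
size_t rle_encode(const uint8_t *line, size_t width,
                  uint16_t *runs, size_t max_runs) {
    size_t n = 0, x = 0;
    uint8_t colour = 0;
    while (x < width) {
        uint16_t len = 0;
        while (x < width && line[x] == colour) { len++; x++; }
        if (n == max_runs) return 0;
        runs[n++] = len;  /* may be 0 if the line starts with colour 1 */
        colour = (uint8_t)!colour;
    }
    return n;
}

/* Set one pixel: decode into scratch, modify, re-encode in place.
   Returns the new run count, or 0 on failure. */
size_t rle_set_pixel(uint16_t *runs, size_t n_runs, size_t max_runs,
                     uint8_t *scratch, size_t width, size_t x, uint8_t value) {
    if (x >= width || rle_decode(runs, n_runs, scratch, width) != width)
        return 0;
    scratch[x] = value ? 1u : 0u;
    return rle_encode(scratch, width, runs, max_runs);
}
```

For example, setting pixel 3 in an all-black 8-pixel line turns the single run `{8}` into `{3, 1, 4}`. Note that when re-encoding runs out of space, this sketch has already overwritten part of the stored runs, so a real implementation would need to encode into a separate buffer first.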

You will also have to take care of bus contention, which I'm not sure is possible. But the STM32 has memory split into 3: SRAM1, SRAM2, and SRAM3. And it has different internal busses that can connect CPU and peripherals (DMA controller) so that they can access their own memory in parallel. With some very careful configuration, I'm wondering if you could do the fetching/writing, the encoding/decoding, and the pushing of the VGA pixel data without having any bus contention.

So. Just some things I was thinking about. I am not an expert with STM32 at all, so I really don't have any idea if this is at all feasible. But it *would* solve the problem of not having enough memory if it were possible. :)

Wish you good luck!


Gabriel Cséfalvay wrote 10/01/2024 at 07:23

It's definitely an interesting proposal. 

There is one big question, however: how much memory do we allocate for each line - and how do we make sure it doesn't run out?
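A rough way to see why the allocation matters is to compare the raw size of a line with its worst-case encoded size. The sketch below assumes 16-bit run lengths and runs that alternate colours starting with black, so a line of alternating pixels that starts with white needs one run per pixel plus a leading zero-length run. The function names are hypothetical.

```c
#include <stdint.h>

/* Raw size of one monochrome line at 1 bit per pixel. */
uint32_t raw_line_bytes(uint32_t width) {
    return (width + 7u) / 8u;
}

/* Worst-case run count: every pixel differs from its neighbour, plus one
   zero-length leading run when the line starts with the second colour. */
uint32_t rle_worst_case_runs(uint32_t width) {
    return width + 1u;
}

/* Worst-case encoded size of one line with 16-bit run lengths. */
uint32_t rle_worst_case_bytes(uint32_t width) {
    return rle_worst_case_runs(width) * (uint32_t)sizeof(uint16_t);
}
```

Under these assumptions a 1920-pixel line takes 240 bytes raw but up to 3,842 bytes encoded, so any fixed per-line budget smaller than that can be exceeded by a sufficiently busy line.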

Regarding the computation power: it'd be on the edge, but it could work. The buffer could be split into 2 halves in 2 different RAMs; this way the congestion would be eliminated.

