
Hacking the SDK

A project log for Bare metal second core on ESP32

Evaluate ways to run code on one of the ESP32's cores without the scheduler interfering

Daniel, 10/10/2020 at 16:03

It's finally hacking time ;)

We try to patch the SDK so that we get control over the APP core while still letting the SDK handle the initialization. Ideally, our main will behave similarly to a FreeRTOS task while running bare metal.

Approach

First, we need to figure out what the CONFIG_FREERTOS_UNICORE definition does. After all, we want to replicate that behavior to some extent. So we search through the SDK and look for code that depends on this definition. We can broadly sort the parts into the following categories:

Since we want to keep things simple, we try not to touch the FreeRTOS part and focus on the other three.

Now, we want everything else to think things are running in single-core mode, both to keep modules from creating tasks on the APP core and to avoid unnecessary memory allocation by FreeRTOS. So we add a new define, which will 'counteract' the UNICORE define in certain places. I named this:

CONFIG_FREERTOS_BAREMETAL_APP_CPU

Therefore, our sdkconfig.h contains both

#define CONFIG_FREERTOS_UNICORE 1
#define CONFIG_FREERTOS_BAREMETAL_APP_CPU 1 

Now, in order for our define to do something, we add some code to selected files, which will undefine UNICORE if our define is set.

#ifdef CONFIG_FREERTOS_BAREMETAL_APP_CPU
#undef CONFIG_FREERTOS_UNICORE
#endif
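
To see what the patch does to a single file, here is a minimal sketch that can be compiled on a PC. The two defines at the top stand in for sdkconfig.h, and file_is_built_dual_core() and check_patch_applied() are hypothetical helpers that exist only for this illustration; they are not part of the SDK.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for sdkconfig.h: both options are set, as described above. */
#define CONFIG_FREERTOS_UNICORE 1
#define CONFIG_FREERTOS_BAREMETAL_APP_CPU 1

/* The patch, placed near the top of a selected SDK source file. */
#ifdef CONFIG_FREERTOS_BAREMETAL_APP_CPU
#undef CONFIG_FREERTOS_UNICORE
#endif

/* Hypothetical helper: reports which branch this file was compiled with. */
bool file_is_built_dual_core(void)
{
#if !CONFIG_FREERTOS_UNICORE   /* an undefined macro evaluates to 0 in #if */
    return true;               /* dual-core branch is compiled in */
#else
    return false;              /* single-core branch is compiled in */
#endif
}

/* Hypothetical self-check: aborts if the patch did not take effect. */
void check_patch_applied(void)
{
    assert(file_is_built_dual_core());
}
```

Every file without the patch still sees UNICORE set and keeps its single-core code path, which is exactly the split we are after.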

What to change?

As I wrote earlier, we apply the patch to any file related to starting the CPU / memory map / interrupt allocation / synchronization code. The definition is only used in a handful of places, so we can easily check every file and decide whether that part is relevant for us.

Additionally, we need to modify the start code in cpu_start.c to start our main instead of the scheduler.

Doesn't work!

When I tried the approach described above, the system immediately crashed during the DPort access init while starting the second CPU. That was when I realized that a lot of the 'basic' SDK code, like interrupt allocation, actually depends on FreeRTOS functions (like mutexes) and on the portNUM_PROCESSORS definition instead of the UNICORE define. And sadly, portNUM_PROCESSORS is referenced quite often. So instead of continuing down that road, I decided to reduce the amount of initialization so that we avoid the problem.
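
The following sketch illustrates why a per-file #undef does not help code that is keyed on portNUM_PROCESSORS. It is an illustration only, not a diagnosis of the crash. It assumes that the core count is derived from the UNICORE define inside the (unpatched) FreeRTOS configuration header, which is modelled here in simplified form; per_core_state, core_has_slot() and mark_core_started() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for sdkconfig.h. */
#define CONFIG_FREERTOS_UNICORE 1
#define CONFIG_FREERTOS_BAREMETAL_APP_CPU 1

/* Simplified stand-in for the FreeRTOS config header (assumption): the core
 * count is chosen here, while UNICORE is still defined. */
#ifdef CONFIG_FREERTOS_UNICORE
#define portNUM_PROCESSORS 1
#else
#define portNUM_PROCESSORS 2
#endif

/* Per-file patch, applied after the FreeRTOS header was processed. */
#ifdef CONFIG_FREERTOS_BAREMETAL_APP_CPU
#undef CONFIG_FREERTOS_UNICORE
#endif

/* Hypothetical per-core bookkeeping, sized by portNUM_PROCESSORS. */
static int per_core_state[portNUM_PROCESSORS];

/* Returns true if core_id has a slot in the per-core table. */
bool core_has_slot(int core_id)
{
    assert(core_id >= 0);
    return core_id < (int)(sizeof(per_core_state) / sizeof(per_core_state[0]));
}

/* Records that a core was started, if the table has room for it. */
void mark_core_started(int core_id)
{
    if (core_has_slot(core_id)) {
        per_core_state[core_id] = 1;
    }
}
```

In this model core 0 has a slot and core 1 does not, even though the file itself is compiled as 'dual core'. Code that indexes such a table with the APP core's ID would need a bounds check like the one above, or it would write past the end of the array.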

Minimal Hack

Since I determined that getting the interrupt allocation / synchronization / cross-core stuff working is a lot of work, I decided to ignore it for now.

Without that, there are only the following files where I changed something:

The most important change compared to our manual approach (the one without SDK hacking) is the change in the memory map definition. This way we get cache for our CPU and can execute from external flash. In cpu_start.c we add our code to undefine the UNICORE definition and modify the start_cpu1_default() function. app_cpu_bare_metal_main() is our own main function, which is defined in the user code.

#if !CONFIG_FREERTOS_UNICORE
void start_cpu1_default(void)
{
#ifdef CONFIG_FREERTOS_BAREMETAL_APP_CPU
    esp_cache_err_int_init();
    ESP_EARLY_LOGI(TAG, "Starting bare metal main on APP CPU.");
    app_cpu_bare_metal_main();
#else
    // Wait for FreeRTOS initialization to finish on PRO CPU
    while (port_xSchedulerRunning[0] == 0) {
        ;
    }
...

Limitations

With this approach, we can execute from flash, so we are allowed to call SDK functions. However, we still need to be careful that those functions do not use mutexes or interrupts. DPort access is also limited, since we bypass the SDK's mutual access mitigation. So whatever function we call, we first have to check whether it does anything 'forbidden'. Also, 'printf' doesn't work, so we have to fall back on the much more basic 'ets_printf' for debug output.
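
A user-side main that stays within these limits could look roughly like the sketch below. It is an assumption-laden example, not the code from this project: mix_step() is a hypothetical workload, the ets_printf prototype is declared by hand because the header providing it has moved between ESP-IDF versions, and the main function is assumed never to return. The host branch only exists so the sketch also compiles on a PC.

```c
#include <stdint.h>

#ifdef ESP_PLATFORM
/* ROM print routine, declared by hand (assumption, see above). */
int ets_printf(const char *fmt, ...);
#else
/* Host build only: stdio for printf, assert for quick checks. */
#include <assert.h>
#include <stdio.h>
#define ets_printf printf
#endif

/* Hypothetical workload: pure arithmetic, no mutexes, interrupts or DPort access. */
uint32_t mix_step(uint32_t acc, uint32_t value)
{
    uint32_t rotated = (acc << 1) | (acc >> 31);   /* rotate left by one bit */
    return rotated ^ value;
}

/* Called from start_cpu1_default(); assumed here never to return. */
void app_cpu_bare_metal_main(void)
{
    uint32_t acc = 0;
    for (uint32_t i = 0;; i++) {
        acc = mix_step(acc, i);
        if ((i & 0xFFFFFu) == 0) {
            /* ets_printf instead of printf for debug output */
            ets_printf("APP CPU alive, acc=0x%x\n", (unsigned)acc);
        }
    }
}
```

Anything called from this loop has to pass the same check as the loop itself: no mutexes, no interrupt allocation, and no unprotected DPort access.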

Conclusion

Sadly, it doesn't work as well as I initially hoped. Due to the complexity, I didn't get the interrupt / mutex stuff to work. However, the other fundamental things work. We have cache for the CPU, so it can execute from external flash and not from IRAM only, as with the fully manual approach. Also, the housekeeping interrupts work, so we detect exceptions and the call stack is no longer limited by the CPU register window.

SDK hacking is certainly less elegant than the fully manual approach, but to have the cache and basic interrupts working is a huge benefit.

EDIT: Don't enable the cache without synchronization working!

While fiddling around with this a bit more, I figured out that enabling the cache on the second CPU without working synchronization is a very bad idea! Cache loads are then not protected, and even if it worked once, two CPUs attempting to access the SPI flash at the same time WILL cause problems sooner or later. This issue may be fixable by modifying cache_utils.c, but I have not tested this.
