Method
At first, we try not to hack the SDK, and we want to avoid any side effects caused by the SDK or a library trying to launch a task on the second core. So the running system should be completely unaware that the second core is running.
So, to start with, we build everything with
CONFIG_FREERTOS_UNICORE 1
Since I'm using MicroPython as a platform, I need to make some further changes to assign everything to the PRO core. If everything is working, you should get a boot message telling you the ESP is running in single core mode.
I (534) cpu_start: Pro cpu up.
I (534) cpu_start: Application information:
I (534) cpu_start: Compile time: Aug 25 2020 15:32:36
I (538) cpu_start: ELF file SHA256: 0000000000000000...
I (543) cpu_start: ESP-IDF: v4.0.1
I (548) cpu_start: Single core mode
I (553) heap_init: Initializing. RAM available for dynamic allocation:
I (560) heap_init: At 3FFAFF10 len 000000F0 (0 KiB): DRAM
I (566) heap_init: At 3FFB6388 len 00001C78 (7 KiB): DRAM
I (572) heap_init: At 3FFB9A20 len 00004108 (16 KiB): DRAM
I (578) heap_init: At 3FFBDB5C len 00000004 (0 KiB): DRAM
I (584) heap_init: At 3FFCA270 len 00015D90 (87 KiB): DRAM
I (590) heap_init: At 3FFE0440 len 0001FBC0 (126 KiB): D/IRAM
I (597) heap_init: At 40078000 len 00008000 (32 KiB): IRAM
I (603) heap_init: At 4009DFE4 len 0000201C (8 KiB): IRAM
I (609) cpu_start: Pro cpu start user code
Looking into the datasheet, we see that the configuration for the second core is straightforward and goes through the DPORT_APPCPU_CTRL_* registers.
To launch the CPU, we first ensure it's not already running by checking the CLKGATE register. If it is disabled, we reset the CPU, load the entry vector and start the CPU by enabling the CLKGATE. We also have to allocate a stack for the second core.
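// Bail out if the APP CPU clock is already enabled, i.e. the core is running.
if (DPORT_REG_GET_BIT(DPORT_APPCPU_CTRL_B_REG, DPORT_APPCPU_CLKGATE_EN))
{
    printf("APP CPU is already running!\n");
    return;
}
// Allocate the stack for the APP CPU once and reuse it on later starts.
if (!app_cpu_stack_ptr)
{
    app_cpu_stack_ptr = heap_caps_malloc(1024, MALLOC_CAP_DMA);
}
// Pulse the reset bit to reset the APP CPU.
DPORT_REG_SET_BIT(DPORT_APPCPU_CTRL_A_REG, DPORT_APPCPU_RESETTING);
DPORT_REG_CLR_BIT(DPORT_APPCPU_CTRL_A_REG, DPORT_APPCPU_RESETTING);
// Set the entry vector, then enable the clock gate to start the core.
printf("Start APP CPU at %08X\n", (uint32_t)&app_cpu_init);
ets_set_appcpu_boot_addr((uint32_t)&app_cpu_init);
DPORT_REG_SET_BIT(DPORT_APPCPU_CTRL_B_REG, DPORT_APPCPU_CLKGATE_EN);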
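One detail that may be worth checking in the allocation above: Xtensa stacks grow toward lower addresses, and the ABI expects the stack pointer to be 16-byte aligned. The initial stack pointer would therefore normally be the aligned top of the allocated block rather than the address returned by the allocator. The following is only a minimal sketch of that calculation; stack_top is a hypothetical helper and not part of the original code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not in the original code): returns the initial stack
 * pointer for a stack block of `size` bytes that starts at `base`.
 * Assumes the stack grows downward and must be 16-byte aligned. */
static inline void *stack_top(void *base, size_t size)
{
    uintptr_t top = (uintptr_t)base + size;  /* one past the end of the block */
    top &= ~(uintptr_t)0xF;                  /* round down to 16-byte alignment */
    return (void *)top;
}
```

With such a helper, the value stored for the APP CPU could be stack_top(block, 1024), where block is the pointer returned by heap_caps_malloc.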
According to the Tensilica spec, the first thing to do after start is to reset the 'Window' registers. This is a special feature of these CPUs: the general-purpose registers are banked, and the banks are switched with the 'Window' registers. We also need to initialize the stack pointer, which is in register A1 by convention. We then call our main function for the APP CPU. After main finishes, we let the second core turn off its own clock to halt itself.
static void IRAM_ATTR app_cpu_init(void)
{
    // Reset the reg window. This will shift the A* registers around,
    // so we must do this in a separate ASM block.
    // Otherwise the addresses for the stack pointer and main function will be invalid.
    asm volatile (
        "movi a0, 0\n"
        "wsr  a0, WindowStart\n"
        "movi a0, 0\n"
        "wsr  a0, WindowBase\n"
    );
    // init the stack pointer and jump to main function
    asm volatile (
        "l32i   a1, %0, 0\n"
        "callx4 %1\n"
        ::"r"(&app_cpu_stack_ptr),"r"(app_cpu_main));
    // main has returned: gate our own clock to halt this core
    DPORT_REG_CLR_BIT(DPORT_APPCPU_CTRL_B_REG, DPORT_APPCPU_CLKGATE_EN);
}
And that's it! We can now have an app_cpu_main() function that runs completely independently of the rest of the system. I verified this works by incrementing a counter in the main function. Every time I start the APP core, I can check that this counter has been incremented.
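The counter test described above could look roughly like the sketch below. The names app_cpu_counter and app_cpu_counter_read are hypothetical, and the void(void) signature of app_cpu_main is an assumption based on how it is called from the init code. The empty IRAM_ATTR fallback is only there so the snippet also compiles on a host; on the ESP32 the macro comes from ESP-IDF.

```c
#include <stdint.h>

#ifndef IRAM_ATTR
#define IRAM_ATTR  /* provided by ESP-IDF on the target; empty on a host build */
#endif

/* Hypothetical counter: written by the APP CPU, read by the PRO CPU.
 * volatile keeps the compiler from caching the value in a register. */
static volatile uint32_t app_cpu_counter = 0;

/* Entry point called from app_cpu_init(). It is placed in IRAM because
 * the APP CPU runs without a flash cache. */
void IRAM_ATTR app_cpu_main(void)
{
    app_cpu_counter++;
}

/* Hypothetical accessor for the PRO CPU side. */
uint32_t app_cpu_counter_read(void)
{
    return app_cpu_counter;
}
```

Reading the counter on the PRO CPU before and after starting the APP CPU should then show it going up by one per start, assuming the APP CPU has had time to finish.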
Limitations
No cache
The APP CPU cache for external flash access is fixed at address 0x40078000, which is part of the allocatable memory, as you can see in the boot log. So we must not enable the CPU's cache. Therefore, any code running on that core must run from IRAM. That's not very nice, since you have to be very careful when calling any functions. If you only plan to run something rather primitive on that core, it shouldn't be a problem. After all, disabling that core freed 32K of IRAM in the first place.
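To make the overlap concrete, here is a small sketch of a range check for that region. The base and length are taken from the boot log line "At 40078000 len 00008000"; the macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Region used by the APP CPU cache when it is enabled. Values taken from the
 * boot log above, where the same range shows up as 32 KiB of heap IRAM. */
#define APP_CACHE_BASE 0x40078000u
#define APP_CACHE_SIZE 0x00008000u

/* Hypothetical helper: true if addr lies inside the APP CPU cache region. */
static inline bool in_app_cpu_cache_region(uint32_t addr)
{
    return addr >= APP_CACHE_BASE && (addr - APP_CACHE_BASE) < APP_CACHE_SIZE;
}
```

Anything the heap hands out from this range would be clobbered if the cache were switched on, which is why the cache stays disabled here.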
No exception handler
That's not a real limitation, since I simply didn't bother to write and install an exception handler. But be aware that, due to the special architecture of these CPUs, the maximum call stack depth is limited by the register window if you don't have an exception handler. With the IRAM limitation, a large program on that core is not a good idea anyway.
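As a rough illustration of that depth limit: without a window overflow handler, nested calls can only use the physical register file. The sketch below assumes 64 physical address registers on the ESP32 cores and that a callN instruction rotates the window by N registers. The result is only an upper bound, since the deepest frame also needs room for the registers it touches. The names are hypothetical.

```c
/* Assumption: the ESP32 cores have 64 physical address registers. */
#define XTENSA_NUM_AREGS 64u

/* Hypothetical helper: upper bound on the number of nested frames that fit in
 * the register file when every call rotates the window by call_increment
 * registers (4 for call4/callx4, 8 for call8, 12 for call12). */
static inline unsigned max_window_frames(unsigned call_increment)
{
    return XTENSA_NUM_AREGS / call_increment;
}
```

Under these assumptions, a chain of call8 calls would run out of registers after at most 8 frames, and a chain of call4 calls after at most 16.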
Conclusion
Yes, it works! But this approach is only useful for not-too-complex tasks. The good thing is that no other part of the system knows the core is running, so you can be certain it does not have any undesired side effects.