-
Adding SCPI support
11/03/2025 at 16:42

Support for SCPI is the last big feature remaining in this project (save for a lot of cleanup and then, of course, actually testing it in the field).
SCPI is a text-based interfacing standard that instrumentation devices use to communicate with each other. From a syntactical perspective, it's pretty simple: A device exposes a hierarchical set of commands that can be used to either interrogate the state of one or more systems, or to change it.
For example, a programmable power supply may expose a “MEASure” command hierarchy, with a “VOLTage” subsection. If you wanted to know the voltage on channel 1, you could then interrogate the device by sending a command that looks like this:
MEAS:VOLT:1
And the device might respond with something like
2.35
As you can see, commands can be declared with optional suffixes; thus, “MEASure” can be specified either as “MEAS” or as the full “MEASURE” (but nothing in between).
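For illustration, here is one way the short/long mnemonic rule can be checked. This is a hypothetical standalone helper, not the IC's actual parser: it takes the conventional mixed-case form (e.g. "MEASure"), where the uppercase prefix is the mandatory short form, and accepts only the short form or the complete long form, case-insensitively.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not part of the IC library). The uppercase portion
// of `mnemonic` is the required short form; a token is valid only if it
// equals the short form or the full long form, case-insensitively.
static bool matchesMnemonic(const std::string& token, const std::string& mnemonic) {
    std::string shortForm, longForm;
    for (char c : mnemonic) {
        if (std::isupper(static_cast<unsigned char>(c))) shortForm += c;
        longForm += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    std::string upper;
    for (char c : token) upper += static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return upper == shortForm || upper == longForm;
}
```

With this rule, "meas" and "measure" both match "MEASure", while "measur" does not.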
The main challenge working with SCPI is that it is a very loose standard, and, as a result, it is unevenly implemented by different vendors.
Our implementation
In our case, we're going to provide an interpreter generator that allows us to define our own command hierarchy, and an interpreter that can be used at runtime to parse requests sent over USBTMC.
The system uses a set of tries to store the command hierarchy. This is an efficient mechanism that requires relatively little memory and allows for a very simple runtime implementation. The command hierarchy is specified in a YAML file and can be completely arbitrary, so long as it conforms to the SCPI standard.
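The post doesn't show the YAML schema itself, so the following is a purely hypothetical sketch of what a trie-friendly command hierarchy could look like; all keys, handler names, and parameter types here are invented:

```yaml
# Purely hypothetical sketch -- the real schema is whatever the IC's
# generator expects, and these handler names are invented.
MEASure:
  VOLTage:
    handler: onMeasureVoltage
    parameters:
      - type: number        # e.g. the channel to read
SYSTem:
  ERRor:
    handler: onSystemError
```

The nesting mirrors the command tree directly, which is what makes a trie such a natural storage structure for it.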
At runtime, the interpreter accumulates input until it either recognizes a valid command or detects a syntax error. In the case of valid input, it then calls a user-defined handler that can perform the tasks associated with a command.
The interpreter also parses all the parameters and validates them according to their types; this makes writing handlers a bit easier, since you don't have to worry about checking the parameter count or making sure that, say, a number isn't malformed. Of course, you are still responsible for the semantic validation of the data.
Integration
Adding the interpreter to an IC application is very simple. The SCPI library is already linked into the overall IC library; therefore, all you need to do is create a YAML file for your definitions, instantiate an interpreter, and override the _onUSBTMCDataReceived() method in your application code.
For more information, check out the add-scpi branch, which is now merged into main.
-
Adding USB support
10/31/2025 at 20:27

The Pico SDK provides a convenient built-in USB interface implementation that offers both a CDC serial endpoint and a custom vendor endpoint that can be used to reset and program a board without having to enter DFU mode (which, particularly on the official Raspberry Pi Pico boards, requires unplugging the USB cable and plugging it back in).
Unfortunately, this isn't good enough for measurement and instrumentation devices; although it's possible to use serial-over-USB to communicate between devices or between a device and a controller, the most widely-implemented standard (called VISA) requires the use of a different interface protocol called USBTMC (USB Test and Measurement Class).
Luckily, TinyUSB supports USBTMC out of the box, and therefore we don't have to go through the tedious process of defining our interface. Less luckily, this means that we can't simply use the Pico SDK's built-in USB interface implementation, and we must instead dive deep into the bowels of TinyUSB itself.
A 30-second USB primer
First, a bit about the USB protocol (but just a little bit, because the topic is really complex—this is just a very high-level overview, and I have skipped over a lot of information).
At its most basic, USB 2.0 is a half-duplex serial protocol; from a physical viewpoint, there is only a single stream of data, and either the host or the device talks over it at any one point in time. All communication is controlled by the host, which tells the device when to listen and when to talk; this ensures that the timing of the transmissions is very predictable, and tends to make devices easier to implement. (It also means that there is no physical out-of-band mechanism for the device to interrupt the host; interrupts are instead handled through polling.)
Each device can define one or more logical interfaces that each belong to a specific class. Classes, in turn, describe the purpose of the interface: CDC for emulating a serial port, HID for keyboards and mice, TMC for test and measurement instrumentation, and so forth. The vendor class, which the Pico SDK uses for its reset interface, is a catch-all of sorts, and can be used for arbitrary communication with a device whose functionality doesn't fall neatly within any of the pre-defined classes. Using the right class for a device is important, because its nature provides an important clue to the host operating system as to how it should work with the device, and typically determines which drivers should be loaded and enabled to deal with it.
Devices, in turn, expose one or more endpoints, which describe virtual pipes through which specific kinds of data flow. For example, control endpoints are used to exchange small amounts of data with predictable timing, whereas bulk endpoints are meant to send or receive large amounts of information without any timing guarantees, and so forth.
Implementing a custom TinyUSB stack
In the Pico SDK, as is the case with many embedded systems, USB functionality is provided by TinyUSB, an open-source library designed to provide end-to-end support for both low- and high-level interaction between hosts and devices. TinyUSB supports a pretty wide range of platforms, and, as a result, tends to be written for the lowest common denominator, using only static memory allocation, as well as minimalistic data structures and code structure.
This makes working with it sometimes difficult, especially at the beginning, partly because of the underlying complexity of the USB protocol, with its decades of caked-on incremental functionality layers, and partly because its code is dense and not entirely well commented.
Still, once you get the lay of the land, it's not that hard to build a completely custom stack that does what you need it to. (It is unfortunate, though, that—unless I really missed something obvious—the Pico SDK team didn't think of using a more modular approach when they built their stack.) And, to be clear, having TinyUSB is much preferable to the alternative, because writing the entire USB stack from scratch would be a pretty harrowing prospect.
Anyway, on to the actual implementation. The first step is to define a series of data structures that “describe” the various interfaces that we intend to support and their endpoints. This is probably one of the hardest parts of the project, because you need to understand how the various USB interface classes are defined, and then find out how those definitions are actually handled by TinyUSB. Luckily, the library comes with a large number of examples, so the problem is not completely intractable.
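To give a flavor of what these data structures look like, here is the standard device descriptor from the USB 2.0 specification (section 9.6.1), written out as a standalone C++ struct. TinyUSB ships its own equivalent type (tusb_desc_device_t); this copy exists only for illustration and isn't the project's actual code:

```cpp
#include <cstdint>

// Standard USB device descriptor layout, per the USB 2.0 specification.
// The packed attribute matters: the descriptor is sent over the wire
// byte-for-byte, so the struct must be exactly 18 bytes with no padding.
struct __attribute__((packed)) DeviceDescriptor {
    uint8_t  bLength;            // size of this descriptor (18 bytes)
    uint8_t  bDescriptorType;    // 0x01 = DEVICE
    uint16_t bcdUSB;             // USB spec release, e.g. 0x0200 for USB 2.0
    uint8_t  bDeviceClass;       // 0x00 = class defined per interface
    uint8_t  bDeviceSubClass;
    uint8_t  bDeviceProtocol;
    uint8_t  bMaxPacketSize0;    // max packet size for endpoint 0
    uint16_t idVendor;           // vendor ID assigned by the USB-IF
    uint16_t idProduct;          // product ID assigned by the vendor
    uint16_t bcdDevice;          // device release number
    uint8_t  iManufacturer;      // string descriptor indices
    uint8_t  iProduct;
    uint8_t  iSerialNumber;
    uint8_t  bNumConfigurations;
};
```

Every interface and endpoint gets a similar fixed-layout descriptor, and the host walks them at enumeration time to decide which drivers to load.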
Next, we need to intercept a number of callbacks that TinyUSB makes into our code in response to various USB-related events, such as when a message is received or the interface is ready to send data. It's important to note that, because TinyUSB uses statically-allocated memory, it's up to us to split large data chunks into smaller ones that can fit in its buffers.
Finally, we can build our own high-level interface that ties everything together and allows our code to provide the functionality we need to communicate over the various interfaces. Since we use FreeRTOS, we also provide a task whose job it is to continuously call the tud_task() function, which encapsulates TinyUSB's main loop.
Configuration and WebUSB
Since we're building our own stack, we can afford to make it highly configurable. In its current incarnation, our interface class allows us to change the vendor and product IDs, as well as the vendor and product name strings. “Real” vendor and product IDs can be obtained by becoming a member of the USB Implementers Forum (sadly, at the cost of $5,000), or by applying for a free one from the Raspberry Pi foundation.
Since we provide our own vendor interface, something else that we can add is support for WebUSB's capabilities descriptor. This allows us to specify the URL of a landing page to which compatible browsers (Chrome and other Chromium-based browsers) will automatically direct the user whenever our device is plugged into the USB port of a computer. This can be very handy to direct users to additional information or a Web-based application that can be used in conjunction with our hardware.
You will find the code for our USB interface in the add-usb-support branch. It's baked right into the application template, which means that, under normal circumstances, you won't have to worry about adding it to your code.
-
Going modular
10/29/2025 at 19:02

The IC is designed to act as an almost plug-and-play solution for building applications, but so far I've been writing code that is far from modular.
That changes with today's PR, which restructures the project so that each component is encapsulated into a separate library, and introduces a new mechanism for easily spinning up a new application using sensible defaults.
Libraries… libraries everywhere
In order to feel familiar to someone who is used to working with the Pico SDK, it makes sense for the IC to use similar paradigms. Therefore, I rewrote both the memory wrappers and the safety manager so that they can be included directly in a project through the CMake configuration.
The memory wrappers are now part of a library called t76_ic_memory, while the safety manager is now called t76_ic_safety. In both cases, the libraries expose their include directories so that your code will see them as under the t76 directory (e.g.: #include <t76/memory.hpp>), much like the Pico SDK uses virtual paths as a way to namespace its include files.
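In CMake terms, pulling the libraries into an application might look like the following; the t76_ic_memory and t76_ic_safety target names come from the post, while my_app and the pico_stdlib pairing are placeholders for illustration:

```cmake
# my_app is a placeholder for your executable target.
target_link_libraries(my_app
    pico_stdlib
    t76_ic_memory    # memory wrappers  ->  #include <t76/memory.hpp>
    t76_ic_safety    # safety manager   ->  #include <t76/safety.hpp>
)
```

Because the libraries export their include directories, no extra include_directories() calls should be needed.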
A template to make things easier
The current application lifecycle is pretty complex, especially when it comes to the safety system; at startup, the library needs to initialize all the safeable components, figure out whether a fault occurred, and only then spin up tasks for both cores.
In addition, the code required to manage FreeRTOS is currently part of the test code that I use to validate the libraries, which means that starting a new project would require a lot of manual work.
To simplify things, I've added a new application base class, called T76::Sys::App, that takes care of most of the boilerplate and makes creating a new firmware project much easier.
This is not quite the final form of the library; eventually, I will rework the code so that the Git repository can simply be added to a project as a submodule, since this will make it much easier to capture any future changes to the IC in downstream code. For now, however, I've kept the current approach of having a test Pico SDK project in the root of the repository, as this makes testing easier.
You can find the latest changes in the modularize branch (and, of course, in the main branch, since I have merged them in).
-
Designing a safety system
10/26/2025 at 13:32

When an unexpected or unrecoverable fault occurs, the worst outcome is for the system to hang. That’s not only a poor user experience; it can also create unsafe conditions. The goal is to ensure that, even in failure, the system transitions into a safe state that minimizes risk to people, property, and the device itself.
The Instrument Core includes a Safety System designed with this in mind. I say “designed” deliberately, because some failure modes lie beyond what software alone can address. For instance, if the RP2350 suffers a hardware fault, it may no longer be capable of running any code at all. True safety, therefore, must extend beyond software into the physical design of the system itself.
Scope of the Safety System
There’s a lot we can do in code. Our Safety System is designed to handle the following situations:
- Unexpected runtime conditions, like memory corruption, heap and stack overflows, panics, etc., that occur outside of the system's normal execution flow.
- Hangs and crashes where one core stops running or tasks become starved of CPU time to the point that they no longer function properly.
- Unrecoverable errors triggered when the system encounters a condition incompatible with proper operation (for example, an external device failing to initialize or ceasing to respond).
In all these cases, faults should be captured by a central handler that immediately places the entire system in a known-safe state, gathers diagnostic information (e.g., fault location, cause, and system state at the time), and then performs a controlled reboot to attempt recovery.
However, that alone isn’t sufficient. If we allow unlimited reboots, a persistent fault could create an infinite restart loop, potentially causing further damage to the device or connected components. To prevent this, the system must track reboot counts and enter a lockout state once a defined limit is reached, requiring a physical power cycle (ideally performed after the underlying fault has been addressed).
Finally, if the system operates normally after a reboot for a period long enough to ensure that any two failures are unrelated, the Safety System should reset the reboot counter to avoid accidental lockouts caused by isolated or transient errors.
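The reboot-accounting rules above can be sketched as follows. Note that the names, the threshold, and the structure are all assumptions for illustration, not the IC's actual implementation; in practice the counter would live in RAM that survives a reset:

```cpp
#include <cstdint>

// Sketch of the reboot-accounting logic described above (names and the
// threshold are assumptions). In the real system the counter would be
// stored in reset-surviving RAM.
constexpr uint8_t kMaxFaultReboots = 3;

struct RebootTracker {
    uint8_t faultReboots = 0;   // persisted across resets in practice

    // Called from the fault handler just before the controlled reboot.
    // Returns true if the device should enter lockout instead.
    bool recordFaultReboot() {
        if (faultReboots >= kMaxFaultReboots) return true;
        ++faultReboots;
        return false;
    }

    // Called once the system has run normally long enough that a new
    // fault would be considered unrelated to the previous one.
    void markStableOperation() { faultReboots = 0; }
};
```

With a limit of three, the first three fault reboots proceed normally and the fourth triggers lockout, unless a stretch of stable operation clears the counter in between.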
Trapping system faults
As you can see, there’s more to the Safety System than meets the eye. The Instrument Core begins by trapping various SDK and FreeRTOS calls that signal anomalous conditions, including:
- vApplicationMallocFailedHook() — triggered on FreeRTOS memory allocation failures. Since all memory allocation is delegated to FreeRTOS, this also covers malloc() and new/delete.
- vApplicationStackOverflowHook() — invoked when a stack overflow occurs.
- isr_hardfault() — called when a hard fault occurs at the Cortex-M33 core level. There are also more specific fault handlers (isr_memmanage(), isr_busfault(), and isr_usagefault()), which fall back to the hard fault handler if not implemented.
- assert() — both from standard C code and FreeRTOS internals.
- abort() — the traditional C/C++ function for unrecoverable conditions.
When any of these hooks are triggered, they invoke the Safety System with details about the nature of the fault. The Safety System then captures as much diagnostic data as possible, such as task name, stack contents, heap state, and so on, and stores it in a persistent area of RAM before issuing a controlled reset.
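As an illustration of the pattern, here is a minimal, self-contained sketch of how one of these hooks can funnel into a central fault handler. The hook name and signature are the ones FreeRTOS actually calls; the fault sink is a stand-in for the Safety System's real entry point:

```cpp
#include <string>

// Stand-in for the persistent fault record; the real Safety System stores
// this in a reset-surviving RAM area along with diagnostics.
static std::string g_lastFault;

// Stand-in for the Safety System's central fault handler. The real handler
// would also snapshot diagnostics (task name, stack, heap state) and then
// issue a controlled reset.
static void recordFault(const char* reason) {
    g_lastFault = reason;
}

// FreeRTOS calls this hook when pvPortMalloc() fails; since all allocation
// is delegated to FreeRTOS, this covers malloc() and new/delete too.
extern "C" void vApplicationMallocFailedHook(void) {
    recordFault("FreeRTOS heap allocation failed");
}
```

The other hooks (stack overflow, hard fault, assert, abort) follow the same shape: gather whatever context is available, hand it to the central handler, and never return.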
Dealing with hangs and crashes
The RP2350 includes a hardware watchdog that automatically reboots the CPU unless it’s “fed” periodically. The Instrument Core’s Safety System uses this feature to handle cases where either core hangs or enters an infinite loop.
On core 0, a low-priority FreeRTOS task is created whose sole responsibility is to feed the hardware watchdog. If any higher-priority task hangs, or if the system becomes so overloaded that the feeder task can’t run often enough, the watchdog will expire and trigger a reboot.
Since there’s only one hardware watchdog, and core 1 runs outside FreeRTOS, it must feed a virtual watchdog managed by the feeder task on core 0. If core 1 stops responding, the feeder task will detect the missed updates, stop feeding the hardware watchdog, and once again allow a reset to occur.
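A toy model of the feeder's decision logic might look like this; all names and the missed-check threshold are assumptions, and the actual hardware-watchdog interaction is omitted:

```cpp
#include <cstdint>

// Sketch of the feeder-task logic described above. Core 1 periodically
// bumps a shared heartbeat counter; the feeder task on core 0 feeds the
// hardware watchdog only while that counter keeps moving.
struct WatchdogFeeder {
    uint32_t lastCore1Tick = 0;   // last heartbeat value seen from core 1
    uint32_t missedChecks  = 0;
    static constexpr uint32_t kMaxMissed = 3;   // threshold is an assumption

    // Called on each feeder-task iteration with core 1's current heartbeat.
    // Returns true if the hardware watchdog should be fed.
    bool shouldFeed(uint32_t core1Tick) {
        if (core1Tick != lastCore1Tick) {
            lastCore1Tick = core1Tick;
            missedChecks = 0;
        } else if (++missedChecks >= kMaxMissed) {
            return false;   // core 1 looks hung: let the watchdog expire
        }
        return true;
    }
};
```

Starving the feeder task itself has the same effect: nothing feeds the hardware watchdog, and the reset fires anyway.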
Application faults and reporting
The safety system provides a macro, T76_ASSERT(), which triggers a fault if a given expression evaluates to false. It functions like a standard assert(), except it’s always active, regardless of whether the build is in debug or release mode.
All fault mechanisms ultimately invoke T76::Sys::Safety::handleFault(), which records diagnostic data in uninitialized RAM; this area survives a reboot because it isn't cleared by the bootloader. The reporting mechanism relies only on statically allocated memory and minimizes stack usage. This increases baseline memory consumption slightly, but gives the safety system a fighting chance to operate even if the heap or stack has been corrupted by a rogue process.
In the worst-case scenario—where the reporting mechanism itself fails, resulting in a double fault—either the hard-fault handler or, as a last resort, the hardware watchdog will still restart the system. The downside is that you may lose visibility into the original fault, but the device will at least avoid entering an unsafe state.
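An always-active assertion macro of this kind can be sketched as follows. SKETCH_ASSERT and faultSink are illustrative stand-ins, not the IC's actual code; the real T76_ASSERT() reports into the Safety System:

```cpp
#include <string>

// Stand-in fault record so the macro's behavior is observable here.
static std::string g_faultInfo;

// Stand-in for the Safety System's fault entry point; the real handler
// would snapshot diagnostics and reset the device.
static void faultSink(const char* expr, const char* file, int line) {
    g_faultInfo = std::string(expr) + " @ " + file + ":" + std::to_string(line);
}

// Unlike assert(), this is never compiled out: it fires in both debug
// and release builds.
#define SKETCH_ASSERT(expr) \
    do { if (!(expr)) faultSink(#expr, __FILE__, __LINE__); } while (0)
```

Capturing the stringified expression plus file and line is what makes the post-mortem fault log actionable without a debugger attached.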
System safing
By itself, the safety system knows nothing about the specific hardware or implementation details of the device it runs on—and therefore, it has no inherent knowledge of how to render that device safe either.
To address this, the system defines the T76::Sys::Safety::SafeableComponent interface. Any class that implements this interface must define two methods: activate() and makeSafe(), and must register itself by calling T76::Sys::Safety::registerSafeableComponent() within its constructor.
By convention, components are assumed to start in a safe, inactive state and will not enter operational mode unless activate() is explicitly called. Once active, invoking makeSafe() should return the component to its disabled, safe condition.
The system does not guarantee the order in which components are activated or made safe. To manage dependencies, you can group components under a parent object and delegate activation and safing order from there.
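Based on the description above, the interface might be sketched like this. The interface shape and the method and registration names come from the post (minus their T76::Sys::Safety namespace); the registry and the example component are illustrative stand-ins:

```cpp
#include <vector>

// Sketch of the SafeableComponent pattern described above.
struct SafeableComponent {
    virtual ~SafeableComponent() = default;
    virtual bool activate() = 0;   // leave the safe state; false = fault
    virtual void makeSafe() = 0;   // return to the disabled, safe state
};

// Stand-in registry; the real one lives in the Safety System.
static std::vector<SafeableComponent*> g_components;

void registerSafeableComponent(SafeableComponent* c) {
    g_components.push_back(c);
}

// Example: a hypothetical output driver that is safe while disabled.
// Per convention, it starts inactive and registers itself on construction.
struct OutputDriver : SafeableComponent {
    bool enabled = false;
    OutputDriver() { registerSafeableComponent(this); }
    bool activate() override { enabled = true; return true; }
    void makeSafe() override { enabled = false; }
};
```

At startup, the Safety System can then walk the registry and call makeSafe() on everything before deciding whether activation is allowed.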
Bringing it all together: The startup process
It’s easy to assume that handling a fault when it occurs is enough to make a system safe. In practice, though, a serious fault can leave the environment in such a corrupted state that even running recovery code may be impossible: the heap might be trashed, the stack exhausted, or critical memory regions overwritten.
While the fault handlers do attempt to render the system safe before rebooting, most of the safety logic actually runs at startup, inside T76::Sys::Safety::safetyInit(). To ensure reliable behavior, you should avoid dynamically allocated components (especially those tied to safety-critical functions) since their existence cannot be guaranteed at all times.
At startup, the logic first forces a call to makeSafe() on all registered components. This “just-in-case” step ensures that nothing remains in an unsafe state. This approach assumes all components are statically allocated, which is generally good practice in embedded systems.
Next, the system checks whether the reboot was triggered by the hardware watchdog. If so, no handler could have run before the reset, so the system records the event as a hardware fault.
Before activation, the system verifies whether the device has exceeded its allowed number of fault reboots. If it has, it enters lockout mode to prevent further restarts.
Finally, the Safety System calls activate() on all registered components. If any component fails to activate (i.e., returns false), the system treats that as a new fault and triggers another controlled reboot.
Lockout mode: never miss a fault again
One thing that absolutely drives me nuts about embedded development is how hard it can be to figure out what actually caused a fault. Even with a debugger or serial cable hooked up (which, I’ll admit, I’m usually too lazy to use unless absolutely necessary), you need to be watching the processor at exactly the right moment to catch the failure in action.
This gets even trickier when using USB as the serial interface: TinyUSB typically doesn’t start until fairly late in the boot process, and even then it depends on FreeRTOS being up and running before it can function properly.
To mitigate this, the Safety System includes a minimal diagnostic stack that runs whenever the device enters lockout mode. This stack is intentionally simple: it spawns one task to manage CDC-over-USB and another that continuously outputs a complete log of all recorded faults leading up to the lockout.
It’s not a replacement for a proper debugger, but it’s a huge help when investigating faults after the fact. The main limitation, of course, is that it only runs once the system enters lockout mode, meaning you can’t use it to inspect one-off or transient faults. Those are still recorded internally by the Safety System, though; we'll make them accessible once we add SCPI support to the IC.
Limitations
As you can see, there’s a lot happening within the Safety System. But as I mentioned at the start of this post, using it alone is nowhere near enough to make your device truly safe. There are countless reasons why a device’s MCU might never execute a single line of code; that means safe design begins with hardware and must be part of your development philosophy from day one.
Designing systems that are safe by default is well beyond the scope of this project. Hopefully, though, the safety system can help you start your design on the right foot and build a stronger foundation for reliability.
You can find the code for the safety system in the add-safety-mechanisms branch of the code repository.
-
Managing cross-core memory allocation
10/17/2025 at 14:26

The Pico SDK’s memory allocation routines are fully reentrant and can normally be used safely from multiple cores without worrying about race conditions. Once FreeRTOS enters the picture, however, things get more complicated.
Delegating memory management to FreeRTOS
FreeRTOS maintains its own heap and allocation routines (pvPortMalloc() and pvPortFree()), offering several heap management strategies. The most common is Heap 4, which minimizes fragmentation to reduce long-term out-of-memory errors.
Typically, the system reserves a fixed heap size at startup (via configTOTAL_HEAP_SIZE), and multitasking code then uses the pvPort* routines to allocate and free memory.
In practice, this creates two separate heaps—one managed by the SDK, the other by FreeRTOS—introducing unnecessary complexity. You must now predict and manage how much memory belongs to each heap, and deal with fragmentation and hard-to-debug crashes that can result from it.
My preferred approach is to delegate all memory management to FreeRTOS by overriding the system’s default allocation functions (malloc, free, new, delete) and redirecting them to their pvPort* equivalents.
This keeps memory handling consistent and lets us take advantage of FreeRTOS’s runtime diagnostics for heap integrity and allocation failure handling.
Configuring the Pico SDK
To implement this, add the following settings to your project’s CMakeLists.txt:
- SKIP_PICO_MALLOC — Prevents the SDK from defining its own malloc() and free() wrappers, allowing you to supply your own.
- PICO_CXX_DISABLE_ALLOCATION_OVERRIDES — Disables the SDK’s default C++ new and delete operators, enabling you to override them.
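As a rough illustration, these switches might be applied like this — the exact mechanism can vary between Pico SDK versions, so treat this only as a sketch and check your SDK's documentation:

```cmake
# Illustration only -- verify against your Pico SDK version.
set(SKIP_PICO_MALLOC 1)  # drop the SDK's malloc()/free() wrappers

target_compile_definitions(my_app PRIVATE   # my_app is a placeholder target
    PICO_CXX_DISABLE_ALLOCATION_OVERRIDES=1 # drop the SDK's new/delete
)
```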
We can then write our custom allocation wrappers, which you will find in lib/sys/memory.cpp.
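A heavily simplified sketch of what such wrappers can look like follows. The real code lives in lib/sys/memory.cpp; here the pvPort* functions are stubbed with the C library allocator so the example is self-contained, error handling is elided, and the array forms of new/delete are omitted:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Stubs standing in for FreeRTOS's allocator so this sketch compiles on
// its own; on target these are the real FreeRTOS heap functions.
extern "C" void* pvPortMalloc(size_t n) { return std::malloc(n); }
extern "C" void  vPortFree(void* p)     { std::free(p); }

// Route C++ allocation through the (stubbed) FreeRTOS heap. The real
// wrapper would report an allocation failure to the Safety System
// instead of returning a null pointer.
void* operator new(std::size_t n) {
    return pvPortMalloc(n ? n : 1);
}

void operator delete(void* p) noexcept {
    vPortFree(p);
}
```

With SKIP_PICO_MALLOC in place, malloc() and free() get the same treatment, so every allocation in the firmware ends up in a single FreeRTOS-managed heap.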
A bare-metal monkey wrench
In this setup, core 1 runs bare-metal code outside FreeRTOS’s control, which creates a problem: the FreeRTOS heap manager has no awareness of memory operations performed on that core.
For example, pvPortMalloc() guards the heap with a critical section to prevent concurrent access from other FreeRTOS tasks. But since core 1 operates outside the OS, those protections don’t apply. If core 1 modifies the heap while FreeRTOS is active on core 0, the system will eventually crash—often in ways that are difficult to reproduce due to timing interactions between the cores.
Possible solutions
There are probably several solutions to this problem, but I landed on a couple different ones:
- Static allocation on core 1
If core 1 is used exclusively for deterministic, time-critical tasks, it likely doesn’t need dynamic memory. In this case, simply avoid heap use altogether on that core. No race protection is required.
- Delegated allocation via T76_USE_GLOBAL_LOCKS
If this is not acceptable, the T76_USE_GLOBAL_LOCKS macro causes the memory wrappers to spawn a core 0 task that listens for allocation commands on a bare-metal queue. Core 1 then sends its allocation requests to this task, ensuring that all heap access occurs under FreeRTOS supervision.
T76_USE_GLOBAL_LOCKS is primarily designed for circumstances where we want to be able to allocate memory at startup, when the performance hit on core 1 operations is not important (although note that allocations will block on core 1 until the FreeRTOS scheduler starts running). Core 0 performance will remain relatively unaffected, since memory operations performed inside it will just translate into direct calls to pvPort* functions.
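The mechanism can be modeled in miniature like this. Everything here is a stand-in: the real implementation uses a bare-metal inter-core queue, calls pvPortMalloc() from a FreeRTOS task on core 0, and has core 1 block until its request completes:

```cpp
#include <cstddef>
#include <cstdlib>
#include <deque>

// Toy model of the T76_USE_GLOBAL_LOCKS idea: core 1 posts allocation
// requests and waits; a core 0 task services them under the RTOS.
struct AllocRequest {
    size_t size;             // number of bytes requested
    void*  result = nullptr;
    bool   done   = false;   // core 1 spins on this flag
};

// std::deque stands in for the real bare-metal inter-core queue.
static std::deque<AllocRequest*> g_queue;

// Core 1 side: enqueue a request. On target this blocks until the
// scheduler is running and the core 0 task has replied.
void postRequest(AllocRequest* req) { g_queue.push_back(req); }

// Core 0 side: the servicing task drains the queue. The real version
// calls pvPortMalloc() here, under FreeRTOS's heap protection.
void serviceRequests() {
    while (!g_queue.empty()) {
        AllocRequest* req = g_queue.front();
        g_queue.pop_front();
        req->result = std::malloc(req->size);
        req->done = true;
    }
}
```

Because every heap touch happens inside the core 0 task, FreeRTOS's critical sections are always in effect, at the cost of making core 1 allocations slow and blocking.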
You will find these memory additions in this pull request.
-
Enabling FreeRTOS on the RP2350
10/16/2025 at 02:38

Adding FreeRTOS support to IC is relatively simple. First, we add RPi's own branch of FreeRTOS as an external submodule to the repository:
git submodule add https://github.com/raspberrypi/FreeRTOS-Kernel.git

Next, we copy a few files from the FreeRTOS source into the project's main directory; these include a CMake config file, a C include file that provides the actual configuration for FreeRTOS, and a couple small C source files that provide some functionality required by FreeRTOS to run.
We must also define a vApplicationStackOverflowHook() function that is called by FreeRTOS when a task overflows its stack. Right now, our implementation (in freertos/rp2350.c) simply asserts and crashes the core, but in a future iteration we will use it to call our exception handler to give the code an opportunity to place our device in a safe mode before resetting or halting.
FreeRTOS configuration
The stock configuration that comes in the FreeRTOS source sets up the operating system with multi-core support and allows all tasks to run on either core. We, however, want FreeRTOS to only run on core 0 while reserving core 1 for our critical tasks. Therefore, we change the configuration so that FreeRTOS runs in single-core mode:
#define configNUMBER_OF_CORES         1
#define configNUM_CORES               configNUMBER_OF_CORES
#define configTICK_CORE               0
#define configRUN_MULTIPLE_PRIORITIES 1

/* SMP Related config. */
#define configUSE_CORE_AFFINITY       0
#define configUSE_PASSIVE_IDLE_HOOK   0
#define portSUPPORT_SMP               0

Making TinyUSB work
Interestingly, this configuration causes TinyUSB to stop working; I haven't really figured out why this is the case, but my bet is that, by default, the SDK attempts to run the TinyUSB task on core 1 if it detects FreeRTOS; since we do not support SMP, this approach fails.
The fix to this problem is pretty simple: We just provide our own TinyUSB task and call tud_task() directly. Eventually, when we add more substantial USB support, we'll move this code into its own class, but for now it's just a simple function inside main.cpp.
Putting it all together
The new version of the source demonstrates that everything works by spawning three tasks:
- A core 1 task that outputs some text; this demonstrates that code is running on both cores.
- A core 0 task that calls TinyUSB.
- A second core 0 task that does the same, interleaving with the core 1 task; this demonstrates that FreeRTOS is multitasking successfully.
Marco Tabini