-
And the projects come and go, talking of Michelangelo...
05/17/2018 at 16:14 • 0 comments
Summary
Time to Rise and Resurrect work on this project after it languished for a few months. When I left off, I was having a Devil of a time getting eLua to work with absolute stability on my NetDuino/STM32Workbench/FreeRTOS platform. I am taking a step back (for now at least), and trying a straight port of Lua 5.3.4.
Deets
When I left off, I had eLua somewhat working, however it had a tendency to crash all the time. I know it must be something dumb that I am doing, but I really have to say it should be simple to do a platform adaptation, and I guess for some reason for me in this case it is not. Again, 'it probably is just me', but I'm tired of reverse-engineering the code to try to guess what is going on. So, I'm taking a step back to the beginning, and porting Lua 5.3.4.
Ultimately, I wanted 5.3.4, anyway, and eLua (as with other projects like LuaJIT) is stuck in the past with 5.1.x. I'm not sure why, but it is pretty common. I mainly want 5.3 because it has bitwise operators, and why not be on the latest if you're doing new work. The upside is that I have already integrated 5.3 in other (desktop) projects, so I am familiar with it, and the downside is that it does not have the eLua optimizations for memory constrained devices -- the so-called "Lua Tiny RAM" patch. 'So-called' because I don't see a patch file anywhere. Rather there is just the resulting patched source. Anyway, I've decided for the moment that I'll save analysing what that is and making a proper 'patch' to the 5.3 source as a future exercise. For now, I want an unconditionally stable Lua running in FreeRTOS with the STM32Workbench toolchain on the NetDuino board.
So, I made a new branch, 'elua002' (named before I decided to abort elua for now), and transferred over my memory manager and wired in the Lua 5.3.4 source. I modified the 'lua.c' which contains a 'main()' that needs some minor changes so as to be invoked from a FreeRTOS task, and built. I didn't bother trying to run it since I know it is not going to work without more support.
The STM32Workbench toolchain uses newlib-nano. Newlib expects to have a bottom edge ('syscalls' and 'stubs') realized that provides platform-specific implementation for things like open(), close(), fork(), kill(), etc. This makes some sense, but to my great vexation, the STM32Workbench build of newlib-nano has some sort of default implementation of those lower-edge functions. What do they do? I can tell you what they don't do: they don't interface with my board's serial ports or filesystem implementation. So I gots some works to dos. But what, precisely?
My first attempt was to #define myself out of reality. There is a master header 'lua.h' that all the lua modules include, and it has a section marked at the end stating 'here be your dragons', so I made a 'redirection' header which #defined all the stdio/stdlib stuff that I thought I would need into my own implementations, cleverly named with a prefix of 'my-'. This was a fairly surgical change, and that aesthetic appealed to me. However, studying the resulting map file showed that some libc things were still getting hoisted in, so I ditched this approach. I probably could have continued analyzing it and got it working, but it gave me the heebie-jeebies as far as being unconditionally stable in the case of evolving more code. So I gave this up, and went back to bite the bullet by implementing the newlib bottom-edge functions properly. This should be unconditionally stable, though potentially more, or more awkward, work.
<soapbox>
- I would have preferred that the vendor (STM?) had not built newlib with the default implementation. This would have caused linker errors that would make it very obvious as to what I needed to implement.
- I would have preferred that '-specs=nosys.specs' linker switch would have indeed suppressed that implementation, again causing linker errors that spell out what I need to implement.
- I would have preferred that the Newlib authors provide a header of some sort that declares all the many methods of the lower-edge that need to be implemented, so that at least I had a checklist, and all the pertinent function signatures.
- I would have preferred that the newlib documentation:
https://sourceware.org/newlib/libc.html#Syscalls
which almost provides such a checklist, albeit not machine-readable, at least had function signatures that were the actual ones needed by the libc implementation.
I would have preferred many things, but we get what we get, and we have to carry on. Sometimes we can leave things better than we found them. Anyway, I collated all the data in the newlib doc mention, and fixed the pertinent function signatures, and included what was required for the definitions, and turned it into a header 'newlibsupport.h'. Now I can at least use this in future projects using newlib.
For my next amazing feat, I produced a 'stub' implementation 'newlibsupport.c' which implements all those methods, but deliberately has that implementation reference a method 'NEEDS_IMPL()' which is specifically not implemented. This lets me first prove upon attempted linkage that the linker is trying to use my implementations instead of the defaults unfortunately baked into newlib, and also have a skeleton implementation with a 'your code here' sign posted.
De gustibus non est disputandum? Perhaps.
</soapbox>
Anyway, that being out of the way I did validate many sensible things need to be implemented; e.g. all the open/close/read/write methods that I was fully expecting, but also seek(?)/stat(?)/link(?)/unlink(?)/exit(?) need to be as well. So I stubbed those for the moment to get a link to succeed, and now I am off to implementing that stuff.
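As a rough sketch of that 'deliberately unimplemented' stub trick (this is not my actual newlibsupport.c): the underscore-prefixed names and signatures below are assumed from common newlib ports and should be checked against your toolchain, and the STUB_PROVE_LINKAGE switch is a hypothetical addition so the same file can either force the link failure or link with stubs that simply fail.

```c
#include <errno.h>

/* Hypothetical switch: define STUB_PROVE_LINKAGE to leave NEEDS_IMPL()
 * undefined on purpose. The link then fails inside this file, which shows
 * the linker picked these functions over the library defaults.
 * Without the switch, every stub just fails with ENOSYS so the image links. */
#ifdef STUB_PROVE_LINKAGE
extern int NEEDS_IMPL(void);            /* declared, never defined */
#else
static int NEEDS_IMPL(void) { errno = ENOSYS; return -1; }
#endif

/* Assumed names/signatures (typical for arm-none-eabi newlib ports). */
int _read(int file, char *ptr, int len)
{
    (void)file; (void)ptr; (void)len;
    return NEEDS_IMPL();                /* your code here */
}

int _write(int file, char *ptr, int len)
{
    (void)file; (void)ptr; (void)len;
    return NEEDS_IMPL();                /* your code here */
}

int _close(int file)
{
    (void)file;
    return NEEDS_IMPL();                /* your code here */
}

int _lseek(int file, int offset, int whence)
{
    (void)file; (void)offset; (void)whence;
    return NEEDS_IMPL();                /* your code here */
}
```

Each stub gets replaced by a real implementation one at a time; whatever is left keeps failing loudly instead of silently doing nothing.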
It will be a little bit of a hassle, since I'll have to implement some sort of file descriptor system, with a v-table of the methods that forward into device-specific implementations, and maintain errno (a separate peculiarity of newlib and requiring some FreeRTOS tweaks to support. I won't bore you, but you can google 'freertos newlib support' and find out more. It has to do with maintaining the 'struct _reent' object.) In the end it will probably be worth it, since then libc will work as expected, and if nothing else I can reuse it in other projects based on this toolchain.
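To make the 'file descriptor system with a v-table' idea concrete, here is a minimal sketch. None of these names (dev_ops_t, fd_alloc, fd_write, the membuf device) come from the project; they are hypothetical, and the real thing would also need open/read/close and per-task errno handling.

```c
#include <errno.h>
#include <stddef.h>

/* Per-device method table: each device type fills in what it supports. */
typedef struct dev_ops {
    int (*write)(void *ctx, const char *buf, int len);
    int (*read)(void *ctx, char *buf, int len);
} dev_ops_t;

/* One slot per open descriptor: which methods to call, and on what. */
typedef struct { const dev_ops_t *ops; void *ctx; } fd_entry_t;

#define MAX_FDS 8
static fd_entry_t fd_table[MAX_FDS];

/* Claim the lowest free slot; returns the descriptor or -1 with errno set. */
int fd_alloc(const dev_ops_t *ops, void *ctx)
{
    for (int i = 0; i < MAX_FDS; ++i) {
        if (fd_table[i].ops == NULL) {
            fd_table[i].ops = ops;
            fd_table[i].ctx = ctx;
            return i;
        }
    }
    errno = EMFILE;
    return -1;
}

/* What a _write() stub would forward to. */
int fd_write(int fd, const char *buf, int len)
{
    if (fd < 0 || fd >= MAX_FDS || fd_table[fd].ops == NULL) {
        errno = EBADF;
        return -1;
    }
    if (fd_table[fd].ops->write == NULL) {
        errno = ENOSYS;
        return -1;
    }
    return fd_table[fd].ops->write(fd_table[fd].ctx, buf, len);
}

/* Toy device for illustration: appends into a fixed RAM buffer. */
typedef struct { char data[32]; int len; } membuf_t;

static int membuf_write(void *ctx, const char *buf, int len)
{
    membuf_t *mb = ctx;
    int n = 0;
    while (n < len && mb->len < (int)sizeof mb->data)
        mb->data[mb->len++] = buf[n++];
    return n;
}

const dev_ops_t membuf_ops = { membuf_write, NULL };
```

A UART or filesystem device would supply its own dev_ops_t, and the newlib bottom-edge functions would only ever talk to the table.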
"Heigh-ho, heigh-ho, it's off to work we go..." [sic]
Next
Wherein I implement some stubby-stubs to smooth-over the newlib nubby-nubs.
-
Reimplement Realloc Redux
01/18/2018 at 03:03 • 0 comments
Summary:
I have implemented 'realloc' in the FreeRTOS 'heap_4.c' implementation, along with some other debugging features.
Deets:
We left off with my mentioning a way to intercept and redirect function calls at link time (on gcc, using 'wrappers'), but that there was a functional gap in that the FreeRTOS heap implementation does not have a 'realloc' function. I set down to implement 'realloc', and I can report that I have emerged triumphant.
I started with the 'heap_4.c' implementation. In this implementation, the heap is realized as a series of blocks aligned to a defined boundary size (in this case '8 bytes'). The blocks have a short header consisting of a pointer to the next block, and a size of the block itself including header. The size field furthermore uses the most-significant bit to indicate 'allocated' or not. If a block is /not/ allocated, it will be part of a list linked by the header's 'next block' pointer. If it /is/ allocated, the header pointer will be considered invalid (it will actually be NULL-ed), and the size will have the most-significant bit set. So, alloc()'ed blocks are not linked, but free blocks are, and in address order. Free'ing a block will return it to the linked list in the proper place, and merging of adjacent blocks is performed.
I implemented a 'pvPortRealloc' function that defers to the existing pvPortMalloc and pvPortFree for the degenerate cases, and otherwise tries to satisfy a resize request by nibbling away from the next block, if it is free, and if there is enough space. If there isn't, it will try to do a malloc-copy-free sequence. One implementation detail of 'realloc' is that if it fails, it does /not/ free the original block; callers must be aware of that.
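Here is a simplified sketch of the degenerate cases and the malloc-copy-free fallback, using a heap_4-style header to recover the old block size. It is not the actual pvPortRealloc: the demo_* names are hypothetical, the allocator underneath is a stand-in built on the host malloc(), and the 'nibble from the next free block' path is left out entirely.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* heap_4-style block header (simplified). */
typedef struct block_link {
    struct block_link *next_free;   /* only meaningful while the block is free */
    size_t             block_size;  /* header + payload; top bit set = allocated */
} block_link_t;

#define ALLOCATED_BIT ((size_t)1 << (sizeof(size_t) * 8 - 1))
#define ALIGN_UP(n)   (((n) + 7u) & ~(size_t)7u)
#define HEADER_SIZE   ALIGN_UP(sizeof(block_link_t))

/* Stand-in for pvPortMalloc: carves header + payload out of host malloc(). */
void *demo_malloc(size_t wanted)
{
    size_t total = HEADER_SIZE + ALIGN_UP(wanted);
    block_link_t *b = malloc(total);
    if (b == NULL) return NULL;
    b->next_free  = NULL;                   /* allocated blocks are unlinked */
    b->block_size = total | ALLOCATED_BIT;
    return (uint8_t *)b + HEADER_SIZE;
}

/* Stand-in for vPortFree. */
void demo_free(void *p)
{
    if (p != NULL) free((uint8_t *)p - HEADER_SIZE);
}

/* Usable bytes in an allocated block, read back from its header. */
size_t payload_size(const void *p)
{
    const block_link_t *b =
        (const block_link_t *)((const uint8_t *)p - HEADER_SIZE);
    return (b->block_size & ~ALLOCATED_BIT) - HEADER_SIZE;
}

void *demo_realloc(void *p, size_t wanted)
{
    if (p == NULL)   return demo_malloc(wanted);        /* acts as malloc */
    if (wanted == 0) { demo_free(p); return NULL; }     /* acts as free   */

    size_t have = payload_size(p);
    if (wanted <= have) return p;           /* block is already big enough */

    void *q = demo_malloc(wanted);
    if (q == NULL) return NULL;             /* failure: original stays valid */
    memcpy(q, p, have);
    demo_free(p);
    return q;
}
```

The header is what makes the copy possible at all: without the stored size there is no way to know how many bytes of the old block to preserve.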
I created some unit tests and verified the code to my satisfaction. In the process of satisfying myself, I discovered a couple things:
- the heap is created from a static chunk of RAM named 'ucHeap'. FreeRTOS will define it for you, but there is an option 'configAPPLICATION_ALLOCATED_HEAP' that will suppress that, in which case you are meant to define it yourself. This lets you place the memory in other regions if you like. One thing I noticed is that the default definition is not suitably aligned (8 byte), and consequently four bytes were wasted at the beginning and end. I want my 8 bytes back!
- heap memory is not initialized in malloc, of course. This made visual debugging of the heap a bit tedious, since it was not readily obvious what parts were allocated/freed/written to.
For the first thing, I used the configAPPLICATION_ALLOCATED_HEAP option and a gcc-specific declarator '__attribute__((aligned(8)))' on my own ucHeap, and was able to get the whole heap aligned correctly. Ostensibly, the linker would also stick some other variables in the space 'recovered' this way to be efficient, though I didn't verify.
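For reference, the application-supplied heap looks roughly like this. The 64K size is only an assumption for the sketch (normally configTOTAL_HEAP_SIZE comes from FreeRTOSConfig.h, along with configAPPLICATION_ALLOCATED_HEAP set to 1).

```c
#include <stddef.h>
#include <stdint.h>

/* In FreeRTOSConfig.h (shown here as a comment so the sketch stands alone):
 *   #define configAPPLICATION_ALLOCATED_HEAP 1
 */
#ifndef configTOTAL_HEAP_SIZE
#define configTOTAL_HEAP_SIZE ((size_t)(64 * 1024))   /* assumed size */
#endif

/* The application now owns the heap storage, so it can pick the alignment
 * (gcc-specific attribute) and, with a section attribute, the RAM region. */
uint8_t ucHeap[configTOTAL_HEAP_SIZE] __attribute__((aligned(8)));
```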
For the second thing, I added a new feature whereby the heap memory was filled with distinctive patterns under certain conditions:
- for memory that can not be used (e.g. padding), I filled with 0xdd (for 'dead data')
- for memory that has been freed, I filled with 0xfd (for 'free data')
- for memory that has been allocated, I filled with 0xcd (for 'clear data')
- for memory that has been just initialized, I filled with 0xed (for 'virgin data')
These fill patterns are only done if a switch 'configMALLOC_FILL' is defined to 1, so it's a completely optional debugging feature. Now I can visually see at a glance where blocks are bounded, what is allocated and cleared, and what has never been used at all. It's kind of pretty! It's so much easier to visually inspect the heap in the debugger than it is to manually go through the bytes with a calculator to see what is and isn't used. This leads to the next thing...
Armed with my newfound heap internals knowledge, I implemented another debugging tool: a heap walker. The gist is that you invoke vPortHeapWalk() with a callback function of your implementation, and it is called for each block on the heap, with the block pointer, the size, and a flag indicating if it is allocated or free. So now if you wanted to do a heap walk to find non-freed blocks, you can do that easily. You could also see how much heap overhead is being consumed in header blocks, etc.
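The fill itself is nothing fancy. A sketch of how it can be switched on and off at compile time follows; the macro and function names other than configMALLOC_FILL are made up for this example.

```c
#include <stddef.h>
#include <string.h>

#ifndef configMALLOC_FILL
#define configMALLOC_FILL 1        /* set to 0 to compile the fills away */
#endif

/* Hypothetical names for the four patterns described above. */
#define FILL_UNUSABLE  0xdd        /* padding that can never be used */
#define FILL_FREED     0xfd        /* returned to the free list */
#define FILL_ALLOCATED 0xcd        /* handed out by malloc */
#define FILL_VIRGIN    0xed        /* heap just initialized, never touched */

/* Fill 'len' bytes at 'p' with 'pattern', or do nothing if disabled. */
static inline void heap_fill(void *p, int pattern, size_t len)
{
#if configMALLOC_FILL
    memset(p, pattern, len);
#else
    (void)p; (void)pattern; (void)len;
#endif
}
```

With the switch off the helper compiles to nothing, so release builds pay no cost for the debugging aid.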
Because heap corruption is a real problem when developing code, I also added some checks to detect that. The checks can't be comprehensive, but they will detect the pathological cases which would cause the heapwalk to go gonzo, and most heap abuses would cause that, so it is a pretty good check. The short story is that they check for block header overwrites that cause an insane condition to occur, of which there are presently four:
- block pointer is to non-aligned block
- block pointer is to an area outside of the heap memory chunk
- sequence of blocks is not monotonically increasing
- the allocation flag is not set as expected
The heap walk will return '0' if the heap seems OK, and one of those symptom codes once it has found a block that exhibits the detected problem.
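A stripped-down sketch of such a walker, working on a plain byte array rather than the real FreeRTOS heap: it steps from block to block by adding each header's size, and bails out with a symptom code when a header looks insane. All names and codes here are hypothetical, the heap base is assumed to be 8-byte aligned, and the 'allocation flag not as expected' check is omitted because it needs the free list.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

typedef struct { void *next_free; size_t block_size; } hdr_t;

#define ALLOC_BIT ((size_t)1 << (sizeof(size_t) * 8 - 1))

enum { WALK_OK = 0, WALK_UNALIGNED, WALK_OUT_OF_HEAP, WALK_NOT_INCREASING };

typedef void (*walk_cb)(const void *block, size_t size, int allocated,
                        void *user);

int demo_heap_walk(const uint8_t *heap, size_t heap_len,
                   walk_cb cb, void *user)
{
    size_t off = 0;
    while (off < heap_len) {
        hdr_t h;
        if (off & 7u)                      return WALK_UNALIGNED;
        if (heap_len - off < sizeof h)     return WALK_OUT_OF_HEAP;
        memcpy(&h, heap + off, sizeof h);  /* copy out: no aliasing games */

        size_t size = h.block_size & ~ALLOC_BIT;
        if (size < sizeof h)               return WALK_NOT_INCREASING;
        if (size > heap_len - off)         return WALK_OUT_OF_HEAP;

        if (cb != NULL)
            cb(heap + off, size, (h.block_size & ALLOC_BIT) != 0, user);
        off += size;                       /* next block follows this one */
    }
    return WALK_OK;
}

/* Example callback: count blocks that are still allocated. */
void count_allocated(const void *block, size_t size, int allocated,
                     void *user)
{
    (void)block; (void)size;
    if (allocated) ++*(int *)user;
}
```

Run at shutdown with a callback like count_allocated, a non-zero count is a cheap leak detector.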
I was just about to contribute the code to the FreeRTOS community when I noticed that there is another heap implementation 'heap_5.c' that works like 'heap_4.c', but supports multiple discontiguous heap blocks. *sigh*; so I want to port my mods into heap_5 also, and submit both. I don't have to -- free code is free code -- but it does seem like I should for completeness.
I also had an idea for another minor improvement, which is to pattern fill as 0xcd only the memory that was explicitly requested, and 0xdd fill the extra padding for the block. This is conceptually straightforward, but I will have to hack the pvPortMalloc implementation, because it modifies its parameters, and the original size request is gone by the time I do the fill. I wanted to avoid hacking the existing code, but I don't have a choice in this case. But it will be surgical, and anyway it will have no impact if you don't have the pattern fill debug feature enabled.
I used the wrapper feature I described before, and redirected malloc() and friends into the FreeRTOS implementation, and things seem stable. I learned something interesting, though:
- newlib(-nano) has conventional APIs, and also some alternative 'reentrant' versions. The reentrant versions are typically named with a '_' prefix and a '_r' suffix, and the parameter list is prefixed with a 'struct _reent* r' parameter.
- many(!) other libc functions call these versions internally.
So... Until I knew this, I was wrapping malloc(), free(), realloc(), and things seemed to be working from a unit test standpoint, but under normal operations it was still a trip to hardfault land. Last time I wrote about having found the internal sbrk function, which at that time I used to gain some visibility into heap usage, but now I was able to use it to verify that, yes, these alternative heap management functions were still being used, and side-stepping my heap. So, I had to wrap them, too.
My wrappers spec list wound up being:
-Wl,--wrap,malloc -Wl,--wrap,free -Wl,--wrap,realloc -Wl,--wrap,_malloc_r -Wl,--wrap,_free_r -Wl,--wrap,_realloc_r
And I might have to wrap calloc() and _calloc_r(), too, but I haven't verified. Nothing is using those, but be aware that they might need it, unless they are implemented with a call to malloc() followed by a memset.
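The wrapper functions that go with that spec list are thin. A sketch follows, with some loud assumptions: the pvPortRealloc signature is guessed to mirror realloc(), ON_TARGET is a hypothetical macro, and the host stand-ins exist only so the file can be compiled and poked at off the board.

```c
#include <stddef.h>
#include <stdlib.h>

struct _reent;   /* newlib's per-thread state; only passed through here */

#ifdef ON_TARGET
/* Provided by FreeRTOS heap_4.c, plus the realloc addition (assumed signature). */
extern void *pvPortMalloc(size_t xWantedSize);
extern void  vPortFree(void *pv);
extern void *pvPortRealloc(void *pv, size_t xWantedSize);
#else
/* Host stand-ins, for trying the wrappers without the board. */
static void *pvPortMalloc(size_t n)           { return malloc(n); }
static void  vPortFree(void *p)               { free(p); }
static void *pvPortRealloc(void *p, size_t n) { return realloc(p, n); }
#endif

/* With -Wl,--wrap,malloc every reference to malloc lands here instead. */
void *__wrap_malloc(size_t n)           { return pvPortMalloc(n); }
void  __wrap_free(void *p)              { vPortFree(p); }
void *__wrap_realloc(void *p, size_t n) { return pvPortRealloc(p, n); }

/* The reentrant flavours that libc calls internally; the state is unused. */
void *__wrap__malloc_r(struct _reent *r, size_t n)
{
    (void)r;
    return pvPortMalloc(n);
}

void __wrap__free_r(struct _reent *r, void *p)
{
    (void)r;
    vPortFree(p);
}

void *__wrap__realloc_r(struct _reent *r, void *p, size_t n)
{
    (void)r;
    return pvPortRealloc(p, n);
}
```

Note the doubled underscore in the '_r' wrapper names: the linker prepends '__wrap_' to the full original symbol, leading underscore included.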
NOW at last things seem stable again.
It's worth mentioning that there is a featurette in FreeRTOS that is activated by configUSE_NEWLIB_REENTRANT which causes FreeRTOS to include a 'struct _reent' for each of FreeRTOS' 'tasks', and to appropriately initialize and free that at the correct time. If you use newlib, you'll probably want to activate that feature, because I'm telling you: there's plenty of hidden use of the "reent" versions of the functions in the library implementation.
For those that don't already know, the ancient standard C library has a bunch of global state. That venerable library was created before we had threads, and many Unixians were vehemently opposed to including threads in POSIX. But in the end, the case for threads prevailed, so then what to do about things like errno, strtok, etc.? Newlib does it by wrapping all that state into a struct, 'struct _reent', that is meant to be per-thread, and there is a global pointer, _impure_ptr, that points to the state for the current one. The '_r' versions take an explicit state reference, and the normal unadorned versions use the global pointer.
The state is pretty big; even for newlib 'nano' -- which puts a premium on size -- the size of that struct is just over 1K, so you'll incur that for each task you create.
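In miniature, the pattern looks like the sketch below. This is a toy model, not newlib's code: the struct, the functions and the 'switch' helper are all invented for illustration, with a single error field standing in for the 1K of real state.

```c
#include <stddef.h>

/* Toy stand-in for 'struct _reent': all the formerly-global state. */
struct mini_reent { int last_error; };

static struct mini_reent main_state;
/* Toy stand-in for _impure_ptr: state of whichever task is running. */
struct mini_reent *mini_impure_ptr = &main_state;

/* The '_r' flavour: the caller says which state to use. */
int mini_digit_r(struct mini_reent *r, char c)
{
    if (c < '0' || c > '9') {
        r->last_error = 1;       /* like errno: set on failure only */
        return -1;
    }
    return c - '0';
}

/* The unadorned flavour: quietly uses the current task's state. */
int mini_digit(char c)
{
    return mini_digit_r(mini_impure_ptr, c);
}

/* What the scheduler does on a context switch when per-task
 * state is enabled: repoint the global at the new task's struct. */
void mini_switch_to(struct mini_reent *task_state)
{
    mini_impure_ptr = task_state;
}
```

Because each task gets its own struct, a failure in one task never shows up in another task's error state.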
Anyway, now things are stabilized once again, it is time to get back to the real business of implementing features.
Next:
I implement features. I think I'm going to start with an improved serial I/O. Currently, polling is used, but I'd rather have at least some interrupt driven buffered stream.
-
Wrappers Delight, and Quantum Interpositioning
01/13/2018 at 17:46 • 0 comments
Summary:
I am trying to replace the malloc() implementation so as to get an idea of memory usage patterns. Along the way, I discovered some interesting things about the internals of libc, eLua, FreeRTOS, and some features of the linker.
Deets:
When I last wrote, I was having more of the ever-increasingly-familiar hard faults. I was able to improve that by fiddling with some heap and stack size parameters, but I really needed to apply a little more rigor to understanding what was going on. At that time, I had two major problems:
1) I had no real visibility into actual heap usage or patterns
2) I could not get an answer from (e)Lua as to its perspective of memory usage
The second wound up being simpler, so I'll explain it first. When I finally got Lua stabilized enough to execute slightly less trivial statements, I was supposed to be able to get the memory usage by issuing something like:
print ( collectgarbage ("count") )
but all I got was a blank line (hey, this is an improvement over the hard fault I was previously getting). I debugged into this a great deal already using my bogo-binary-search approach, but this seemed something else. Then I remembered two things: Newlib Nano specifically excludes floating point specifiers to printf() by default, and numbers in Lua are all double-precision floats. (The newest Lua -- 5.3, which I ultimately wish to use -- supports integers (and can be made to support 32-bit integers and floats, which I very much wish to do with this hardware), but I'm stuck on 5.1 for the time being.)
Remembering this, I used a linker command line switch to include the float support in printf():
-u _printf_float
OK! Now I can get some output:
(Those numbers are in kilobytes.) Interesting how it goes up and down. Garbage collection in action. But still not huge enough that I would have run out of my previous 64K heap when I was crashing, so there's more that needs to be understood. What I really want to do is a 'heap walk', so I can see all the blocks allocated to understand better what's happening.
For my first amazing feat, I did some research into replacing malloc() effectively in a newlib project. You'd think this sort of thing is done all the time, and actually it is. There's a variety of techniques for it.
- Avoidance
By far, the most common technique is: don't use malloc() in an embedded system; you are asking for trouble with deterministic runtime behaviour.
- Replacement
Many well-crafted libraries provide a mechanism to customize key features such as memory allocation.
- 'Interpositioning'
A variety of techniques to coax the compiler/linker into doing what you want to do instead of what the author wanted to do.
I appreciate 'avoidance', and if I totally controlled the code, that's almost certainly what I would do, but this (e)Lua is simply beyond my control and I have to accept that there will be dynamic memory allocation requests.
I did discover whilst tracing through the code that Lua provides a very tidy means of fully controlling the dynamic memory allocation scheme: you simply implement a 'frealloc' function, and set that when you initialize your Lua state via lua_newstate() or lua_setallocf() if you already have a state on-hand. Thanks, Lua! I wonder why that was not used? I will probably explore this later, but for now there's still more to the system than just Lua. Amongst other things, malloc() is used in various internal implementations in libc.
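For the curious, a sketch of what such an allocator function can look like, here with a little bookkeeping bolted on. The function-pointer shape is copied from Lua's lua_Alloc typedef so the sketch compiles without lua.h; the stats struct and the function name are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

/* Same shape as Lua's lua_Alloc; repeated here so lua.h is not needed. */
typedef void *(*alloc_fn)(void *ud, void *ptr, size_t osize, size_t nsize);

/* Hypothetical bookkeeping handed to Lua as the 'ud' user data pointer. */
typedef struct { size_t in_use; size_t peak; } alloc_stats_t;

void *counting_alloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    alloc_stats_t *st = ud;

    /* For brand new blocks, newer Lua versions pass an object-type tag
     * in osize rather than a size, so ignore it when ptr is NULL. */
    if (ptr == NULL) osize = 0;

    if (nsize == 0) {                   /* Lua asks us to free */
        free(ptr);
        st->in_use -= osize;
        return NULL;
    }

    void *p = realloc(ptr, nsize);      /* covers both malloc and resize */
    if (p == NULL) return NULL;         /* old block is still valid */

    st->in_use = st->in_use - osize + nsize;
    if (st->in_use > st->peak) st->peak = st->in_use;
    return p;
}

/* Usage, given lua.h:
 *   alloc_stats_t stats = { 0, 0 };
 *   lua_State *L = lua_newstate(counting_alloc, &stats);
 */
```

Since Lua tells the allocator the old size on every call, the bookkeeping needs no headers of its own; swapping the realloc()/free() calls for another heap's functions would redirect all of Lua's memory in one place.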
The 'interpositioning' techniques are used to cause the target binary to invoke your own code, instead of the originally intended code. These techniques tend to be compiler and linker specific. In this case, with gcc, there are at least a couple things that can be done:
- the linker will 'prefer' to link the first symbol it finds in the order of object modules and libraries it is specified to use. Indeed, eLua has implementations of various things in newlib/stubs.c that override implementations otherwise in libc. This technique effectively hides the original symbol, so you can no longer call the original function.
- the linker has a --wrap command line switch that allows you to transform the original symbol name and all references to it in the already compiled code. This preserves the original symbol name, so that you can still call it. It's a little like a #define, but for the linker.
I tried out the linker --wrap option. In my build system ('System Workbench for STM32', based on Eclipse) it was actually easier to specify the option indirectly via gcc, since gcc is invoked to do the build step as well. The gcc option to pass something to the linker is '-Wl'. I used it to wrap malloc() like this:
-Wl,--wrap,malloc
This causes the original malloc() to be renamed to __real_malloc, and all references to malloc() in the compiled code to be renamed to __wrap_malloc, which you are meant to implement somewhere yourself. In this manner, you are able to 'intercept' all method calls, and delegate to the original implementation if you like.
This is probably not sufficient; you also need to wrap malloc()'s friends:
free, realloc,
and maybe even
calloc, _malloc_r, _free_r, _realloc_r, __malloc_lock, __malloc_unlock, ...
Yikes! But the facility is there for you if you need it. You should read the libc source so you know what needs to be wrapped. E.g., while _malloc_r seems to be an internal implementation detail, it is actually invoked directly by some other routines, such as vfprintf (and a whole bunch more). The output 'map' file is useful to verify that all pieces have been successfully wrapped out of existence.
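To show the 'intercept and delegate' flavour of wrapping, here is a small sketch that counts calls and then hands off to the original functions. The counters are hypothetical, as is the LINKED_WITH_WRAP macro: when the file is not actually linked with --wrap there is no __real_malloc, so stand-ins are supplied to keep the sketch buildable on its own.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef LINKED_WITH_WRAP
/* With -Wl,--wrap,malloc -Wl,--wrap,free the linker resolves these
 * to the original library functions. */
void *__real_malloc(size_t size);
void  __real_free(void *ptr);
#else
/* Stand-ins so the sketch also builds without the linker option. */
static void *__real_malloc(size_t size) { return malloc(size); }
static void  __real_free(void *ptr)     { free(ptr); }
#endif

/* Crude usage statistics, readable from the debugger or a shell command. */
unsigned long malloc_calls;
unsigned long free_calls;
size_t        bytes_requested;

void *__wrap_malloc(size_t size)
{
    malloc_calls++;
    bytes_requested += size;
    return __real_malloc(size);     /* delegate to the original */
}

void __wrap_free(void *ptr)
{
    if (ptr != NULL) free_calls++;
    __real_free(ptr);
}
```

This is the main advantage over simply overriding the symbol: the original implementation stays reachable, so you can observe first and replace later.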
I set down to wrap malloc, so that I could forward those into the common memory allocator that FreeRTOS provides (who wants multiple heap spaces?), but that was brought to an abrupt halt because FreeRTOS does not provide a 'realloc'.
I myself have never used realloc() in many (many!) decades of programming, but I suppose I can see its attraction. Anyway, realloc can be used as a swiss army knife, and you can malloc, free, and realloc all with the same function. Lua seems to enjoy that approach, as it makes no calls to malloc(), only realloc() (although it does stop short and does call free()). And because of the content-preserving aspect of realloc, you can't really emulate it with a malloc-and-copy, because you don't know the original block size (unless you dig into the implementation details of the arena headers, of course).
Since FreeRTOS does not provide a realloc function, I punted on this until next go-round. I did find some other buried treasure, though.
While thumbing through eLua's newlib stubs.c, I found a routine '_sbrk_r'. This is the routine that malloc() calls when it needs more space added to the heap. This is normally implemented in libc itself, and I'm not exactly sure why it is re-implemented in eLua (I know eLua provides some alternative allocators, but I'm not using them, so I would have expected that code to be conditionally excluded). This method simply raises a high-water mark from a demarcated section of RAM as per some linker provided symbols for where the free heap starts and ends. When malloc needs more heap space, it calls _sbrk_r to add to it.
I decided to add a little code to that function to flood-fill the free ram with a distinctive pattern (0xfd) so that I could see overwritten memory more easily. Also, I made a symbol 'heap_ptr' public. This allowed me to see what was the maximum heap usage of the total program.
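Roughly, the modified routine does something like the sketch below. This is a reconstruction of the idea rather than eLua's code: a static array stands in for the RAM between the linker-provided heap symbols, and demo_sbrk and demo_heap_used are made-up names (heap_ptr is the symbol I exposed).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the region the linker script would describe on target. */
static uint8_t heap_region[4096];
static uint8_t *const heap_start = heap_region;
static uint8_t *const heap_end   = heap_region + sizeof heap_region;

/* Current break; public so maximum usage can be read out at exit. */
uint8_t *heap_ptr = NULL;

void *demo_sbrk(ptrdiff_t incr)
{
    if (heap_ptr == NULL) {
        /* First call: flood-fill so untouched RAM is easy to spot. */
        memset(heap_start, 0xfd, sizeof heap_region);
        heap_ptr = heap_start;
    }
    if (incr > heap_end - heap_ptr || incr < heap_start - heap_ptr) {
        errno = ENOMEM;             /* request would leave the region */
        return (void *)-1;
    }
    uint8_t *prev = heap_ptr;
    heap_ptr += incr;               /* raise (or lower) the mark */
    return prev;                    /* malloc gets the old break */
}

/* Bytes handed out so far. */
size_t demo_heap_used(void)
{
    return heap_ptr ? (size_t)(heap_ptr - heap_start) : 0;
}
```

Anything between heap_ptr and the end of the region that no longer reads 0xfd has been scribbled on by something other than the heap, which is exactly the kind of thing I want to see.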
I modified main.c slightly so that when you exit the shell, the memory statistics for stack and heap are emitted, then the board is reset so that you start back in the shell. Here are the results from a few runs using some trivial code:
eLua# exit
minfreestack: 862, maxheapused: 2032 of 115072 (minfree 113040)
resetting...
So, just getting up to the shell prompt used about 2K ram, and 1024-862=162 bytes of stack. Currently, I have about 115K of heap (heap expands to fit unused memory as per linker symbols). So I really never should have had heap problems before, when I was getting all those hard faults. I suspect it might have been due to competing heap implementations. The hard faults seemed to subside when I switched FreeRTOS to be 'static only' memory allocation.
Next, I tried entering-and-exiting eLua
eLua v0.9 Copyright (C) 2007-2013 www.eluaproject.net
eLua# lua
Press CTRL+Z to exit Lua
Lua 5.1.4 Copyright (C) 1994-2011 Lua.org, PUC-Rio
>
eLua# exit
minfreestack: 746, maxheapused: 8496 of 115072 (minfree 106576)
resetting...
So eLua itself bumped heap usage up to just over 8K, i.e. about a 6K overhead.
Next, I tried defining a trivial function. This was mostly to test parser overhead; I suspect the function's overhead is trivial once compiled.
eLua v0.9 Copyright (C) 2007-2013 www.eluaproject.net
eLua# lua
Press CTRL+Z to exit Lua
Lua 5.1.4 Copyright (C) 1994-2011 Lua.org, PUC-Rio
> function foo ( f ) f() end
>
eLua# exit
minfreestack: 522, maxheapused: 9692 of 115072 (minfree 105380)
resetting...
So that caused another 1K or so to be used (probably temporarily).
Next, I tried a slightly more complicated scenario, this time with a function and a loop and a closure.
eLua v0.9 Copyright (C) 2007-2013 www.eluaproject.net
eLua# lua
Press CTRL+Z to exit Lua
Lua 5.1.4 Copyright (C) 1994-2011 Lua.org, PUC-Rio
> function foo ( f ) f() end
> for i = 1, 100 do foo ( function() end ) end
>
eLua# exit
minfreestack: 492, maxheapused: 13752 of 115072 (minfree 101320)
resetting...
So that took about another 4K over the last run. So, in these simple scenarios, Lua is taking up to about 10K to enter, compile, and run this simple code. I don't know how much is parser overhead, though, because I am not set up to precompile the code. That will have to wait a while, because I evidently need to build a special eLua 'cross compiler' to make the bytecode chunk, and I'll need some filesystem support.
But back to memory, I am now going to try to implement 'realloc' in the FreeRTOS memory manager 'heap_4.c' and see if I can successfully wrap all the malloc stuff up, directing it to FreeRTOS.
Next:
Implement realloc in FreeRTOS heap_4.c
-
Shell, Interrupted
01/10/2018 at 17:15 • 0 comments
Summary:
I managed to get the shell receiving data by crowbarring eLua to not install interrupt handlers.
Deets:
Where I left off, I found that my UART receive function was never called. At the time it seemed that eLua was expecting there to be a separate activity -- perhaps interrupt oriented -- that filled a receive buffer. This turned out to be the case, and moreover eLua wants to install its own interrupt handlers under conditions unknown to me.
Since I'm wanting to use the STM32 HAL libraries (at this point in time; I may switch to the Low-Level ('LL') libs later), I don't want eLua doing anything interrupt oriented or otherwise down to the metal.
At length, I found a couple defines that cause the interrupt handler to be installed:
CON_BUF_SIZE
BUF_ENABLE_UART
These get emitted into the generated board header (in this case 'netduinoplus2.h'). I tried changing the board def to specify 0 for the buffer length of the serial port, but this caused the build system to break. So for the moment, I manually undef'ed them in netduinoplus2.h to remove the defines. Now the getch() for the console does make it all the way down to my platform receive function, and I get characters!
So, now I need to find if there is a more orthodox means of removing the buffer and interrupt management from eLua. Along the way, I noticed a BUILD_C_INT_HANDLERS define. This seems to come from a board config setting 'cints = true', so I turned that to 'false'. This caused other problems in a BUILD_LUA_INT_HANDLERS, which in turn is derived from a 'lints = true', so I turned that off also. In the end, none of this removed the need for my console buffer size hack, so I had to leave that in for the time being until I learn more as to what all this stuff is.
At this point, I am able to run eLua, but I can't do much. I can run things like:
print ( "Hello World!" )
for i = 1, 10 do print ( "Hello" ) end
but more complex things, like:
print ( collectgarbage ("count") );
wind up in the HardFault handler. *sigh*. Surely I am not out of heap on these? Well, I did reduce the heap artificially to 64k 'just cuz'. I also changed FreeRTOS to 'static only' so that I could deterministically see its use. FreeRTOS also has a heap implementation that lets me see some statistics such as 'what was the least amount of heap available, ever'. The trick is to get all this other code (eLua and libc) to use it, whereas they are coded (and in the case of libc, already compiled) to use 'malloc'. So I'm going to look into some magikry to redirect those calls into my (er, FreeRTOS') malloc implementation.
Next:
Try to replace memory management.
-
Shilly-shally with the Shell by the Seashore
01/08/2018 at 03:13 • 0 comments
Summary:
I first try to get a UART up to provide some I/O for the shell. Half seems to be working.
Deets:
For my first step, I am going to simply get a serial port up with the System Workbench HAL drivers, and attempt to wire that into eLua. This should give me a serial console, if it all works out. This will be an interim implementation, though, because the selection of those pins to that port will be hard coded in the firmware. I believe the ultimate goal is that the pin configurations will be assigned at runtime, under Lua control. Nevertheless, the experience should help guide me in how to do that, and anyway I really need some I/O now.
As a simplification for now, I simply configure STM32CubeMX to designate the (user-visible) D0 and D1 pins to be UART -- specifically USART6 (because that's how the board is wired) and emit init code for that. Then I will tweak the eLua board def to emit a header indicating that eLua's notion of 'UART 0' will be the serial console.
This is not how it will ultimately work. I'm expecting that in the end, all the pins will come up in tristate or analog mode, and then the Lua application code will 'open' the various devices, which will cause the pins to be configured at runtime in the associated manner. I.e., D0 and D1 would be useable either as UART or digital IO as per application; not hard-coded to be serial as I'm doing now. But I've got so much more to learn and do before I can implement that fanciness.
Anyway, the STM32CubeMX part is trivial. Now I have to wire it in.
I took several quick trips to Hard Fault land, and after spending many, many, hours (days?) with my 'binary search' approach to finding the offending code, I decided to put forth a little effort towards getting more info in those cases. Hard faults on the ARM cause context to be dumped to the stack before vectoring to the handler, but Eclipse does not know how to present that information, so you don't see the usual stack trace if you set a breakpoint in the handler. I found some code on FreeRTOS's site for some simple information gathering, but it didn't work as-is. It involves inline-assembler, and I guess my compiler (gcc) is slightly different than whatever they were using (which I would have thought would be gcc, but whatever). My problem was extremely simple: I couldn't branch to a subroutine via a register -- the address loaded was always incorrect. I did manually change the register to the correct address, and verified the other stuff worked as expected. After yanking-and-twisting for a while, I did get the handler working, so I post it here for posterity:
void prvGetRegistersFromStack( uint32_t *pulFaultStackAddress );

__attribute__( ( naked ) ) void HardFault_Handler(void)
{
  /* USER CODE BEGIN HardFault_IRQn 0 */
  //XXX there needs to be __attribute__( ( naked ) ) void HardFault_Handler(void)
  //XXX but the code generator will probably overwrite that. Verify that has not
  //XXX been lopped-off before proceeding, lest your stack references be off.
  __asm volatile  //'volatile' to prevent gcc from rearranging them
  (
    " tst lr, #4 \n"                          //test EXC_RETURN number in LR b2
    " ite eq \n"                              //if zero then
    " mrseq r0, msp \n"                       //Main Stack, put MSP in R0
    " mrsne r0, psp \n"                       //Process Stack, put PSP in R0
    " ldr r1, [r0, #24] \n"                   //load the stacked PC (offset 24) into R1
    " ldr r2, =prvGetRegistersFromStack \n"   //call through reg to avoid messing
    " bx r2 \n"                               //with params already in regs
  );
  /* USER CODE END HardFault_IRQn 0 */
  while (1)
  {
  }
  /* USER CODE BEGIN HardFault_IRQn 1 */
  /* USER CODE END HardFault_IRQn 1 */
}

void prvGetRegistersFromStack( uint32_t *pulFaultStackAddress )
{
  //'volatile' to avoid the compiler/linker optimising them away
  volatile uint32_t r0;
  volatile uint32_t r1;
  volatile uint32_t r2;
  volatile uint32_t r3;
  volatile uint32_t r12;
  volatile uint32_t lr;   // Link register.
  volatile uint32_t pc;   // Program counter.
  volatile uint32_t psr;  // Program status register.

  r0 = pulFaultStackAddress[ 0 ];
  r1 = pulFaultStackAddress[ 1 ];
  r2 = pulFaultStackAddress[ 2 ];
  r3 = pulFaultStackAddress[ 3 ];
  r12 = pulFaultStackAddress[ 4 ];
  lr = pulFaultStackAddress[ 5 ];
  pc = pulFaultStackAddress[ 6 ];
  psr = pulFaultStackAddress[ 7 ];

  //When the following line is hit, the variables contain the register values.
  volatile int n = 0;
  for( ;; ) ++n;
}
some related info:
It's still a little tedious to debug, but far less so when you know the fault address, because you can at least then look up the function via the generated map file.
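The map-file lookup is mechanical enough to sketch in code. What follows is only an illustration, assuming the function start addresses have already been copied out of the map file into a sorted table; the type map_symbol_t and the function find_function are made-up names, not part of any toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One function entry as it might be copied out of the linker map file.
   (Hypothetical type; any addresses used with it are illustrative.) */
typedef struct {
    uint32_t    start;   /* address of the function's first instruction */
    const char *name;
} map_symbol_t;

/* Return the name of the function containing 'pc', or NULL if 'pc' lies
   below the first entry. 'table' must be sorted by ascending start address. */
const char *find_function(const map_symbol_t *table, size_t count, uint32_t pc)
{
    assert(table != NULL);
    pc &= ~(uint32_t)1;          /* drop the Thumb bit, e.g. from a stacked LR */

    const char *best = NULL;
    size_t lo = 0, hi = count;
    while (lo < hi) {            /* binary search for the last start <= pc */
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].start <= pc) {
            best = table[mid].name;
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    return best;
}
```

Feeding it the 'pc' (or 'lr') value captured by the fault handler gives the enclosing function, which is the same thing one does by eye in the map file.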
I still had a bunch of faults happening, seemingly non-deterministically, so I increased the stack size for starters. This didn't seem to help, so I turned on some features in FreeRTOS that help to determine stack usage, which was quite low. Then, the hard faults stopped happening altogether, which rather put a damper on debugging. Non-deterministic bugs are so disheartening!
Without being in a position to debug anymore, I decided to motor on. I found a spot where uart IO ostensibly occurs, and simply implemented it using the 'polling' HAL calls. These do IO in the most trivial way: polling on some sort of 'busy' flag for every byte sent or received. I don't intend to do it this way in production, but this is just a quicky for sanity checking.
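To make the 'poll a flag for every byte' idea concrete, here is a small sketch that runs on a PC. It is not the HAL code: fake_uart_t, UART_TX_READY and uart_write_polled are invented stand-ins for a memory-mapped peripheral, and on the board the HAL's own polling call (HAL_UART_Transmit) would be doing this work.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a UART peripheral so the loop can run on a PC.
   On real hardware these fields would be memory-mapped registers. */
typedef struct {
    volatile uint32_t status;   /* bit 0: transmitter can accept a byte */
    volatile uint8_t  data;     /* writing here sends one byte */
} fake_uart_t;

#define UART_TX_READY 0x1u

/* Send 'len' bytes, spinning on the ready flag before every byte.
   Gives up once a single byte has been polled 'max_spins' times, so a dead
   port cannot hang the caller. Returns the number of bytes written. */
size_t uart_write_polled(fake_uart_t *uart, const uint8_t *buf,
                         size_t len, uint32_t max_spins)
{
    assert(uart != NULL && buf != NULL);
    size_t sent = 0;
    while (sent < len) {
        uint32_t spins = 0;
        while ((uart->status & UART_TX_READY) == 0) {
            if (++spins >= max_spins)
                return sent;            /* timed out waiting for the flag */
        }
        uart->data = buf[sent++];
    }
    return sent;
}
```

The CPU does nothing useful while it spins, which is why this is fine for a sanity check and not for production.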
After hooking up a trusty FTDI, I get my first results:
Ta-da! But the joy was short-lived; it does not seem that I can type into the system. I don't know why this is, but I'm guessing that I'm simply clueless as to how the eLua expects platform adaption to be done. I do a bunch of the usual single-stepping of code, and it seems that the system is running correctly, but is expecting to push data directly out of the port (and that is obviously working), but pull data from a buffer, instead of directly from the port. So I guess the write side is write-through, whereas the read side is buffered, with the expectation that there is some separate process (or interrupts) that produce data into that buffer.
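A buffered read side of that kind is usually some form of ring buffer: an interrupt handler puts received bytes in, and the reader takes them out later. The sketch below shows the general technique only; it is not eLua's buffer code, and rx_ring_t, rx_ring_put and rx_ring_get are names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 64u   /* must be a power of two for the index masking */

typedef struct {
    uint8_t           data[RX_BUF_SIZE];
    volatile uint32_t head;   /* advanced only by the producer (the ISR) */
    volatile uint32_t tail;   /* advanced only by the consumer (the shell) */
} rx_ring_t;

/* Producer side: would be called from the UART receive interrupt.
   Returns false, dropping the byte, when the buffer is full. */
bool rx_ring_put(rx_ring_t *r, uint8_t byte)
{
    assert(r != NULL);
    uint32_t head = r->head;
    uint32_t tail = r->tail;
    if (head - tail == RX_BUF_SIZE)
        return false;                       /* full */
    r->data[head & (RX_BUF_SIZE - 1u)] = byte;
    r->head = head + 1u;                    /* publish after the byte is stored */
    return true;
}

/* Consumer side: returns the next byte, or -1 if nothing has arrived yet. */
int rx_ring_get(rx_ring_t *r)
{
    assert(r != NULL);
    uint32_t head = r->head;
    uint32_t tail = r->tail;
    if (head == tail)
        return -1;                          /* empty */
    uint8_t byte = r->data[tail & (RX_BUF_SIZE - 1u)];
    r->tail = tail + 1u;
    return byte;
}
```

With one producer and one consumer, each index has a single writer, which is what lets the two sides run without a lock in this simple form.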
I still like the eLua, and I'm especially glad that they have produced modules for the various common peripherals, but the platform abstraction so far does not seem to afford quite as clear a separation as I would like. But maybe I'm just ignorant -- that is certainly possible. It has two components: a 'cpu' component (under platform), and a 'board' component (that seems to be the specs for an automatically generated header) under 'boards'. All that sounds great and conventional, but if I'm having to fiddle with aspects of the common stuff, then that means the separation has blurred a bit, and some carnal details of the platform-specific implementation are exposed higher up.
E.g. interrupts: there's a few places where the system wants to disable interrupts (presumably globally). In the few places I analyzed this, it was to create a critical section around a variable that was being modded. That code presumes that you can globally disable interrupts, and that the disable method is able to tell you what the state was before making the change. This is actually inconvenient for me, because FreeRTOS does /not/ provide a means to tell you if they were or were not disabled prior to making a call to taskDISABLE_INTERRUPTS, or taskENTER_CRITICAL, and even taskENTER_CRITICAL_FROM_ISR returns something non-portable and which I'm not yet convinced can be used here safely anyway.
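The pattern that code presumes looks roughly like the sketch below. To keep it runnable on a PC the interrupt mask is just a variable; fake_primask, int_disable_save, int_restore and bump_counter are invented for the illustration. On the real target the CMSIS intrinsics __get_PRIMASK(), __disable_irq() and __set_PRIMASK() would presumably fill that role, though whether that sits well alongside FreeRTOS's own interrupt masking is a separate question.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated global interrupt mask so the pattern can run on a PC:
   0 = interrupts enabled, 1 = disabled. (Hypothetical stand-in for the
   Cortex-M PRIMASK register.) */
static uint32_t fake_primask = 0;

/* Disable interrupts and report the state that was in effect before. */
uint32_t int_disable_save(void)
{
    uint32_t previous = fake_primask;
    fake_primask = 1;
    return previous;
}

/* Put back whatever state int_disable_save() reported. */
void int_restore(uint32_t previous)
{
    assert(previous <= 1u);
    fake_primask = previous;
}

static int shared_counter = 0;

/* A critical section written this way nests safely: if interrupts were
   already off on entry, they are still off on exit. */
void bump_counter(void)
{
    uint32_t saved = int_disable_save();
    shared_counter++;
    int_restore(saved);
}
```

The point of returning the previous state is the nesting case: an unconditional 're-enable' at the end of the inner section would switch interrupts back on inside the outer one.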
Ultimately, I think I'm going to have to go a little higher into the eLua system. It doesn't need to know about interrupts (for the shell part, at least), or buffering techniques, etc. It should simply require a data stream that can read() and write() from whatever source, and leave those implementation details to the lower levels. When I get to implementing the peripheral modules, I'll probably have some very different thoughts, but this shell I/O stuff should be abstracted a little bit higher up.
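One way to picture that kind of separation is a tiny stream interface with two function pointers. This is a sketch of the idea, not anything that exists in eLua: stream_t, mem_backend_t, mem_stream_init and shell_echo are hypothetical, and the memory-backed backend merely stands in for a UART so the example can run anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical byte-stream interface: the shell sees only these two calls. */
typedef struct stream {
    size_t (*read)(struct stream *s, void *buf, size_t len);
    size_t (*write)(struct stream *s, const void *buf, size_t len);
    void   *ctx;    /* backend-private state (UART handle, socket, ...) */
} stream_t;

/* Memory-backed backend, used here in place of a real UART. */
typedef struct {
    char   bytes[128];
    size_t used;    /* bytes written so far */
    size_t pos;     /* next byte to hand back on read */
} mem_backend_t;

static size_t mem_read(stream_t *s, void *buf, size_t len)
{
    mem_backend_t *m = s->ctx;
    size_t avail = m->used - m->pos;
    if (len > avail) len = avail;
    memcpy(buf, m->bytes + m->pos, len);
    m->pos += len;
    return len;
}

static size_t mem_write(stream_t *s, const void *buf, size_t len)
{
    mem_backend_t *m = s->ctx;
    size_t room = sizeof m->bytes - m->used;
    if (len > room) len = room;
    memcpy(m->bytes + m->used, buf, len);
    m->used += len;
    return len;
}

void mem_stream_init(stream_t *s, mem_backend_t *m)
{
    assert(s != NULL && m != NULL);
    m->used = 0;
    m->pos = 0;
    s->read = mem_read;
    s->write = mem_write;
    s->ctx = m;
}

/* Shell-side code: copies pending input to the output without knowing
   anything about interrupts, buffers, or which device is underneath. */
size_t shell_echo(stream_t *in, stream_t *out)
{
    char tmp[32];
    size_t total = 0, n;
    while ((n = in->read(in, tmp, sizeof tmp)) > 0)
        total += out->write(out, tmp, n);
    return total;
}
```

A UART backend would supply its own read and write functions (polled, interrupt-driven, DMA, whatever) and the shell code above would not change.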
But in the near-term, I really need to get my receive side of the uart for the shell working. Then I'll be cooking with Crisco.
Next:
Need to get the receive side of the UART attached to stdin working.
-
The Crash of the Titans
01/06/2018 at 19:54 • 0 comments
Summary:
Before getting down to the business of gluing eLua to the System Workbench STM32 HAL libraries, I find some sources of crashes, and fix them.
Deets:
After finally getting a build to complete, I let it run and forthwith wind up in the Hard Fault handler. Again, this is not a great surprise (I'd be more surprised if I /didn't/, since I haven't wired any platform code to do IO). Stopping in the Hard Fault handler does not admit to a stack trace in the Eclipse tools, alas, so I incrementally zero in on the faulting line the old-fashioned way by doing a coarse depth-first search with breakpoints. It takes a while, but at length I found the fault to be stimulated by a call to getenv(). This call was being made while loading libraries -- naturally with my luck the last library to be loaded: 'package', where it was trying to get the LUA_PATH and LUA_CPATH env vars. It has logic to handle the 'variable not found' case, but getenv() itself was crashing prior to that.
getenv() doesn't make much sense in embedded, since there is no OS or shell, but the standard library implementation (newlib-nano in this case) exposes it and that does make porting simpler. grepping shows that the (e)Lua code has plenty of calls to getenv() throughout, so I look deeper. The System Workbench does not ship the libc source, alas, which is a real pity, so I look for documentation.
Fun fact: System Workbench does install (some) library documentation, but it does not link it in the Start Menu, or make it particularly visible. In my case I found it at:
C:\Ac6\SystemWorkbench\plugins\fr.ac6.mcu.externaltools.arm-none.win32_1.15.0.201708311556\tools\compiler\share\doc\gcc-arm-none-eabi\pdf\libc.pdf
Obviously, right? Anyway, the documentation mentioned that it requires a global variable 'environ' to work. Since I successfully linked, I must have the variable somewhere, so that's not it. I looked for the source via web search and found it:
https://github.com/eblot/newlib/blob/master/newlib/libc/stdlib/getenv.c
However that was not particularly interesting because it was just a wrapper around an internal function '_findenv_r':
char* _DEFUN (getenv, (name), _CONST char* name) {
int offset;
return _findenv_r (_REENT, name, &offset);
}
I may later download the source package, but it will not be as useful as I might like (interactive debugging), since it will not be the version used to make the shipped binary libs anyway. But much better than nothing. In the meantime, I was bored with this, and decided to interactively look at this 'environ' variable. I declared it 'extern' and then was able to inspect it via the debugger and see that it exists, and that it points to a single entry of NULL. I would not think that to be a problem, it's just an empty environment, but I decided to make a new environment list consisting of two entries: an empty string, and a NULL pointer (to terminate the list). This worked fine.
I don't know what this means, there's no similar code in the eLua, but it is using a different libc, so maybe this is a bug. If so, it could easily have been missed, because who uses 'getenv()' in an embedded context except for ported desktop code (which Lua is).
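For reference, the workaround amounts to something like the sketch below. The names install_minimal_environ and path_or_default are made up for the illustration; only 'environ' and getenv() come from the C library. It also builds and runs on a desktop libc, where it simply leaves the process with an empty-looking environment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

extern char **environ;   /* the list that getenv() searches */

/* Replacement environment as described above: one empty string, then the
   NULL that terminates the list. Static storage, because the C library
   keeps using the pointer after this function returns. */
static char  empty_entry[2] = "";   /* empty string (second byte is also NUL) */
static char *minimal_environ[] = { empty_entry, NULL };

void install_minimal_environ(void)
{
    environ = minimal_environ;
}

/* Roughly what a caller like Lua's package library wants: the variable's
   value if it is set, otherwise a built-in default. */
const char *path_or_default(const char *var, const char *fallback)
{
    assert(var != NULL);
    const char *value = getenv(var);
    return value != NULL ? value : fallback;
}
```

Calling install_minimal_environ() once, early in startup and before anything calls getenv(), is the assumed usage.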
Happy that I had solved that crash, I let it fly again, and it crashed again. This time in an fprintf ( stderr, ... ) call. Again, I'm not too surprised because there has not been any standard objects created, though I would more expect that it would simply direct to the functional equivalent of /dev/null instead of crashing.
Well, after many hours of stepping through assembler (because I don't have libc sources that match the binary libs), I popped out back into user C code! The eLua code already had overridden the 'bottom edge' to redirect IO to peripheral devices. Of course, I haven't implemented anything in that area, so no small wonder it crashed. It's just a pity that so many hours were consumed tracking that down.
Anyway, it's now clear there is some more eLua initialization that needed to be performed. I added a bunch of that stuff, and now the system does not crash. It also doesn't do anything useful, since there is no I/O, but at least it seems that the system is running in a consistent state.
So, it's probably time to start working on peripherals. A good first choice is UART, so we can get some console I/O going...
Next:
Now it's probably time to work on peripherals for real.
-
ROMfs and Remus, and the Founding of Rome
01/05/2018 at 23:29 • 0 comments
Summary:
Made a separate tool to generate 'romfs'.
Deets:
I set down to reverse-engineer the Lua-based build system to see where the 'romfs' is generated. Happily, this wound up being a relatively easy exercise, performed by a deviously named routine 'make_romfs' and an auxiliary module named 'utils.mkfs'. This step is executed later in the build process, and was being blocked by the toolchain detection stuff. I eventually just decided to cruft together a bespoke tool that performs that single step, rather than fooling around with trying to hook in the System Workbench toolchain at this juncture.
I copied build_elua.lua to a new file build_romfs.lua and then commented-out everything except for that top-level routine: make_romfs(). Then I ran it with Lua5.1 and observed how it croaked. I then incrementally added things back until it stopped croaking (and took out a couple things wired into the build system that I knew I didn't need) and wound up with a minimal tool that generates the file.
The build_romfs ostensibly can operate in three modes:
- 'verbatim', which copies all files as-is into the image
- 'compressed', which runs the files through a minifier first, before copying into the image
- 'compiled', which compiles them to bytecode first, before copying into the image.
The first two are straightforward, and are supported. The last one is useful, but not supported at this time because I have to build a special version of the compiler 'luac.exe' to make this work. This feature is expected to be beneficial because the parser for the Lua interpreter apparently can be RAM-intensive, and so it would be beneficial to avoid the compilation step on-device. However, this will be some work on my part to get running, and if I am to ever get the patches in eLua ported forward into 5.3.4 (as I want to), then I'd rather tackle messing with a special luac after that happens. This device should have enough RAM to allow me to at least continue the evaluation without prematurely performing that optimization.
OK, so the build_romfs tool emits a single file: a header which defines a C array that is the filesystem image. It is included in one module only (as it must since the definition is in the header), so I simply copy it over and start the build again.
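The 'C array in a header' part is easy to picture with a sketch. The function below (emit_c_array, a made-up name) only shows how raw bytes turn into the source text of an array; it says nothing about how eLua lays out files inside the image, which this sketch does not attempt to reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Write 'len' bytes of 'image' as the source text of a C array called
   'name' into 'out' (capacity 'cap'). Returns the number of characters
   produced, or 0 if 'out' is too small. */
size_t emit_c_array(char *out, size_t cap, const char *name,
                    const unsigned char *image, size_t len)
{
    assert(out != NULL && name != NULL && image != NULL);
    int n = snprintf(out, cap, "const unsigned char %s[] = {", name);
    if (n < 0 || (size_t)n >= cap)
        return 0;
    size_t used = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        /* first element gets a leading space, later ones a comma */
        n = snprintf(out + used, cap - used, "%s0x%02x",
                     i ? ", " : " ", (unsigned)image[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return 0;
        used += (size_t)n;
    }

    n = snprintf(out + used, cap - used, " };\n");
    if (n < 0 || (size_t)n >= cap - used)
        return 0;
    return used + (size_t)n;
}
```

Because the output is a definition rather than a declaration, it can only be included from one source file, which matches the 'one module only' remark above.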
To my delight (and no small surprise), the build finished compiling. There are a few warnings, but they are minor. It does not link, however. I need to pull some more source files. These appear to be platform support source files, so I guess platform_int.c is not the sole interface. (In retrospect, I now know that this is just for interrupts. There are many others for the other peripherals.)
This will probably be an incremental process, because I imagine that the board configuration will define what modules are actually needed. I start pulling the currently required ones in, and stubbing the implementation out as I did for platform_int.c.
I needed to add the uIP library, and several other modules, but most of the work was the usual 'stub out implementation connecting to the platform' that was done in platform.c. There is going to be a lot of work to do here, and I'll have to figure out what is the actual public interface. I also had to twiddle the linker script to emit a few symbols that the eLua relied upon to assess flash usage and the extent of the read-only area for something related to strings.
In the end, I was able to once again finish linking, and I got these size results:
optimization  size
-Og           194872
-Os           180568
-O0           272336
-O1           194496
-O2           194696
-O3           239120
So we still have plenty of room for more code. I don't know how much more will be required, but I don't think it will be much. I suspect the networking and filesystem library code was evicted at link time, so that could add a chunk. Also, if I include SSL support, that will incur a big chunk - maybe 300k or so (based on some previous experience with mbedTLS).
I have no idea about RAM usage, but I did not include the memory allocators from eLua because I have some from FreeRTOS. This chip has two memory regions, so I need to think about how I would want to use those. One is 128 KiB, so I'll pretend that's all I've got for the moment.
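One wrinkle with using the FreeRTOS allocators for Lua: Lua wants a single realloc-style callback, while the stock FreeRTOS heaps offer only pvPortMalloc() and vPortFree(). A bridge might look like the sketch below. It is an assumption about how one could do it, not the memory manager used in this project; heap_alloc and heap_free are PC stand-ins, and lua_alloc_no_realloc is a made-up name with the signature that lua_newstate() expects.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins so the sketch runs on a PC. On the target these would be
   FreeRTOS's pvPortMalloc() and vPortFree(). */
static void *heap_alloc(size_t n) { return malloc(n); }
static void  heap_free(void *p)   { free(p); }

/* Allocator with the shape Lua expects: nsize == 0 means 'free',
   anything else means 'behave like realloc'. */
void *lua_alloc_no_realloc(void *ud, void *ptr, size_t osize, size_t nsize)
{
    (void)ud;
    if (nsize == 0) {
        if (ptr != NULL)
            heap_free(ptr);
        return NULL;
    }
    if (ptr == NULL)
        return heap_alloc(nsize);   /* osize is not a block size in this case */

    /* No realloc available: allocate, copy the surviving bytes, free. */
    void *fresh = heap_alloc(nsize);
    if (fresh == NULL) {
        /* Lua assumes shrinking never fails, so hand back the old block
           (which is big enough) rather than report failure. */
        return (nsize <= osize) ? ptr : NULL;
    }
    memcpy(fresh, ptr, osize < nsize ? osize : nsize);
    heap_free(ptr);
    return fresh;
}
```

The cost is that every resize briefly needs both the old and the new block at once, which matters when the heap is a fixed 128 KiB.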
I'm sure none of this build works, as was the case before, but it does mean that I've got the eLua source building, along with all the peripherals modules that it adds. This is a good thing. Now I need to understand the platform.c and platform_int.c, which is where I need to glue the eLua to the System Workbench code.
Next:
Attempt to implement the platform-specific glue logic for the system and all the peripherals.
-
Mass o' Masochism
01/04/2018 at 16:11 • 0 comments
Summary:
Continuing eLua toolchain setup masochism. I got it mostly compiling, but there is another build tool challenge with 'romfs'.
Deets:
Continuing the self-flagellation with the build system masochism yesterday, my freshly healed wounds are ready for more abuse. But armed with my newfound experiences, I find I have grown stronger in the broken places.
First I needed to get the Lua-based build system up, which entailed getting Lua 5.1.x up (the scripts as-is are not 5.2+ compatible), getting Microsoft Visual Studio installed (required for Lua Rocks when you have to build some modules from source, as we will here), and Lua Rocks.
1) Visual Studio. I installed the free 'community edition' VS 2017, and made sure all the C/C++ stuff was selected to be installed. This is a long, slow process. Oh, you also have to register some account -- don't know what it is for, but I just bound it to my usual dev email.
2) Get Lua 5.1.x up. This wound up being simpler than what I was doing yesterday, because it turns out that Lua Rocks comes packaged with a Lua 5.1 if you don't have a Lua of your own. So that saved a step.
3) Get Lua Rocks installed. This was a little klunky -- certainly not the drop-kick of, say, nodejs and npm, but I was armed for combat after yesterday's boot camp training.
For the curious, setting up the Lua Rocks consisted of:
- start an administrative command prompt. Yes, admin, you'll need it.
- setup the Visual Studio variables, so the compiler can be used. There is a batch file in the installation that will do this for you. It's deep down, so I'll give you the path for my system, which is probably the same as yours, or only trivially different:
"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvars32.bat"
Realize that these vars only last for the current command prompt session; you'll have to invoke that each time you create a new command prompt, so don't close it before the next step.
- install Lua Rocks. You can download it from the Internet - it's just a zip file of stuff. Amongst that stuff is an 'install.bat'. OK, you're going to need a few options. Feel free to run 'install.bat /?' to see all the options, but spoiler is you need '/L /MSVC' which will tell it to install the included Lua (conveniently for our needs version 5.1), and also tell it 'trust me, use the Visual Studio'. The install system tries to be clever, but for some reason it is not clever enough to understand VS2017.
- add some variables to the environment:
To PATH, append
';C:\Program Files (x86)\LuaRocks'
To LUA_PATH, make it equal, or include
'C:\Program Files (x86)\LuaRocks\lua\?.lua;C:\Program Files (x86)\LuaRocks\lua\?\init.lua;C:\Program Files (x86)\LuaRocks\systree\share\lua\5.1\?.lua;C:\Program Files (x86)\LuaRocks\systree\share\lua\5.1\?\init.lua;;'
note the trailing double semicolons!
To LUA_CPATH, make it equal, or include
'C:\Program Files (x86)\LuaRocks\systree\lib\lua\5.1\?.dll;;'
again, note the trailing double-semicolons!
- Fun: since you have altered the environment variables spec, those changes will not be reflected into your current command prompt session. You can either manually set them there, or you can close-and-reopen the command prompt. If you do close-and-reopen, make sure once again that it is 'admin', and also set the Visual Studio environment variables as mentioned earlier. Whee!
- Install the required rocks:
luarocks install luafilesystem
luarocks install lpack
luarocks install md5
The first and last will require Visual Studio to compile the C-side code, so you should see some of that happening. Once you've got these three rocks installed, your prerequisites are set up.
- Note: the Lua packaged with Lua Rocks is named 'Lua5.1.exe' so that's the name you must use to run it on the scripts, rather than the typical 'lua.exe'.
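About those trailing double semicolons: Lua replaces any ';;' in LUA_PATH or LUA_CPATH with its built-in default path, so ending the value with ';;' means 'and then look in the usual places too'. The sketch below imitates that substitution to show the effect. It is a simplified illustration, not Lua's own code, and expand_lua_path is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expand every ";;" in 'path' to ";<defaults>;" and write the result to
   'out'. Returns 0 on success, -1 if 'out' (capacity 'cap') is too small.
   A simplified imitation of the behaviour, not Lua's implementation. */
int expand_lua_path(const char *path, const char *defaults,
                    char *out, size_t cap)
{
    assert(path != NULL && defaults != NULL && out != NULL);
    assert(cap > 0);
    size_t used = 0;
    size_t dlen = strlen(defaults);

    while (*path != '\0') {
        if (path[0] == ';' && path[1] == ';') {
            /* need room for ';' + defaults + ';' plus the final NUL */
            if (used + dlen + 2 >= cap)
                return -1;
            out[used++] = ';';
            memcpy(out + used, defaults, dlen);
            used += dlen;
            out[used++] = ';';
            path += 2;
        } else {
            if (used + 1 >= cap)
                return -1;
            out[used++] = *path++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

Without the ';;', the variable replaces the default search path outright, which is why a module that lives in a default location can suddenly go missing.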
You won't need the admin console or Visual Studio for the next things, so you can close the current command prompt if you like. (or not)
For my next amazing feat, we invoke the build system. At this juncture, I'm just wanting to have it spew out the machine-generated header that I was having trouble with yesterday. But there is a little bit more to do.
The build system, at a minimum, is invoked thusly (according to handy comments therein):
'lua build_elua.lua board=MIZAR32'
which in our case translates to:
'lua5.1.exe build_elua.lua board=MIZAR32'
So, we will need at a minimum a 'board' specification, and indirectly a processor specification. Rummaging through the source tree I found some buried treasure:
- 'elua\boards\known\netduino.lua'
- 'elua\boards\known\netduino-2.lua'
Hmm! Someone has had this thinking before! However, these are not the ones we need, we still have to make a new one for the Netduino Plus 2. The first one is for an Atmel processor, and the second one is for an STM32, but a different chip, and different board definition. The second is logically closer to what we need, so I copied that to 'netduinoplus2.lua', and made a couple edits. I don't know what I'm doing yet, so I kept the edits to a minimum.
First, I know that the processor is wrong. The netduino-2 uses an STM32F205, and we need an STM32F405. There is no chip support for the '405, but the '405 is similar to the '407, so I changed it to that:
cpu = 'stm32f205rf',
Also, I changed the 'clocks' since I know this board runs at a higher rate:
clocks = { external = 25000000, cpu = 120000000 }
Those are the only two things I dared to change yet, until I understand the other stuff more fully.
So, I invoke the build system (from the 'eLua' project base directory):
lua5.1.exe build_elua.lua board=netduinoplus2
and, presto! it generates a header file, and then immediately fails to carry on because of the lack of a known toolchain. I expected the latter, so I am actually pleased with the progress. (Later I may see if I can somehow wire in the System Workbench command-line tools; it is based on gcc by 'Linaro', and seems similar to the 'codesourcery' ones that the eLua build system understands.)
I pluck my freshly minted header from:
elua\boards\headers\board_netduinoplus2.h
and put that into position in my System Workbench project, and clean, refresh, build (oh, if you haven't ever used Eclipse before, you will become familiar with that sequence. Eclipse tends to lose its marbles easily, and give really weird compile errors. This especially happens when adding/moving source around. Too much caching!)
The build fails spectacularly in
NYETduinoPlusLua\Src\elua\src\platform\stm32f4\platform_int.c
but at least not on missing the header! Rummaging through that file, this seems to function at the edge between the eLua C code, and the chip-specific library code.
OK, time to kvetch a little (but not nearly as much as yesterday!). The eLua project includes copies of old libraries. As mentioned before the Lua is almost 10 years old (two significant api-changing releases behind), and also the 'fatfs' is old, the 'uip' is old, and the 'STM32 peripheral libraries' are old, old, old. Also, I am not going to use the STM32 peripherals library, but rather the alternative 'HAL' libraries that are emitted by the STM32CubeMX and apparently are favored go-forward by STMicroelectronics. (I actually like the older libraries better from a code quality standpoint, but the project that I will be migrating this work to will be using the new libraries, so that drives my decision).
So, back to platform_int.c errors. I stubbed out the implementation of nearly every function in that module just to get it to compile so that I can carry on assessing what the damage looks like for the rest of the project. To my delight, the rest of the project compiles up to a point. In a way this makes sense if platform_int.c is the sole interface to the native chip support libraries, and I certainly hope this bears out to be true, because that will be happy news indeed: less work == more play! Live like a caveman!
The build stopped at a point in 'romfs', complaining about the absence of another header:
#include "romfiles.h"
and later in that source file I see some missing symbols for:
romfiles_fs
OK, so there is more build system generated code. I guess when it croaked on the absent toolchain, it did not get to the step of creating that stuff. I think I know what it is, though. The eLua system supports a read-only image of a pile of Lua scripts that it builds into a bespoke filesystem, and converts that to literal C data for inclusion. Now I will need to study how that tool works so I can invoke it manually, or else I will need to figure out the toolchain stuff sooner than later.
Next:
Figure out what to do about the 'romfs' image generation.
-
eLua and eGads!
01/04/2018 at 01:29 • 0 comments
Summary:
I tried a simple and naive inclusion of the eLua source. This was not successful. More effort will be required.
Deets:
Encouraged with how trivially simple it was to integrate the canonical 5.3.4 source into my project, I tossed all that and attempted the same with the eLua source. More precisely, I included the eLua's copy of the canonical Lua source in the same way. This didn't work at all, there were other needed modules from elsewhere in the source tree. That's not a complete surprise.
I took a look at the diff of the eLua 5.1.4 source and the canonical 5.1.4 source, and there are extensive modifications (presumably mostly embodying the so-called Lua Tiny RAM ('LTR') patch). So, for now, I decided to defer studying that patch to see how to mod 5.3. I'll just settle for 5.1.x to get that running.
Remember how I was saying that I was hoping to not have to get into learning another build system, and just include the source directly? Well, I can kiss that dream goodbye for several reasons:
- There's generated code. So I've got to understand the build system to understand what is being generated and put where. So far at least it seems only to be generating some headers out of... stuff! But I have to find all that stuff, wherever it is, and get it into my build system (i.e. 'System Workbench for STM32'). This is time-consuming and quite boring.
- There's generated include and source paths. It's a personal peeve of mine when #include "xxx" means 'xxx is somewhere, you'll have to find it and add the path to your includes' as opposed to 'xxx includes an explicit relative path to this file' so I don't have to do that, and I don't have to reverse-engineer the build system to figure out where it is and where it is set.
Anyway, maybe I'll bite the bullet and try to run the build system a little at least to make it emit the generated files, and carry on from that point. The build system won't work to completion, of course, because the compiler toolchain is not visible to it.
- The build system is 'scons', as per project documentation, which is based on python. This is yet another 'make' replacement, apparently in the cmake tradition of puking out toolchain native project files based on some configuration. I really am not in the mood to install python just to drive a build system. Also, the documentation refers to some configuration that is contained in 'SConstruct', but I cannot find this file. After rummaging through the source tree, I think I figured out why...
- The build system is Lua, as per reality. At some point, the developers may have come to a similar viewpoint regarding the python dependency, and they created their own (I think) build system based on Lua scripts. It might have been nice to have updated the project documentation to that effect (e.g. on the 'Building eLua' page, http://www.eluaproject.net/doc/v0.9/en_building.html)!
OK, I like to kvetch. Here's another one: so, I build my Lua. Then I install the package manager, LuaRocks. So, I have to rebuild my Lua, and move it to some apparently conventional tree structure, then I install the package manager, LuaRocks. OK, so I have to install Visual Studio and rebuild my Lua, and install it to an apparently conventional tree structure, then I install the package manager, LuaRocks. OK, so I have to put a bunch of command-line overrides on the install.bat to LuaRocks, because for some reason it wants to deploy its 'rocks' into c:\, and it's necessary to have the Visual Studio build environment setup before invoking the installer, and you need to tell it 'don't try to figure out what Visual Studio is installed, you won't do it right, anyway'. Sweet Jesus! But at length I have a presumably usable Lua, and I got its rocks off 'luafilesystem', 'lpack', and 'md5'.
Now, from the cleverly named 'build_elua.lua' and handy comments within, I am able to try to invoke it.
Fail. I have to fiddle with some environment variables, and have to non-intuitively (though perhaps obvious to an experienced Lua user), be sure to put two consecutive semicolons at the end of LUA_PATH, since that apparently means 'oh, also look for modules in the baked-in default locations'. And then...
Fail. "lua: .\utils\build.lua:3: attempt to call a nil value (global 'module')" I am not sure what this is; I'm going to have to reverse engineer the build scripts. I have a suspicion that this might be a garden-variety 5.1 vs 5.3 issue, though. Yes, in retrospect, perhaps I should have built and installed a 10-year-old version of Lua on the build system, instead of the current version, what with the old version also being the source for the build targets themselves.
Sigh. Well, at least Lua proper is easy to build. And I am now an expert getting the Lua Rocks off the Internets. But this has been a whole day's work, and that does not make me happy. But these are the vagaries of working with other people's code, and I'm sure the system has some charms I simply have failed to appreciate at this point....
Next:
More build system masochism.
-
Ye Olde Build System Setup
01/03/2018 at 00:32 • 0 comments
Summary:
Setting up the build environment, and debugger.
Deets:
For my first amazing feat, I get the debugging pod working, and verify that I can program the device.
I have done this before, so for the moment I'll just link a reference to another project where I went into the details of such at greater length, q.v.:
https://hackaday.io/project/25616/log/62106-setting-up-build-environment
https://hackaday.io/project/25616/log/62150-torment-and-torture-by-tortuous-tools
https://hackaday.io/project/25616/log/62581-tool-everything-tool-terrible-two-openocd-rift
https://hackaday.io/project/25616/log/62656-serial-killer-the-sorrow-of-sad-sorry-serial-stuff
ST Microelectronics ('ST') has a 'wizard' style tool called 'STM32CubeMX' which lets you select peripherals, pinouts, clocks, etc., and then it generates a skeleton project with all the relevant libraries referenced and initialized. The code generated is oftentimes of dubious quality, but it certainly is convenient, and the device has a large flash, so I usually opt for the convenience for starters, then optimize if needed later with custom code.
I created a boilerplate config file for STM32CubeMX describing the Netduino Plus 2 by meticulously combing through the schematic and reflecting those clocks, pin assignments, and peripheral choices into the board description. I saved this off as 'NetduinoPlus2.ioc' which can be used as a basis for starting other projects for that board. Then I made a copy for the NYETduinoPlusLua project and generated the skeleton.
I built the project and verified that I could flash the firmware and step through the code. It does nothing - not even a 'blinky', but that's good enough for me since I've worked with this toolchain before.
Eager for an early 'go/no-go' on Lua, I decided to simply dump the Lua source into the project, and build it and see what happens. This is the official Lua source, version 5.3.4 specifically, and it is structured simply (just a bunch of source and headers in one directory). I copied that stuff over and included it all (except for luac.c, which is the stand-alone compiler) and built it.
To my great surprise, it compiled with not a single error/warning. This is very rare, but Lua is known for its ease of integration.
When built as non-optimized for debug (-Og), arm-none-eabi-size "NYETDuinoPlusLua.elf" reports:
(-Og):
  text   data   bss    dec     hex    filename
  157756 560    2912   161228  275cc  NYETDuinoPlusLua.elf
For completeness, and for the curious, I built it with the other optimization levels, as well, and have included their final statistics:
(-Os):
  147832 560    2912   151304  24f08  NYETDuinoPlusLua.elf
(-O0):
  217364 560    2912   220836  35ea4  NYETDuinoPlusLua.elf
(-O1):
  157660 560    2912   161132  2756c  NYETDuinoPlusLua.elf
(-O2):
  160292 560    2912   163764  27fb4  NYETDuinoPlusLua.elf
(-O3):
  201324 560    2912   204796  31ffc  NYETDuinoPlusLua.elf
So that's pretty compact! Lua is known for being a 'bare-bones' system out-of-the-box, so this doubtlessly includes nothing other than the interpreter, runtime, and a few standard libraries. But it does look like there is breathing room to add stuff on this device, which has 1MB flash. I have no idea about the RAM usage at this point, though. There's a little bit of initialized static/global data (.data) and uninitialized (.bss), but who knows about runtime heap usage.
On a lark, I wired in its main, passing dummy parameters for the expected argc, argv, and stepped through it. It did allocate and initialize the environment, so I let it go and it made a quick trip to the fault handler. I'm not really surprised by this, since surely there would be a bunch of stuff needed to be customized -- if nothing else stdin and stdout. This was just a quick smoke test, and as such I'm not going to try debugging into it.
For my next test, I am going to try to do something similar with the eLua source. This source is for a much older Lua: 5.1.4. I really want 5.3.x, but I'll try to get the old stuff working for real first, and also assess what is needed to upgrade to 5.3.x. Maybe the eLua maintainers are holding it back for some good reason.
Next:
Try a smoke-test with eLua source