-
More refactoring
11/05/2016 at 16:42 • 0 comments
Supporting the use case of writing dictionary words to Non Volatile Memory, and defining the startup behavior, required more refactoring. Again, there were a couple of places where I could improve the readability of the code while reducing its footprint (both Flash and RAM). Some of the changes simplified context switching for background tasks. The new code is on GitHub, and there are new hex files in the files section of this project page.
-
Reviewing SDCC integration
11/01/2016 at 08:43 • 0 comments
Before I pushed the optimized code to GitHub, I had a second look at how to play nice with SDCC variable initialization. There is very little documentation on the linker (which is hidden in the SDCC binary) and on its way of allocating memory. Fortunately, a lot can be learned from the SDCC source code!
One of the problems I have to work around is finding out where the INITIALIZER area ends, at least until I figure out how to use the local symbols s_INITIALIZER and l_INITIALIZER in forth.asm.
My current work-around is to get the address of the INITIALIZER area in forth.asm and hope that it's the last file to allocate space there. Maybe I can influence this by always putting forth.rel at the end of the SDCC call in the Makefile.
;       nvdat ( -- )
;       Data about Non Volatile Memory
        .dw     LINK
        .ifne   WORDS_LINKCOMP
        LINK =  .
        .db     (5)
        .ascii  "nvdat"
        .endif
NVDAT:
        CALL    DOVAR
        .dw     LASTN           ; Init for LASTN
        .dw     END_SDCC_FLASH  ; init for USRCPNVM (free Flash starts here)
        .dw     USRPOOL         ; Next RAM cell for indirect variables
        .endif
;===============================================================
        LASTN = LINK            ;last name defined
        .area CODE
        .area INITIALIZER
        END_SDCC_FLASH = .
        .area CABS (ABS)
-
When less is more
10/30/2016 at 12:42 • 0 comments
eForth was designed to facilitate creating a Forth environment for a target computing architecture by someone who is fluent in assembly but new to the world of Forth (with fancy things like meta-compiling, extensible syntax, and multi-user environments).
This goal was achieved by coding a small number of core words in assembly, and building the interpreter, the compiler, and all the rest out of those. However, many words are only useful for extending the interpreter or compiler, or for creating new structural words like SWITCH and CASE. When embedding Forth into a computing device this flexibility is rarely needed.
There are now new configuration flags to choose which of the base words for interpreter, compiler, or I/O and string handling should be visible in the dictionary. Most beginners won't need them, and an advanced user will know when adding them to the binary is required. There is a double benefit: a reduced footprint, and better usability!
In the default configuration the W1209 dictionary now shows the following words:
RAM NVM ADC@ ADC! OUT! P7S E7S BKEY LOCKF ULOCKF LOCK ULOCK 2C@ 2C! BSR 0= WORDS .S DUMP VARIABLE CREATE IMMEDIATE : CALL, ] ; ." ABORT" AFT REPEAT WHILE ELSE THEN IF AGAIN UNTIL BEGIN NEXT FOR LITERAL C, , ALLOT ' [ NAME? \ ( .( ? . U. U.R .R CR TYPE SPACES SPACE NUF? KEY DECIMAL HEX ERASE FILL CMOVE @EXECUTE PAD 2@ 2! +! PICK 2/ 1- 1+ 2* 2- 2+ */ */MOD M* * UM* / MOD /MOD M/MOD UM/MOD WITHIN MIN MAX < U< = ABS - DNEGATE NEGATE NOT + 2DUP 2DROP ROT ?DUP BG TIM -1 1 0 BL OUT BASE UM+ XOR OR AND 0< OVER SWAP DUP DROP >R R@ R> C@ C! @ ! EMIT ?KEY COLD
For the beginner this is much more readable, and removing 79 base words from the dictionary freed up almost 700 bytes for the application (the code is still there!).
But that's not all: I also rewrote some of the core code to take advantage of SP based addressing modes. This reduces Flash and RAM usage, makes the code cleaner, and also reduces the time for a context switch in background operation.
Now the bare-bones CORE target for an interactive Forth system requires just 4690 bytes, and the W1209 binary with background operation, compile to Flash, and 7S-Display fits in less than 5700 bytes (down from more than 6500 bytes).
I also looked into swapping the X and Y registers, which should free up about 170 bytes, but I won't do that until I have automated regression tests for the Forth core words.
-
Forth to Flash
10/29/2016 at 11:16 • 0 comments
As outlined in the previous log entry, compiling Forth to NVM requires some changes. I had to refactor some code (e.g. the way PAD works in a background task) but now I can compile words to Flash, and use them together with a volatile vocabulary in RAM.
Did I mention that this is a great way to do automated tests on different integration levels from unit tests to SW-HW-integration?
Now I did.
By the way, I just pushed the changes to GitHub.
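A session that mixes a Flash-resident word with a RAM-resident one might look like the sketch below. It assumes that the dictionary words NVM and RAM switch the compile target to Flash and back to RAM; the word names double and quad are made up for illustration.

```forth
\ Sketch only: assumes NVM selects Flash as the compile target
\ and RAM selects the volatile dictionary again.
DECIMAL
NVM                                   \ following definitions go to Flash
: double ( n -- 2n ) 2* ;
RAM                                   \ back to the volatile dictionary
: quad ( n -- 4n ) double double ;    \ RAM word calling a Flash word
5 quad .
```

If the assumptions hold, the last line prints 20, and only quad is lost when the volatile dictionary is cleared.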
-
Towards compiling Forth to Non Volatile Memory
10/23/2016 at 20:26 • 0 comments
In many cases, native Forth environments compile to RAM, and the original STM8S eForth is no exception. In theory it's easy: just write the code into Non Volatile Memory (NVRAM, Flash) instead of RAM, but in practice many µCs either don't have that, or they can't write to the Flash memory area from which code is executed.
Here, the STM8S architecture has an advantage: writing to Flash is very easy. After unlocking the memory area, writing to it with a simple LD or LDW instruction just works (the memory write control automatically adds wait-states).
The problem that remains to be solved is that Forth uses two RAM areas just behind the user-defined words as temporary storage (the interpreter uses some bytes for looking up words, and the so-called PAD area for building output strings). In eForth, compiling words to other memory areas (e.g. NVM) also moves the temporary storage. The idea of writing temporary data to Flash just doesn't sound right, even if it works.
The second problem is concatenating the core dictionary with user-defined dictionaries in RAM and NVM. Some design decisions have to be made, e.g. should extending the NVM dictionary be possible, or is re-writing from scratch sufficient?
I now have a nice solution for the first problem, but I'm not satisfied with what I have for the second problem. Stay tuned.
EDIT: Code execution from EEPROM leads to hardware reset. An earlier version of this log wrongly assumed that RAM and EEPROM have the same execution properties!
-
Recursive functions in Forth with STC
10/19/2016 at 21:33 • 0 comments
STC, or Subroutine Threaded Code, is a Forth coding technique that implements the Inner Interpreter through simple CALL instructions. This convenient coding technique, used by STM8EF, comes at a price:
- increased code size: CALL + word address,
- some of the more nifty Forth features no longer 'just work'
The code is still compact, no problem with that. It's the nifty features not working that bugs me ;-)
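To make the size cost concrete, here is a sketch of what STC presumably generates for a small definition. It assumes that every word reference compiles to the STM8 CALL opcode CD followed by a 16-bit address (3 bytes, where a threading scheme that stores plain addresses needs 2), and that ; compiles a RET (opcode 81). The name sq is made up, the addresses are placeholders, and ' is assumed to return the code address.

```forth
\ Sketch only: the byte layout below is an assumption, not a listing.
: sq ( n -- n*n ) DUP * ;
\ Assumed body of sq:
\   CD <addr of DUP>     3 bytes
\   CD <addr of *>       3 bytes
\   81                   RET, 1 byte
HEX ' sq 10 DUMP         \ inspect 16 bytes of compiled code on the target
```

Comparing the DUMP output with the assumed layout is a quick way to check what the compiler really emits on a given build.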
The Forth decoder SEE I fixed early on (mostly). The lack of the "Pearl of Forth" DOES> (which allows things akin to object oriented programming) bugs me much more.
In pursuit of perfection (and of code compilation to Flash) I experimented with fixing RECURSE, which, unsurprisingly, allows for writing recursive routines. Fixing it isn't difficult though: injecting CD, the opcode for a CALL instruction, into the generated code is sufficient.
The following snippet adds recursion to our humble STM8S Forth implementation. With the help of the new word RECURSE (which puts a call to the word currently being defined into the generated code) we then define the well-known recursive implementation of the Fibonacci function. Lastly, we call the function for the numbers 9 through 0, and print the results.
    HEX
    : RECURSE last @ NAME> CALL, ; IMMEDIATE
    : fibonacci DUP 2 < IF DROP 1 ELSE DUP 2 - RECURSE SWAP 1 - RECURSE + THEN ; ok
    : fibnums FOR R@ fibonacci . NEXT ; ok
    9 fibnums 55 34 21 13 8 5 3 2 1 1 ok
By the time I wrote these lines, I noticed that the STM8EF sources already contain a word "CALL,", which is actually implemented as ": CALL, CD C, , ;". However, due to a bug in the original sources it appeared as "CALL" in WORDS. This is fixed now. I also decided to replace "I" with "R@", which is idiomatic in eForth.
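RECURSE works for any self-referencing definition, not just Fibonacci. As a minimal sketch, assuming the RECURSE definition from the snippet above has been entered first, here is a recursive factorial. The word name fact is made up for illustration, and results have to fit into a 16-bit cell, so arguments above 7 overflow a signed result.

```forth
DECIMAL
: fact ( n -- n! )
  DUP 2 < IF
    DROP 1               \ 0! and 1! are both 1
  ELSE
    DUP 1 - RECURSE *    \ n * (n-1)!
  THEN ;
5 fact .
```

The last line should print 120. Each level of recursion costs one return address on the return stack, so deep recursion on a small µC needs some care.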
One hack more (recursive programming on a humble µC), and one bug less (pushed to GitHub).
-
Progress, refactoring, and keeping the docs up-to-date
10/17/2016 at 18:30 • 0 comments
Spending all the time on implementing new features is tempting. Experience teaches that a sizable amount of time has to be spent on refactoring the code (deleting cruft, rewriting code before even I struggle to understand the original intentions). That's well understood, since that's what makes code different from a quick hack.
However, the same applies to docs: they often start their life as snippets of information on a scratch pad during a hacking session. Later, at least half of the information is no longer true, or is incomprehensible even to the author. Note to self: don't underestimate the time necessary for refactoring the docs ;-)
This said, the docs in the GitHub Wiki got some love. It's never enough, I know.
-
Makefile creates hex-files for all target boards
10/15/2016 at 22:31 • 0 comments
The latest rewrite of the Makefile creates the hex-files for the target boards "CORE", "MINDEV", "W1209", and "C0135" by running a make.
If you want to build and flash just one target, e.g. W1209, instead run
make BOARD=W1209 flash
By the way, I also published a snapshot of the hex-files for the boards mentioned above in the files section. For first tests, e.g. with the STM8S003F3P6 "minimal development board", all you need is an ST-LINK V2 adapter, and a serial interface with "TTL" level.
-
Improved access to board docs
10/15/2016 at 11:24 • 0 comments
The best documentation is useless if no one finds it. I rewrote the project description to make the docs on GitHub more accessible.
-
STM8: writing time critical code
10/14/2016 at 18:50 • 0 comments
The STM8S architecture uses a 3-stage pipeline, and the length of instructions varies from 1 to 5 bytes with an average size of 2 bytes. A fetch from Flash gets 32 bits (4 bytes) in one clock cycle, and because of this the instruction length has little impact on the execution time. However, when the pipeline needs to be flushed, the execution time depends heavily on the location of instructions relative to a 32-bit boundary: in tight "spin loops" I observed execution time changes of more than 20% when I moved a routine by one byte!
That said, the performance of code in RAM *) isn't as good as from Flash, since the code has to be fetched byte by byte instead of 32 bits at a time. On the bright side, the runtime is independent of the location.
So, when testing Forth code in RAM, don't expect it to perform the same way in Flash memory. My advice is to always use a hardware timer for time-critical code on the STM8S!
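One way to see such differences without counting cycles is to read a timer before and after the code under test. The sketch below assumes that the dictionary word TIM ( -- n ) pushes a free-running tick counter driven by a hardware timer; the tick period is not stated in this log. The names spin and elapsed are made up for illustration.

```forth
\ Sketch only: assumes TIM ( -- n ) returns a free-running tick count.
DECIMAL
: spin ( n -- ) FOR NEXT ;    \ empty busy loop as the code under test
: elapsed ( n -- ticks )
  TIM >R                      \ remember the start tick
  spin                        \ run the code under test
  TIM R> - ;                  \ end tick minus start tick
1000 elapsed .
```

Running the same measurement with the words compiled to RAM and then to Flash should make the difference described above visible, as long as the loop runs for many ticks.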
*) Code execution from EEPROM leads to hardware reset. An earlier version of this log wrongly assumed that RAM and EEPROM have the same execution properties!