We live in a society
In a perfect world, you would be able to get any technical data with a snap of your fingers, download firmware for your kettle and install a custom boot logo on your smart fridge, reflow an MCU with more RAM onto your Roomba and install a chatbot in your doorbell. Alas, the society we live in likes secrets. If you buy a Chinese car battery charger and, while trying to add remote data logging, break a pin off the main MCU as you solder a jumper wire to it, you will lose the display functions of that charger. A replacement MCU is readily available for purchase, but it won't help you because you don't have the firmware for it.
You can imagine my disappointment when I found myself in exactly that situation. Well, I can read the firmware off the broken MCU, can't I? I had a suitable programmer, and after soldering a programming header to the pads kindly provided by the manufacturer I pressed the "download firmware" button, but that didn't work. You see, keeping secrets is a lucrative business, and if you provide some "security" functionality in your product, people are more likely to choose it over the competition. Thus nearly every MCU on the market nowadays has some kind of "read-out protection", or ROP for short. That's just a bit/byte flag somewhere in the settings section of the MCU's flash memory that, if set during programming, prevents any external reads of the memory. ROP can be turned off by setting that option byte to 0, but that will automatically erase the whole flash - not what we want at this point.
If you think about it, the only thing more profitable than keeping secrets is uncovering them by breaking whatever sustains the secrecy. And so some clever men thought of a way to circumvent the ROP on an MCU. What if, they thought, we just briefly (very briefly) turn the power off and on again, at the exact moment when the chip reads its settings to understand how to behave? If we do this fast and precisely enough, the processor won't reboot but will read ``0`` where there should be ``1`` and let us download everything it has stored in its flash. I don't remember, and can't be bothered to search, which processor was the first to have this sort of attack documented and published, but since then many more models have been successfully broken into, and it's not hard to find detailed write-ups about the exact model you are dealing with. And that's exactly what I did.
Research
I searched whether the STM8S005K6T6C can be glitched into submission. And the answer was YES. There are many different models of STM8, but the general consensus is that most of them can be dealt with (yet to be proven). At that time I had no experience or specific knowledge beyond having watched a few presentations on YouTube. But I was encouraged by a blog post I had found. The apparent simplicity of the described procedure inspired me to try to recreate the success with my own hardware.
Development
I designed a simple PCB that broke out the necessary connections to the MCU and contained mosfets for VCC and GND, plus a mosfet to pull the VCAP pin to GND - that was the proposed way to glitch this MCU. To be able to change the VCC voltage if needed, I also added a step-down voltage converter module. I also needed something to produce the glitch itself. Most hackers who do this kind of stuff go big and use an FPGA for that, but I had never used one and didn't want to deal with it for such a minor (yeah sure) project. Other people used anything from an Arduino to a 555 timer to a toggle switch and said that it can be done if you are motivated enough. But the general suggestion was to use something that can switch GPIOs fast and keep timings tight. I decided to use an RP2040 board for this, as I had heard that it was capable of both.
I ordered two new chips from AliExpress and a Waveshare RP2040-Zero, and etched a PCB. And then I let it all lie in a box for one and a half years.
Usually, I don't finish most of my projects, but as a challenge and a source of dopamine, I try to from time to time. Also, I really wanted to be able to see my battery stats while charging (and using a charger in a semi-disassembled state is not something you often find recommended by people with a degree in common sense).
Research. Take 2
It's interesting how much you can forget while working on different projects, and it's amazing how much information you can miss even after reading an article 3 times. So I started reading and researching anew to see if there were some more specific instructions I could have missed the first time around. Here's the gist of it:
- most STM8 variants have a VCAP pin, which is the output of an internal voltage regulator that provides the core with 1.8V. The pin itself is used to connect a filtering capacitor, and without that cap the MCU simply can't work
- the idea is to pull VCAP to GND briefly and produce a glitch at the moment when the option bytes (one of which holds ROP) are read
- you need a SWIM-capable programmer. STLINK-V2 clones will do, or you can try to build your own ESP-based SWIM programmer
- you will use the programmer's RESET pin going HIGH as the starting point for a countdown to the actual glitch
- the ROP byte is read once at boot time and the flash is then either made readable or not, so you only need to succeed with a glitch once
- some STM8 variants have a built-in UART bootloader. It is disabled by default and also disabled whenever ROP is set, so using it to validate the glitch isn't practical
- to validate the glitch you can read the first byte of flash and test it against a known value. On some variants, when ROP is set, all flash reads return ``0``; on others they return ``0x71``
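The validation check from the last bullet can be sketched in a few lines of Python. This is only an illustration: `PROTECTED_FILL` and `looks_unprotected` are hypothetical names, and the two filler values are simply the ones mentioned above; other variants may return something else.

```python
# Filler values a read-protected STM8 is reported to return (from the list above).
# Assumption: your variant uses one of these; extend the set if it does not.
PROTECTED_FILL = {0x00, 0x71}


def looks_unprotected(first_byte: int) -> bool:
    """Return True when the first flash byte is not a known ROP filler value."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must fit in one byte")
    return first_byte not in PROTECTED_FILL


if __name__ == "__main__":
    # 0x82 is just an arbitrary non-filler byte used for the demonstration.
    for value in (0x00, 0x71, 0x82):
        verdict = "readable" if looks_unprotected(value) else "still protected"
        print(f"0x{value:02X} -> {verdict}")
```

A single byte can give a false negative if the real firmware happens to start with one of the filler values, so comparing against a byte you know should be at that address is the safer test.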
Determining timings
STM8 is a pretty low-end chip and it's not very fast by today's standards. Normally it runs at 8-16MHz, but if you read the datasheet you will find a mention that it runs at only 2MHz at boot time, when it reads the option byte we are interested in. At 2MHz one clock cycle lasts 500ns, and the STM8 core takes 1-2 cycles for most instructions, so we can assume that our target glitch pulse should be 500-1000+ ns. To succeed you also need to know exactly when to inject the glitch, and there are several ways to get this information. The author of the aforementioned blog post connected the VCAP line to an oscilloscope and reduced the size of the capacitor to be able to detect the voltage drop that would indicate the read operation (assuming that the first thing our MCU does is read the option bytes and EEPROM). Also, the datasheet states that ``Reset pin release to vector fetch`` takes a maximum of 150us. On another note, when I was struggling with the glitch and trying anything that came to mind, I ran a test with a simple firmware and measured ~930us from the RESET pin going HIGH to the firmware reaching its main loop.
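The arithmetic behind those numbers is easy to check. A minimal sketch, using only the 2MHz boot clock, the 1-2 cycle instruction length and the 150us datasheet figure quoted above (the function and constant names are made up for illustration):

```python
BOOT_CLOCK_HZ = 2_000_000         # boot-time clock quoted in the text
RESET_TO_FETCH_MAX_NS = 150_000   # "Reset pin release to vector fetch" maximum


def cycle_ns(clock_hz: int) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def pulse_window_ns(clock_hz: int, min_cycles: int = 1, max_cycles: int = 2):
    """Glitch widths that cover between min_cycles and max_cycles core cycles."""
    period = cycle_ns(clock_hz)
    return (min_cycles * period, max_cycles * period)


if __name__ == "__main__":
    print("cycle length (ns):", cycle_ns(BOOT_CLOCK_HZ))
    print("pulse window (ns):", pulse_window_ns(BOOT_CLOCK_HZ))
    # Upper bound on how many boot-clock cycles fit before the vector fetch
    print("cycles in 150us:", RESET_TO_FETCH_MAX_NS / cycle_ns(BOOT_CLOCK_HZ))
```

This only bounds the search space; the real delay still has to be found by experiment.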
Development (continues)
I was going to need software that would do the glitch. And as I was not sure about the timings and other aspects of it, I wanted to have the flexibility to experiment and find what exactly works. I decided to put a simple firmware on the RP2040 that listens for a UART command with the exact settings for a glitch and then fires it after the chosen delay once it sees the RESET line go HIGH. The RP2040 is connected to a PC running a Python script that sends a command with different glitch settings and then, to validate the attempt, runs stm8flash to read out the first byte of flash. If that byte is not ``0``, the script repeats the glitch with the timings it found and tries to read the full firmware off the target. This way I could easily brute-force the exact timing for both the delay and the glitch pulse itself.
Before diving deeper I thought that precise timing would be crucial, and decided to do my best to take it off the list of things that could be wrong if things didn't go as planned. For this task I used the RP2040's PIO, which can time precise delays with a resolution of 1 cycle and a minimum length of 2 cycles. I also set the clock of my RP2040 to 200MHz to get a nice 5ns resolution.
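The search loop on the PC side can be sketched like this. It is not the actual script: `sweep` and `fake_attempt` are hypothetical names, and the real attempt (send settings over UART, let the programmer release RESET, run stm8flash, check the first byte) is replaced by a stand-in so the example runs without hardware.

```python
from itertools import product
from typing import Callable, Iterable, Optional, Tuple


def sweep(delays_ns: Iterable[int], pulses_ns: Iterable[int],
          attempt: Callable[[int, int], bool]) -> Optional[Tuple[int, int]]:
    """Try every (delay, pulse) pair and return the first one that works."""
    for delay, pulse in product(delays_ns, pulses_ns):
        if attempt(delay, pulse):
            return delay, pulse
    return None


def fake_attempt(delay_ns: int, pulse_ns: int) -> bool:
    """Stand-in for real hardware: pretends exactly one setting works.

    The values are borrowed from the settings that worked later in this
    write-up; a real attempt would talk to the glitcher and to stm8flash.
    """
    return delay_ns == 50_000 and pulse_ns == 1_150


if __name__ == "__main__":
    found = sweep(range(49_000, 51_001, 50), range(500, 1_501, 50), fake_attempt)
    print("working settings:", found)
```

Passing the attempt in as a callable keeps the search logic testable on its own; swapping in the hardware-backed function does not change the loop.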
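To make the 5ns resolution concrete, here is a small sketch of the conversion between nanoseconds and clock cycles at 200MHz. The names are hypothetical and this is host-side arithmetic only, not the PIO program itself.

```python
SYS_CLOCK_HZ = 200_000_000  # overclocked RP2040 from the text: 5 ns per cycle
MIN_CYCLES = 2              # shortest delay length mentioned in the text


def ns_to_cycles(duration_ns: int, sys_clock_hz: int = SYS_CLOCK_HZ) -> int:
    """Convert nanoseconds to the nearest whole number of clock cycles."""
    cycles = (duration_ns * sys_clock_hz + 500_000_000) // 1_000_000_000
    if cycles < MIN_CYCLES:
        raise ValueError(f"{duration_ns} ns is shorter than {MIN_CYCLES} cycles")
    return cycles


def cycles_to_ns(cycles: int, sys_clock_hz: int = SYS_CLOCK_HZ) -> float:
    """Duration actually produced by a given cycle count."""
    return cycles * 1e9 / sys_clock_hz


if __name__ == "__main__":
    for ns in (50, 1_150, 50_000):
        c = ns_to_cycles(ns)
        print(f"{ns} ns -> {c} cycles -> {cycles_to_ns(c)} ns")
```

Rounding to whole cycles means any requested duration is quantised to a multiple of 5ns, which is why sweeping in steps smaller than that would be pointless at this clock.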
Road to Success
I wired everything up, set initial values for delay and pulse duration, launched the Python script and it went brrr...so fast, I thought. It checked all possible combinations in 50ns steps for both delay and glitch duration. It took maybe 10 minutes, but without any results.
Okay.
Long story short - I spent the next 3 weeks trying different values and fixing my code and hardware setup. Some key points:
- The datasheet suggests a 1uF capacitor on the VCAP line, so most targets you find will have something close to that value. To be able to pull VCAP hard enough you need to change that cap to 100nF (0.1uF). For some reason, my target already had a capacitor of this low value.
- Some people suggested raising the VCC voltage to 3.6V and said that this allowed them to achieve stable glitching. My variant is 5V tolerant, and in my tests the VCC voltage changed almost nothing. But it may vary with different STM8 models.
- If you read the option bytes region with ROP enabled you won't get the actual ROP value (you get ``0``, ``0x71`` or some other arbitrary value for every byte, same as for the flash, as I mentioned before). But you can read the ROP value from your own firmware and spit it out via UART, for example. This is very fast - from RESET HIGH to the first characters on the UART takes less than 1ms. So if you want to run tests on a blank chip, you can use this to crunch through many, many tries per minute, find the timings that work, and then try to glitch the actual target with known good settings.
- There is an observation that you actually need to bring VCAP down to ~1V for a successful glitch. Go lower and it will crash the chip; stay higher (above 1.2V) and it won't do anything.
- ^ thus you will want to add some resistance in the path from VCAP to GND. Also, you will probably need to find a suitable mosfet.
- You may think that you are glitching the core when you're actually not. If you pull VCAP too hard the processor will reboot. If it reboots during a SWIM operation, the ST-Link will detect it and stm8flash will exit with ``SWIM Error 0x04``. This is important: you want to provoke this error to see where the limits of your glitch duration are, or whether your glitch does anything at all. If you are running your setup and have never seen a SWIM error, you're probably not pulling VCAP hard enough (if at all).
- ^ if you have a scope, you can just check whether VCAP goes down to 1V-1.2V. I don't have a scope, so I had to look at secondary indicators.
- You may need to remove the capacitor that sits on the RESET line (the one the datasheet suggests placing there). Until I removed it I couldn't get a clean trigger, and I thought that my code was bad.
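Without a scope, the "secondary indicators" boil down to sorting each attempt into one of three outcomes. A rough sketch of that bookkeeping follows; the names are hypothetical, and it assumes the text "SWIM Error" shows up somewhere in the captured stm8flash output as quoted above, which may differ between builds.

```python
from collections import Counter
from typing import Optional

PROTECTED_FILL = {0x00, 0x71}  # filler values mentioned earlier in the text


def classify_attempt(tool_output: str, first_byte: Optional[int]) -> str:
    """Sort one attempt into 'crashed', 'no_effect' or 'bypassed'."""
    # Assumption: a reboot during SWIM is reported with the words "SWIM Error"
    if "swim error" in tool_output.lower():
        return "crashed"      # glitch too strong: the core rebooted
    if first_byte is None or first_byte in PROTECTED_FILL:
        return "no_effect"    # glitch too weak or badly timed
    return "bypassed"         # a non-filler byte came back


if __name__ == "__main__":
    # Made-up sample results, only to show the tally
    samples = [("", 0x00), ("SWIM Error 0x04", None), ("", 0x00), ("", 0x82)]
    tally = Counter(classify_attempt(out, byte) for out, byte in samples)
    print(dict(tally))
    if tally["crashed"] == 0:
        print("never crashed: VCAP is probably not being pulled hard enough")
```

Watching the ratio of crashes to no-effect runs while changing the pulse length gives a crude picture of where the useful window sits.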
Despair
I left my setup running for 10-20 hours straight, I went to sleep and woke to look at numbers running up in a terminal window. It just didn't work. I couldn't get any results while I was pulling VCAP to GND with a mosfet. Any glitches longer than 600ns just rebooted the processor.
I thought that maybe my version of STM8 was not as susceptible to glitches as the one that was reported working. I couldn't think of anything I hadn't tried, and so I decided to finally ask for help, or at least for some suggestions, from people who had done it before, here: https://github.com/rumpeltux/itooktheredpill/issues/2
Luckily there was one person who had seemingly worked their way through the same process but had an idea that I hadn't considered. szakalit said that he was able to glitch easily after he implemented a circuit to switch the VCAP voltage between 1.8V and 1V. At first I didn't understand it, but after a few questions I learned that you can DISABLE THE INTERNAL REGULATOR.
More development
On some variants of STM8, you have 3(!) voltage input pins. If you disconnect the VDD pin, the internal regulator that converts 5V (3.3V) to the core's 1.8V will stop working. And then you can just supply external power to the VCAP pin and it will work! I guess there is some threshold on the regulator itself that gets triggered, and it either switches off to prevent a short, or maybe it even triggers brown-out detection on the core itself, and that's why it's hard to glitch while relying on the internal regulator.
szakalit used a TS5A3159A analog multiplexer to switch between two power sources of 1.8V and 1V. I couldn't find any analog multiplexers in my stash and didn't want to wait for one to come from a supplier, so I made a simple switching circuit from two N-channel mosfets and one P-channel mosfet: https://tinyurl.com/279rbyw5 I used two small buck converters to produce the needed voltages from a 5V input.
I started the test with a 50us delay and 1150ns glitch pulse, and after only a few tries, I had a successful glitch!!! I couldn't believe it. It was THAT simple. I repeated the test a few more times to check if it wasn't a fluke, and indeed, it wasn't.
Now I had to repeat this with the actual target. I connected everything to the battery charger according to the revised schematic aaand it didn't work.
Again
I don't know if I was more tired or pissed at this point. How can identical chips behave so differently?! This one (the target) just crashed after the first glitch and I couldn't get any communication with it via SWIM. I thought that maybe some components on the board were messing with my procedure, so I carefully transplanted the chip, which now had 3 pins missing, to a test board. This way it should work. NOT. It behaved the same way.
At one point I thought that I had completely bricked the chip and that this whole endeavor was futile. Then I was relieved to realise that I just had to toggle power to the VDD pins after getting a SWIM error (and there were A LOT of them). After turning it off and on again, the core reset and would talk over SWIM again, until it got stuck once more.
Have I mentioned that I was pissed? Yeah, so the best way I came up with at that moment was just to manually wiggle the dupont wire to reset the chip. You have to be either smart or stubborn. At this point, it's clear which one I am. I successfully wiggled a wire (not once but thrice) to read the firmware off that chip.
The next day I remembered that I had a mosfet on the VCC line for exactly this purpose, and after modifying the firmware on the RP2040 I could automate the power toggling to ensure that the core doesn't stay stuck in a broken state.
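The automated version of the wire wiggling is just a retry loop that power cycles whenever an attempt ends in a SWIM error. A minimal host-side sketch, with hypothetical names and the hardware replaced by scripted outcomes so it can run anywhere (the actual logic lives in the RP2040 firmware and may look quite different):

```python
from typing import Callable, Optional


def glitch_until_bypassed(attempt: Callable[[], str],
                          power_cycle: Callable[[], None],
                          max_tries: int) -> Optional[int]:
    """Repeat attempts; power cycle after every crash.

    attempt() must return 'bypassed', 'crashed' or 'no_effect'.
    Returns the 1-based number of the successful attempt, or None.
    """
    for attempt_no in range(1, max_tries + 1):
        outcome = attempt()
        if outcome == "bypassed":
            return attempt_no
        if outcome == "crashed":
            # Toggling RESET is not enough for a stuck core: cut power instead
            power_cycle()
    return None


if __name__ == "__main__":
    scripted = iter(["crashed", "no_effect", "crashed", "bypassed"])
    cycles = []
    result = glitch_until_bypassed(lambda: next(scripted),
                                   lambda: cycles.append("off/on"), max_tries=10)
    print("succeeded on attempt", result, "after", len(cycles), "power cycles")
```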
So what have I learned at this point (time for some more bullet points):
- Not every chip takes glitches gracefully, even when you supply your own 1.8V/1V voltage to VCAP. If it's stuck in a broken state, toggling the RESET line is not enough; you need to power cycle.
- Even with automated power toggling after every SWIM error, it took far more retries on this particular chip to get to a ROP bypass. I was glad that I had found working timings on a far more forgiving chip, so I didn't need to go through the whole process with a chip that clearly didn't want to cooperate.
- 1V may be the culprit of this awful behavior. I have tested with glitches at 1.1V and 1.2V - they worked without so many SWIM errors but with a far lower success rate. BUT if you want (need) to glitch above 1V you can (should) increase the glitch duration. In my tests, it seemed that for every additional 0.1V you need to add ~300ns to the glitch duration. BUT at 1.3V I couldn't get a successful glitch even with a 5000ns pulse after several hours of tests.
- Unrelated to STM8 itself, but you may want to read all flash regions (OPT, EEPROM, FLASH) at once while you have a ROP bypass. My target has something in EEPROM, and a newly burned chip with the extracted firmware wouldn't work without those magic values in EEPROM.
- I also tried to glitch an STM8S103 (just for research's sake), but it has only one VDD pin, so disabling it disables every other peripheral and it just doesn't work. I wasn't successful with a mosfet glitch either, and frankly, I am too burned out at this point to try revising the circuit for it. Maybe if I had a scope...
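The voltage/duration rule of thumb from the list above can be written down as a tiny helper for picking a starting pulse width. This is a guess at a starting point, not a measured curve: combining the 1150ns pulse that worked at 1V with the ~300ns per 0.1V observation is an assumption of this sketch, and the names are hypothetical.

```python
BASE_MV = 1000         # 1.0 V glitch level
BASE_PULSE_NS = 1150   # pulse that worked at 1.0 V on the test chip
NS_PER_100MV = 300     # rough extra duration per additional 0.1 V
MAX_WORKING_MV = 1200  # at 1.3 V no pulse length worked in the tests


def estimate_pulse_ns(glitch_mv: int) -> int:
    """Starting pulse width for a given glitch voltage in millivolts."""
    if not BASE_MV <= glitch_mv <= MAX_WORKING_MV:
        raise ValueError("only 1.0 V to 1.2 V gave successful glitches")
    return BASE_PULSE_NS + NS_PER_100MV * (glitch_mv - BASE_MV) // 100


if __name__ == "__main__":
    for mv in (1000, 1100, 1200):
        print(f"{mv} mV -> start around {estimate_pulse_ns(mv)} ns")
```

Treat the result as the centre of a sweep rather than a value to use as-is; the numbers come from one chip on one bench.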
In the end, I made some changes to stm8flash. I added a check before a readout to see if reading is allowed: stm8flash reads in chunks of 256 bytes, and reading one chunk is much faster than reading the whole flash. I also added a "Read All" mode that reads all flash regions (if ROP is unset) and saves them to 3 different files. With these changes you literally need only one fluke to make it work.
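The idea behind those two changes can be illustrated in Python (the real changes are inside stm8flash and are not reproduced here). Everything in this sketch is hypothetical: the function names, the way "reading is allowed" is decided (one chunk that is not pure filler), and the region addresses in the demo, which are made up and are not real STM8 addresses.

```python
from typing import Callable, Dict, Optional, Tuple

CHUNK = 256                    # read size mentioned in the text
PROTECTED_FILL = {0x00, 0x71}  # filler values mentioned earlier

ReadFn = Callable[[int, int], bytes]  # (address, length) -> data


def chunk_is_filler(chunk: bytes) -> bool:
    """True if the chunk is one repeated ROP filler value."""
    return len(set(chunk)) == 1 and chunk[0] in PROTECTED_FILL


def read_all(read: ReadFn, regions: Dict[str, Tuple[int, int]],
             probe_addr: int) -> Optional[Dict[str, bytes]]:
    """Probe one chunk first; dump every region only if the probe looks real."""
    probe = read(probe_addr, CHUNK)
    if not probe or chunk_is_filler(probe):
        return None  # still protected: bail out quickly and glitch again
    dumps = {}
    for name, (start, size) in regions.items():
        data = bytearray()
        for offset in range(0, size, CHUNK):
            data += read(start + offset, min(CHUNK, size - offset))
        dumps[name] = bytes(data)
    return dumps


if __name__ == "__main__":
    # Made-up layout and a fake memory, just to exercise the logic
    layout = {"OPT": (0x10, 4), "EEPROM": (0x100, 300), "FLASH": (0x1000, 600)}
    fake_read = lambda addr, n: bytes((addr + i) & 0xFF for i in range(n))
    dumps = read_all(fake_read, layout, probe_addr=0x1000)
    print({name: len(data) for name, data in dumps.items()})
```

Writing each returned region to its own file would complete the "Read All" behaviour described above; that part is left out to keep the sketch short.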
The code for the glitcher itself is here: https://github.com/monte-monte/RP2040-Interactive-Glitcher
I want to mention that I wouldn't have even tried all this if people before me hadn't shared their knowledge and success stories. I don't know enough to come up with all the needed theory by myself. So the first thanks goes to them.
Also, I want to thank clever for shedding light on the RP2040's PIO; without his help it would have been much harder.
Thanks to willmore for general support and advice in the electronics domain.
Thanks to rumpeltux, who posted the initial(?) writeup about breaking STM8's ROP and created an active community in the comment section of that post.
Thanks to Jarrett for his post on Hackaday where he described doing all of this with a pair of 555 timers (some people even thought he was trolling with that, lol).
And big thanks to szakalit, who was nice enough to come back to an old discussion and bring the crucial piece of information without which I probably would have just thrown this all away.