Everybody loves a video, so here's one to start with before we get to the fascinating details.
There are many protocols for communication between computers and other computers or peripherals. A subset of these, such as I2C, SPI and UART, are common enough to be supported by dedicated hardware. New ones pop up all the time as designers invent ways to communicate with the minimum of hardware and lines; the Neopixel protocol is one example. An example of an uncommon protocol is TMP, the serial protocol used by the Titan Micro family of display drivers. It resembles I2C but is a cut-down lookalike (which incidentally meant Titan Micro didn't need to get an I2C address allocation). What do you do when the protocol you want to implement on your MCU doesn't have hardware support?
There are various ways to tackle the problem. One is to select a different MCU that has the support. Another is to add a peripheral chip that implements the protocol. This is the traditional path: all those UART chips for MPUs did this job before the functionality was absorbed into MCUs. The same is happening with USB, which is beginning to be integrated into some MCUs. Yet another way is to delegate the task to a slave MCU.
The Raspberry Pi Pico family (in which the RP2040 is deployed) takes an interesting approach to supporting uncommon protocols. The SoC contains state engines that can be allocated and programmed to deal with the protocol, relieving the main processor of this work. This is explained in the Pico SDK documentation, and blogs on how to use the feature are appearing, including this Hackaday article, "#RP2040 : PIO - case study". Expect to see more MCUs implement such state engines.
But what if the Pico doesn't suit your other requirements, or you are too cheap to allocate more hardware to the problem? Surely the MCU has power to spare and you can drive GPIO pins in software? Thus begins your foray into bitbanging.
Drive it fast but not too fast
Unfortunately many of these protocols put one in a bind. Ideally you would like the data transfer to go as fast as possible, but the slave chip may have limits. For example, TMP above goes up to 250 kHz. So delays may have to be inserted in the code to maintain the minimum timing; in this case a delay of 5 µs is needed at each state change. For older microprocessors, an inserted NOP or perhaps a call and return would suffice. But on fast MCUs this busy wait prevents the MCU from doing other work, and the wastage gets worse as MCUs get faster. Also, unless a timer is used, the wait is model dependent: use a faster member of the family and the waits have to be rejigged.
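To put rough numbers on that wastage, here is a minimal sketch in C that counts the CPU cycles burned by one busy wait. The function name and the clock speeds quoted afterwards are illustrative assumptions, not measurements from any particular board, and the sketch ignores loop and call overhead.

```c
#include <assert.h>
#include <stdint.h>

/* CPU cycles spent spinning during one busy wait of delay_us microseconds.
   Sketch assumption: the clock is a whole number of MHz. */
uint32_t busy_wait_cycles(uint32_t cpu_hz, uint32_t delay_us)
{
    assert(cpu_hz % 1000000u == 0);
    return (cpu_hz / 1000000u) * delay_us;
}
```

Under these assumptions a 16 MHz part burns 80 cycles per 5 µs wait, while a 240 MHz part burns 1200 cycles that could have gone to other work.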
A proper solution takes advantage of the fact that we usually don't need top speed. In fact running it at a lower speed makes the wiring less critical. We only have to run the protocol fast enough. In the case of the TM display chips, it suffices that the transfer time is negligible compared to the update period. A few milliseconds updating the display of a time-of-day clock won't be noticed.
Solution 1: interrupt handlers
Solution 2: polled state machines
Instead of interrupts, your main line may have a polling loop you can use, say one that scans multiplexed displays, which runs from hundreds of Hz to tens of kHz. You could implement a state engine in a function, using persistent variables (e.g. static C variables). The advantage is that there is no resource contention as the multitasking is co-operative. Protothreads are another way to do this in C.
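As a rough illustration of such a state engine, the following C sketch shifts out one byte with at most one pin change per call, so the period of the polling loop supplies the delay between edges. The pin layer (`pin_write`, `pin_level`) and the names `tx_begin`, `tx_poll` and `capture_byte` are hypothetical stand-ins so that the sketch runs on a host; START, STOP and ACK handling are left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pin layer: on real hardware these would wrap the GPIO calls.
   Here they only record the last level written. */
enum { PIN_CLK, PIN_DATA, PIN_COUNT };
static uint8_t pin_level[PIN_COUNT];
static void pin_write(int pin, uint8_t level) { pin_level[pin] = level; }

static uint8_t tx_value;   /* byte being shifted out, LSB first */
static bool tx_busy;       /* true while a transfer is in progress */

/* Queue one byte; returns false if a transfer is still running. */
bool tx_begin(uint8_t value)
{
    if (tx_busy)
        return false;
    tx_value = value;
    tx_busy = true;
    return true;
}

/* Call once per pass of the polling loop. Each call makes at most one
   pin change. Returns true while the transfer is still running. */
bool tx_poll(void)
{
    static enum { IDLE, CLK_LOW, SET_DATA, CLK_HIGH } state = IDLE;
    static uint8_t bit;

    switch (state) {
    case IDLE:
        if (tx_busy) {
            bit = 0;
            state = CLK_LOW;
        }
        break;
    case CLK_LOW:
        pin_write(PIN_CLK, 0);
        state = SET_DATA;
        break;
    case SET_DATA:
        pin_write(PIN_DATA, (tx_value >> bit) & 1);
        state = CLK_HIGH;
        break;
    case CLK_HIGH:
        pin_write(PIN_CLK, 1);   /* DATA was set while CLK was low */
        if (++bit < 8) {
            state = CLK_LOW;
        } else {
            tx_busy = false;
            state = IDLE;
        }
        break;
    }
    return tx_busy;
}

/* Host-side check: drive the poller and sample DATA on each rising CLK
   edge. Returns the byte seen on the simulated wire. */
uint8_t capture_byte(uint8_t value)
{
    uint8_t received = 0, bits = 0, prev_clk;
    bool running;

    if (!tx_begin(value))
        return 0;
    do {
        prev_clk = pin_level[PIN_CLK];
        running = tx_poll();
        if (!prev_clk && pin_level[PIN_CLK])
            received |= (uint8_t)(pin_level[PIN_DATA] << bits++);
    } while (running);
    assert(bits == 8);
    return received;
}
```

In a real program the `do` loop would be the existing polling loop, with `tx_poll()` called once per pass alongside the display scanning. The price is that the control flow of the protocol is turned inside out into explicit states.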
Solution 3: co-routines
Your implementation language may have co-routines or equivalent, like Lua does. Again this can be coupled with a polling loop.
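C has no native co-routines, but the switch-based trick that Protothreads relies on gets close. The sketch below is a simplified imitation of that idea, not the Protothreads library itself; the macro names `CR_BEGIN`, `CR_YIELD` and `CR_END`, the simulated pins and the helper `capture_byte` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal local-continuation macros. CR_YIELD records the current line and
   returns 1; the next call jumps back to that line through the switch.
   Variables that must survive a yield have to be static. */
#define CR_BEGIN(s)  switch (s) { case 0:
#define CR_YIELD(s)  do { (s) = __LINE__; return 1; case __LINE__:; } while (0)
#define CR_END(s)    } (s) = 0; return 0

static uint8_t clk, data;   /* simulated pins (hypothetical) */

/* Shift out one byte LSB first, yielding after every pin change.
   Returns 1 while more steps remain, 0 once the byte is done. */
int write_byte_step(uint8_t value)
{
    static int resume_at = 0;
    static uint8_t bit;

    CR_BEGIN(resume_at);
    for (bit = 0; bit < 8; bit++) {
        clk = 0;
        CR_YIELD(resume_at);
        data = (value >> bit) & 1;
        CR_YIELD(resume_at);
        clk = 1;
        CR_YIELD(resume_at);
    }
    CR_END(resume_at);
}

/* Host-side check: resume the routine until it finishes, sampling the
   data pin on each rising clock edge. */
uint8_t capture_byte(uint8_t value)
{
    uint8_t received = 0, bits = 0, prev_clk;
    int more;

    do {
        prev_clk = clk;
        more = write_byte_step(value);
        if (!prev_clk && clk)
            received |= (uint8_t)(data << bits++);
    } while (more);
    assert(bits == 8);
    return received;
}
```

Compared with an explicit state machine, the loop over the bits reads top to bottom just as it does in the blocking version, which is the main attraction of co-routines for this job.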
A case study
Let's start with a test program that scrolls through the 10 digits horizontally on a TM1637 display, which is what the initial video featured. Here is the Arduino version using delay calls.
// Module connection pins (Digital Pins)
#define CLK 2
#define DATA 3

static uint8_t startnum = 1;   // digit shown in the leftmost position
static uint8_t display[4];     // segment patterns for the four positions
// 7-segment patterns for the digits 0 to 9
static const uint8_t font[] = { 0x3f, 0x06, 0x5b, 0x4f, 0x66, 0x6d, 0x7d, 0x07, 0x7f, 0x6f };

void start(void)
{
    digitalWrite(CLK, HIGH); //send start signal to TM1637
    digitalWrite(DATA, HIGH);
    delayMicroseconds(5);
    digitalWrite(DATA, LOW);
    digitalWrite(CLK, LOW);
    delayMicroseconds(5);
}

void stop(void)
{
    digitalWrite(CLK, LOW);
    digitalWrite(DATA, LOW);
    delayMicroseconds(5);
    digitalWrite(CLK, HIGH);
    digitalWrite(DATA, HIGH);
    delayMicroseconds(5);
}

// Shift out one byte, least significant bit first.
// Returns true if the chip acknowledged it.
bool writevalue(uint8_t value)
{
    for (unsigned int mask = 0x1; mask < 0x100; mask <<= 1)
    {
        digitalWrite(CLK, LOW);
        delayMicroseconds(5);
        digitalWrite(DATA, (value & mask) ? HIGH : LOW);
        delayMicroseconds(5);
        digitalWrite(CLK, HIGH);
        delayMicroseconds(5);
    }
    // wait for ACK
    digitalWrite(CLK, LOW);
    delayMicroseconds(5);
    pinMode(DATA, INPUT);
    digitalWrite(CLK, HIGH);
    delayMicroseconds(5);
    bool ack = digitalRead(DATA) == 0;
    pinMode(DATA, OUTPUT);
    return ack;
}

void writedigits(uint8_t *values)
{
    start();
    (void)writevalue(0x40); // data command: write with auto-increment
    stop();
    start();
    (void)writevalue(0xc0); // address command: start at position 0
    for (uint8_t i = 0; i < 4; i++)
        (void)writevalue(*values++);
    stop();
}

void setup()
{
    pinMode(LED_BUILTIN, OUTPUT);
    pinMode(CLK, OUTPUT);
    pinMode(DATA, OUTPUT);
    start();
    (void)writevalue(0x8f); // for changing the brightness (0x88-DIM 0x8f-Bright)
    stop();
}

void loop()
{
    // fill the display with four consecutive digits, wrapping after 9
    uint8_t first = startnum;
    for (uint8_t i = 0; i < 4; i++) {
        display[i] = font[first];
        first++;
        if (first >= 10)
            first = 0;
    }
    writedigits(display);
    delay(500);
    startnum++;
    if (startnum >= 10)
        startnum = 0;
    digitalWrite(LED_BUILTIN, (startnum & 0x1) ? HIGH : LOW); // toggle every 500 ms (1 Hz blink) to debug
}
Now let's look at the Lua version of this program. Incidentally I had a small rabbit hole adventure getting this to work. It turns out that bitwise operations only came with Lua 5.3. So I had to rebuild the NodeMCU firmware with the 5.3 branch. I could have used the bit module, but that would have required me to rewrite mask & value as bit.band(mask, value). I wanted to be able to test a large part of the program on my host machine so didn't want to edit the syntax. That accounts for the commented out loadfile("extra.lua")() line by the way.
-- Module connection pins (Digital Pins)
local CLK = 1
local DATA = 2
local LED_BUILTIN = 4

local startnum = 1
local display = { 0x06, 0x5b, 0x4f, 0x66 }
-- 7-segment patterns for the digits 0 to 9 (Lua tables are 1-based)
local font = { 0x3f, 0x06, 0x5b, 0x4f, 0x66, 0x6d, 0x7d, 0x07, 0x7f, 0x6f }

--loadfile("extra.lua")()

function start()
    gpio.write(CLK, gpio.HIGH)
    gpio.write(DATA, gpio.HIGH)
    tmr.delay(5)
    gpio.write(DATA, gpio.LOW)
    gpio.write(CLK, gpio.LOW)
    tmr.delay(5)
end

function stop()
    gpio.write(CLK, gpio.LOW)
    gpio.write(DATA, gpio.LOW)
    tmr.delay(5)
    gpio.write(CLK, gpio.HIGH)
    gpio.write(DATA, gpio.HIGH)
    tmr.delay(5)
end

-- Shift out one byte, least significant bit first.
-- Returns true if the chip acknowledged it.
function writevalue(value)
    local mask = 1
    while mask < 0x100 do
        gpio.write(CLK, gpio.LOW)
        tmr.delay(5)
        if (value & mask) == 0 then
            gpio.write(DATA, gpio.LOW)
        else
            gpio.write(DATA, gpio.HIGH)
        end
        tmr.delay(5)
        gpio.write(CLK, gpio.HIGH)
        tmr.delay(5)
        mask = mask << 1
    end
    -- wait for ACK
    gpio.write(CLK, gpio.LOW)
    tmr.delay(5)
    gpio.mode(DATA, gpio.INPUT)
    gpio.write(CLK, gpio.HIGH)
    tmr.delay(5)
    local ack = (gpio.read(DATA) == 0)
    gpio.mode(DATA, gpio.OUTPUT)
    return ack
end

function writedigits(values)
    start()
    writevalue(0x40)
    stop()
    start()
    writevalue(0xc0)
    for i = 1, 4, 1 do
        writevalue(values[i])
    end
    stop()
end

gpio.mode(LED_BUILTIN, gpio.OUTPUT)
gpio.mode(CLK, gpio.OUTPUT)
gpio.mode(DATA, gpio.OUTPUT)
start()
-- for changing the brightness (0x88-dim 0x8f-bright)
writevalue(0x8f)
stop()

while true do
    -- fill the display with four consecutive digits, wrapping after 9
    local first = startnum
    for i = 1, 4, 1 do
        display[i] = font[first + 1]
        first = first + 1
        if first >= 10 then
            first = 0
        end
    end
    writedigits(display)
    tmr.delay(500000)
    startnum = startnum + 1
    if startnum >= 10 then
        startnum = 0
    end
    -- toggle every 500 ms (1 Hz blink) to debug
    if (startnum & 1) == 0 then
        gpio.write(LED_BUILTIN, gpio.LOW)
    else
        gpio.write(LED_BUILTIN, gpio.HIGH)
    end
end
I miss many conveniences from C, like the ternary ?: operator, but as an embedded language, Lua's not too bad.
Now here's the Lua coroutine version. Notice that all the waits have been turned into coroutine yields so it's expected that enough time elapses before the coroutine is resumed. You also see that the protocol handler is moved into a coroutine which yields often but never exits.
-- Module connection pins (Digital Pins)
local CLK = 1
local DATA = 2
local LED_BUILTIN = 4

local startnum = 1
local display = { 0x06, 0x5b, 0x4f, 0x66 }
-- 7-segment patterns for the digits 0 to 9 (Lua tables are 1-based)
local font = { 0x3f, 0x06, 0x5b, 0x4f, 0x66, 0x6d, 0x7d, 0x07, 0x7f, 0x6f }

--loadfile("extra-c.lua")()

-- Every former tmr.delay(5) is now a yield(true): "not finished, resume me later".
function start()
    gpio.write(CLK, gpio.HIGH)
    gpio.write(DATA, gpio.HIGH)
    coroutine.yield(true)
    gpio.write(DATA, gpio.LOW)
    gpio.write(CLK, gpio.LOW)
    coroutine.yield(true)
end

function stop()
    gpio.write(CLK, gpio.LOW)
    gpio.write(DATA, gpio.LOW)
    coroutine.yield(true)
    gpio.write(CLK, gpio.HIGH)
    gpio.write(DATA, gpio.HIGH)
    coroutine.yield(true)
end

function writevalue(value)
    local mask = 1
    while mask < 0x100 do
        gpio.write(CLK, gpio.LOW)
        coroutine.yield(true)
        if (value & mask) == 0 then
            gpio.write(DATA, gpio.LOW)
        else
            gpio.write(DATA, gpio.HIGH)
        end
        coroutine.yield(true)
        gpio.write(CLK, gpio.HIGH)
        coroutine.yield(true)
        mask = mask << 1
    end
    -- wait for ACK
    gpio.write(CLK, gpio.LOW)
    coroutine.yield(true)
    gpio.mode(DATA, gpio.INPUT)
    gpio.write(CLK, gpio.HIGH)
    coroutine.yield(true)
    local ack = (gpio.read(DATA) == 0)
    gpio.mode(DATA, gpio.OUTPUT)
    return ack
end

function writedigits(values)
    start()
    writevalue(0x40)
    stop()
    start()
    writevalue(0xc0)
    for i = 1, 4, 1 do
        writevalue(values[i])
    end
    stop()
end

-- The protocol handler: yields often but never exits.
co = coroutine.create(function()
    coroutine.yield(true)
    gpio.mode(LED_BUILTIN, gpio.OUTPUT)
    gpio.mode(CLK, gpio.OUTPUT)
    gpio.mode(DATA, gpio.OUTPUT)
    start()
    -- for changing the brightness (0x88-dim 0x8f-bright)
    writevalue(0x8f)
    stop()
    while true do
        local first = startnum
        for i = 1, 4, 1 do
            display[i] = font[first + 1]
            first = first + 1
            if first >= 10 then
                first = 0
            end
        end
        writedigits(display)
        startnum = startnum + 1
        if startnum >= 10 then
            startnum = 0
        end
        -- update complete: tell the main loop to stop resuming until the next second
        coroutine.yield(false)
    end
end)

ledon = false
while true do
    cont = true
    -- tick rate = 1kHz
    for i = 1, 1000, 1 do
        if cont then
            status, cont = coroutine.resume(co)
        end
        -- in real life do other work and wait for next tick
        tmr.delay(1000)
    end
    -- flash at 0.5 Hz to debug
    if ledon then
        gpio.write(LED_BUILTIN, gpio.LOW)
    else
        gpio.write(LED_BUILTIN, gpio.HIGH)
    end
    ledon = not ledon
end
When run on the ESP8266 this works almost like the previous version. At the tick rate of 1 kHz, the transfer takes about 165 yields, or roughly a sixth of a second, so you can see the update move across the digits. Increasing the tick rate will reduce this effect, but it's not unpleasant. Notice that the last yield returns false as a flag that the bitbanging is complete. The other side effect is that, because of the additional time the coroutine takes to handle the update, the flash rate is less than 0.5 Hz. In practice one would not delay in the main loop, but do other work and then wait until the second is up by watching the timer. Also, in a real program the protocol handler would not update the display but leave this to the main line. Changing this would have made it harder to compare with the previous version.
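The number of yields per update can be counted from the coroutine listing above, and from that the transfer time at any tick rate follows. Here is a small C sketch of the arithmetic; the function and constant names are made up for illustration, and the time spent executing the Lua code between yields is ignored.

```c
#include <assert.h>

/* Yield counts read off the coroutine listing (one yield per former delay). */
enum {
    YIELDS_START = 2,
    YIELDS_STOP  = 2,
    YIELDS_BYTE  = 8 * 3 + 2   /* 8 data bits at 3 yields each, plus 2 for the ACK */
};

/* Yields consumed by one writedigits() call plus the final yield(false). */
unsigned yields_per_update(unsigned digits)
{
    /* first frame: start, the 0x40 command, stop */
    unsigned command = YIELDS_START + YIELDS_BYTE + YIELDS_STOP;
    /* second frame: start, the 0xc0 address byte, the digits, stop */
    unsigned payload = YIELDS_START + YIELDS_BYTE * (1 + digits) + YIELDS_STOP;
    return command + payload + 1;
}

/* Duration of one update in milliseconds, rounded up, at a given tick rate. */
unsigned update_ms(unsigned digits, unsigned tick_hz)
{
    assert(tick_hz > 0);
    return (yields_per_update(digits) * 1000u + tick_hz - 1) / tick_hz;
}
```

For the four-digit display this gives 165 yields, so about 165 ms at a 1 kHz tick and about 17 ms at 10 kHz, which is why a faster tick makes the ripple across the digits disappear.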