
SPI4C

It's not SPI, I2C or I3C, it's another protocol with full duplex self-clocked links with 2×2 wires that is rich and easy to implement.

Good old SPI and I²C may be replaced by I³C one day but that newer shinier protocol is incredibly complex.

SPI4C (name subject to change once I come up with something better) is simple and easy to bit-bang or encode in logic circuits. It can adapt to any bandwidth, so it can also replace async serial comms, and it inherently avoids baud rate auto-negotiation and jitter prevention (thus reducing buffering/FIFOs).

Like SPI it uses 4 wires, but no clock signal. Unlike I2C it uses only single-direction signals, so there is no analog stuff to integrate in the protocol. You could transmit it over 4 traces (+ground) on a PCB or over a 4-pair RJ45 cable for longer distances (and deal with the termination yourself).

It's not a high-speed protocol but could be used for rich/interactive protocols such as debugging, device setup/configuration/status/update etc. For example: negotiating and setting communication/configuration parameters for a much faster link (PCIe-like ?)

Warning: there is still a flaw in the connection establishment protocol.

 
-o-O-0-O-o-
 

This protocol is symmetrical so it could be used for low to medium speed communication, for example user interfaces (keyboard/mouse/terminal), sensors, ... Or simply to link 2 CPU or MPU (most likely) even if they have no special HW/peripheral : 2×2 GPIO and you're rolling!

It uses 4 wires so it is comparable to SPI but the overall protocol can be richer (almost arbitrary) and bidirectional, since framing is controlled by each peer.

Compared to async serial ("RS232") :

  • can be bit-banged with a basic microcontroller or even a computer (through the keyboard's LEDs ?)
  • 4 wires (+GND) for everything, data as well as ACK
  • no need for synchronisation, common clock, or speed negotiation
  • no need for deep FIFOs

versus I²C :

  • no funky FSM (that always gets a detail wrong)
  • no damned analog trickery or limitation from open-collectors and pull-up resistors
  • full duplex

Versus SPI :

  • messages can be sent by any peer at any time, with arbitrary framing
 
-o-O-0-O-o-
 

Logs:
1. Basics
2. Higher level
3. Service messages
4. Basic receiver circuit and jitter tolerance
5. Basic sender and receiver
6. ACK ACK!
7. Software version
8. Finite State Machine
9. It's a ACK
10. Enhanced protocol

send-rcv.cjs.txt

CircuitJS source for an emitter and receiver. https://tinyurl.com/2n2wlsv5

plain - 5.50 kB - 08/26/2022 at 19:25


  • Enhanced protocol

    Yann Guidon / YGDES, 08/30/2022 at 18:21, 0 comments

    The last log stated a few hard truths that were not considered in the first iteration of the protocol.

    Let's now consider how a frame is defined: it is a run of data trits followed by an ACK trit. Each of these trits is acknowledged by a trit from the peer, whose value may carry meaningful data unrelated to the frame (or not, whatever: it's a trit, so it works for the self-clocking scheme). This is not enough if the ping-pong must cease when both peers have nothing left to transmit. So let's think about how to agree on stopping, then from there we'll see how to restart later.

    As noted before, a new frame can immediately follow another, so one ACK can't stop the ping-pong right away. Two can, though, but if you can count to two, you can count to 3 with as many DFFs and a few gates. So if the peer is the last to send data AND has already sent 3 ACKs, then it may stop the ping-pong by not acknowledging the peer's ACK to the ACK to the ACK to the.... The other peer (which did not send any data) simply ACKs everything. This leaves the link in an unambiguous "well known state". Of course, if the peer receives a data trit instead of an ACK during the last 3 cycles, this means that the other peer will stop by itself, and this peer inhibits its own ACK inhibition. The ping-pong ensures that the state of the whole system is known at all times, coherent, and easily deducible from the known information.
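The stopping rule can be sketched as a tiny simulation. This is only one reading of the paragraph above, under simplifying assumptions: peer A sends a single frame, peer B never has data, and the names (`run_exchange`, the `log` list) are made up for the illustration. It does not model the restart or any race condition.

```python
ACK, BIT0, BIT1 = 0b11, 0b01, 0b10

def run_exchange(bits_a):
    """Peer A sends one frame; peer B has nothing to say and ACKs everything.

    Returns the list of (sender, trit) pairs until the ping-pong stops.
    """
    log = []
    pending = [BIT1 if b else BIT0 for b in bits_a]
    acks_sent_by_a = 0
    while True:
        # A's turn: data while any remains, then at most 3 ACKs.
        if pending:
            log.append(("A", pending.pop(0)))
        elif acks_sent_by_a < 3:
            acks_sent_by_a += 1
            log.append(("A", ACK))
        else:
            break  # A withholds its ACK: the ping-pong stops here
        # B's turn: it did not send data, so it simply acknowledges.
        log.append(("B", ACK))
    return log

log = run_exchange([1, 0])
for sender, trit in log:
    print(sender, format(trit, "02b"))
```

In this reading the last trit on the wire always comes from the passive peer, so both sides know who stopped and in which state.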

    If we know the state when the ping-pong stops, and the conditions of stop, then we can safely restart it. Though there will also be cases when it is unknown and the race conditions must be considered, in particular in the cases when both peers want to resume talking at the same time. That's where things become tricky but just follow the reasoning...

    If the protocol resumes the ping-pong with another ACK, there is an uncertainty because the receiving peer (which might have sent a request right at the same time) can't know if the ACK is in reply to its own ACK. It might even mistake or miss the ACK if the timing is too tight for one of the peers, for example when the ACK is sent right after the last ACK that closes the session. In that case, it's as if the receiving peer didn't see the 3rd ACK, and the 4th looks like the 2nd. The ambiguity is strong here!

    So the solution is to resume not with an ACK but with a data trit, which removes the ambiguity.

    But this only pushes the ambiguity further because then it becomes a temporal ambiguity: which peer is the first and should take the initiative ? If both peers wake up during the same sampling window, there is nothing to distinguish them because they are absolutely symmetric. They should be considered identical and mirrored so if one peer does something, the other does it too. Except for the clock but we can't reasonably wait until the clocks get significantly out of phase, the matter should be solved in one or two cycles ideally...

  • It's a ACK

    Yann Guidon / YGDES, 08/30/2022 at 07:09, 0 comments

    I forget exactly when I imagined this protocol but that was at least 10 years ago, Hackaday.io didn't exist and USENET was already a shadow of itself... Without a practical use case, I had no implementation, thus no feedback, no challenge or interaction about my "ideas". Today I can go further because I can exchange with more people! Paul's comments made me dig further and that's how I saw a flaw in my initial scheme.

    The last logs said "don't ACK an ACK to prevent endless ACKing" because that would overload the link and increase power draw. Well, it was not a good idea because that would block/stop the link and I didn't simulate this case in my head. So I did more head simulations and here are the results.

    The first issue is that the link should minimise bit toggling (to save on power and EMI etc.) but the raw protocol relies on each peer to reply as fast as possible to let the other send its own data or ACK. If one stops, the other can't talk either (until timeout). But if each ACK is answered with an ACK then the link is overwhelmed.

    Let's imagine this case: Peer1 sends a frame, Peer2 has nothing to say and simply ACKs, so Peer1 receives only ACKs. When Peer1 is done with the frame and sends an ACK to close it, Peer2 will not reply to the ACK with an ACK. This blocks Peer1 until Peer2 times out or has something to say, which is undesired, to say the least...

    What if Peer1 had another frame to send just after ? So let's modify the rule and count how many ACKs are replied. Let's say "don't ACK more than 3 ACKs in a row" to unlock the situation, as it fits easily in a 2-bit saturating counter (2 DFFs and a few gates). This solves this special case but not the whole problem.
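The "no more than 3 ACKs in a row" rule maps directly to a saturating counter. Here is a minimal software sketch, assuming the peer has no data of its own to send; the class name `AckLimiter` and its method are hypothetical and not part of the project.

```python
ACK, BIT0, BIT1 = 0b11, 0b01, 0b10

class AckLimiter:
    """2-bit saturating counter of consecutive ACKs answered with an ACK."""
    LIMIT = 3  # fits in 2 bits, as in the text

    def __init__(self):
        self.count = 0

    def should_reply(self, received_trit):
        """Return True if the received trit should be answered with an ACK."""
        if received_trit != ACK:
            self.count = 0        # data always gets a reply and resets the run
            return True
        if self.count < self.LIMIT:
            self.count += 1       # ACK answered with an ACK: count it
            return True
        return False              # saturated: stay silent

limiter = AckLimiter()
replies = [limiter.should_reply(t) for t in (ACK, ACK, ACK, ACK, BIT0, ACK)]
print(replies)
```

The fourth ACK in a row goes unanswered, and any data trit re-arms the counter.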

    The worst case will be a 1Hz update, depending on which peer times out first and toggles an ACK to keep the link open/active. This is an unacceptable wait in practice but still works as a watchdog or failsafe "just in case".

    The following discussion primarily applies to the above requirement to minimise both latency and useless ACKs, but it remains valid otherwise, so let's state some obvious facts:

    1) Initiating

    There will always be one peer that initiates (or re-initiates) the protocol's ping-pong.

    2) Violation

    Because the ping-pong has to start in some way, the initial protocol must be violated. Some peer must take the initiative and this breaks the symmetry.

    3) Race conditions

    Since violations will necessarily occur (though not necessarily often but it must be addressed) then race conditions will occur at the edge of the protocol.

    There.

    From there we can unwind these assertions and see that if we can solve the problem of the race conditions, then it is safe to (re)initiate the protocol. Note that the race conditions can not be avoided, at most reduced, but if they can be detected and managed, then it's good. Now we have to identify the edges of the protocol.

    So far a frame is defined by a run of data bits followed by an ACK trit. There can be any number of ACKs, no problem.

    The initial "perfect ping-pong" assertion ensures that there is no violation or race condition as long as the traffic continues. But this traffic must stop when the "data to send" is exhausted. So it must be reinitiated before the timeout, which itself is by definition... asymmetric, because we can't know which peer will restart first and, in the worst case, they can start exactly at the same time. This is made more likely by the enlarged sampling window that compensates for variations in wire length/capacitance/propagation...

    So a safe link establishment must be created, which takes both race conditions and sampling issues into account.

  • Finite State Machine

    Yann Guidon / YGDES, 08/28/2022 at 14:23, 4 comments

    Paul's comments made me realise that what I thought was obvious, well, wasn't. One has to carefully read the chronograms to understand the signals and their sequencing. After a while, I found a much better representation and use it now as the project's avatar:

    Trits are transmitted by changing the line's state. If you send a data bit, you move along an edge and change one output line.

    If you send an ACK then you change both lines, which translates in this FSM to moving across the diagonal.

    There is no "idle" state, just the current state... It's some pretty simple signaling, right ?

    All the rest is derived from this.
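To make the picture concrete, here is a small sketch that enumerates the four line states and the three possible moves from each. The dictionary layout and function names are assumptions for illustration only.

```python
from itertools import product

# XOR masks applied to the 2-bit output state.
TRITS = {"bit0": 0b01, "bit1": 0b10, "ack": 0b11}

def transitions():
    """Map (state, trit_name) -> next_state for the 4 possible line states."""
    return {(state, name): state ^ mask
            for state, (name, mask) in product(range(4), TRITS.items())}

def lines_changed(a, b):
    """Number of wires that differ between two line states."""
    return bin(a ^ b).count("1")

table = transitions()
for (state, name), nxt in sorted(table.items()):
    print(f"{state:02b} --{name}--> {nxt:02b}  ({lines_changed(state, nxt)} line(s) toggled)")
```

Data trits always toggle exactly one line (an edge of the square), an ACK toggles both (the diagonal), and no move leaves the state unchanged.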

  • Software version

    Yann Guidon / YGDES, 08/27/2022 at 20:21, 0 comments

    The protocol is easy to implement in logic gates but equally so in software. This could be life-saving if you must let two CPUs or MPUs exchange bidirectional messages without a dedicated interface. This is possible with RS232/async serial for example, but that is limited to 8-bit streams in practice, so framing is not inherent. MPUs usually implement only the SPI master role, and their I²C interface can rarely act as a slave either. And asynchronous communications usually require hard real-time constraints.

    SPI4C uses more pins but none of them require special hardware, and no hard real-time constraint is required. A CPU or MPU can bit-bang the protocol as a part of an event loop, thus with some jitter but this is not critical.

    The protocol is symmetrical so a single code/algorithm works on both peers. The needed resources are :

    • A timer that can last about 1s (plus or minus a few potatoes)
    • GPI : 2 General Purpose Input bits, GPIa and GPIb, ideally contiguous (to ease coding, but not necessary)
    • GPO : 2 General Purpose Output bits, GPOa and GPOb

    The algorithm also needs some variables per link :

    • state_out : 2 bits, copy of GPO
    • state_in: 2 bits, copy of GPI
    • input_buffer : bytes received
    • output_buffer : bytes to send
    • data_to_send : number of bits to be sent from the output buffer
    • bits_received : number of bits already received
    • polarity : 1 or 0, if GPIa and GPIb are swapped
    • link_ok : status flag (1 when communication is deemed working)

    The status link_ok changes under these conditions :

    • goes to 0 during initialisation or when the timer expires
    • goes to 1 when the protocol receives a new trit

    From there we have 3 entry points :

    • Init() :
      state_in = GPI
      state_out = GPO  (read back, if applicable)
      send ACK : GPO = ( state_out ^= 3 )
      trigger_timer(1s)  (1 second nominally)
      data_to_send = 0
      bits_received = 0
      polarity = 0
      link_ok = 0
      Nothing incredible here but we see that a copy of the GPO and GPI are kept, for faster operation and to compare with incoming data. We also see how to send a trit : XOR the state_out with either 1, 2 (data) or 3 (ACK).

    • Timeout() :
      link_ok = 0  (it's a timeout, so the link is now down)
      flush_bit_buffer()  (flush any partial packet that was not framed by an ACK yet)
      bits_received = 0

      if (data_to_send)  (just in case : close the current frame,
                           and tell the peer that we're alive)
         data_to_send = 0
         send ACK : GPO = ( state_out ^= 3 )
           (beware : if the peer missed the last data bit and sees
           the ACK anyway, it would be interpreted as another bit,
           so the ACK must be sent after closing/init here)

      trigger_timer(1s)

      There is a little issue to guard against here : if either peer missed something, both may want to end at almost the same time, but they could have unfinished frames and could eventually miss the ACK.

    • new_trit() :
        read GPI
        if GPI != state_in :
          link_ok = 1
          re-read GPI  (confirm the value !)
          trit = GPI ^ state_in
          state_in = GPI  (to compare later)

          if trit == 3 {      (handle ACK)
            if (bits_received)   (end of frame ?)
              flush_bit_buffer()  (call user function/hook)
              bits_received = 0
          }
          else {
            data = (trit >> 1) ^ polarity   (01 => bit 0, 10 => bit 1)
            input_buffer[bits_received >> 3] |= data << (bits_received & 7)
            bits_received++
          }

          if (data_to_send) {
            data_to_send--
            bit_to_send = ( output_buffer[data_to_send >> 3] >> (data_to_send & 7) ) & 1
            send data : GPO = ( state_out ^= 1 << bit_to_send )
              (if bit=0 then send 01, if bit=1 then send 10)
              (note : this sends the highest-numbered bit first while the
              receiver fills from bit 0 up, so one side must mirror the order)
          }
          else
            send ACK : GPO = ( state_out ^= 3 )
          trigger_timer(1s)
      

      Here goes all the meat of the algorithm. This function is called periodically, thus polling the input pins if no "interrupt on change" is available.

      Polling could happen every 10ms for example, then every 100us if activity is detected, or even directly under no-load condition with high traffic.

    The higher levels of the interface will then manage simple buffers containing one frame. I didn't handle buffer overflows, by the way. I must even have messed up a counter or two but it's pseudocode, you see the intent and you'll have to adapt it to your own system.
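For readers who want something executable, the three entry points can be sketched in Python with two simulated peers wired back to back. This is a loose port under stated assumptions: the `Peer` class and its attribute names are invented here, buffers are plain lists of bits instead of packed bytes, the "wires" are a direct read of the other peer's `state_out`, and the 1 s timer is left to the caller (`timeout()` is never triggered in the demo).

```python
ACK = 0b11

class Peer:
    def __init__(self):
        self.state_out = 0    # copy of our two output pins (GPO)
        self.state_in = 0     # last sampled value of our input pins (GPI)
        self.to_send = []     # bits still to send in the current frame
        self.received = []    # bits received since the last ACK
        self.frames = []      # completed frames handed to the upper layer
        self.polarity = 0     # 1 if the two input wires are swapped
        self.link_ok = False
        self.remote = None    # peer whose outputs feed our inputs

    def read_gpi(self):
        return self.remote.state_out

    def send(self, trit):
        self.state_out ^= trit           # 1 or 2 = data, 3 = ACK

    def init(self):
        self.state_in = self.read_gpi()
        self.to_send, self.received = [], []
        self.link_ok = False
        self.send(ACK)

    def timeout(self):
        self.link_ok = False
        self.received = []               # drop the unframed partial packet
        if self.to_send:
            self.to_send = []
            self.send(ACK)               # close the frame, show we're alive

    def new_trit(self):
        gpi = self.read_gpi()
        if gpi == self.state_in:
            return                       # nothing changed since last poll
        self.link_ok = True
        trit = gpi ^ self.state_in
        self.state_in = gpi
        if trit == ACK:
            if self.received:            # ACK after data closes the frame
                self.frames.append(self.received)
                self.received = []
        else:
            self.received.append((trit >> 1) ^ self.polarity)
        if self.to_send:
            self.send(1 << self.to_send.pop(0))   # bit 0 -> 01, bit 1 -> 10
        else:
            self.send(ACK)

a, b = Peer(), Peer()
a.remote, b.remote = b, a
a.init()
b.init()
a.to_send = [1, 0, 1]        # queue one frame after initialisation
for _ in range(8):           # bounded polling loop
    a.new_trit()
    b.new_trit()
print(b.frames)              # frames received by b
```

As in the pseudocode, every received trit is answered, so the two peers keep exchanging ACKs once the frame is through; the loop is bounded for that reason.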

  • ACK ACK!

    Yann Guidon / YGDES, 08/26/2022 at 22:01, 0 comments

    More fun with CircuitJS:

    I made a loopback to ACK the ACK. This gives a rough estimate of the latency and throughput. Note that I use 2 different clocks, 7 kHz and 10 kHz, to test clocking issues. I can't see much with this system though, and a more elaborate simulator becomes necessary. I wish I could hook/link custom scripts to add custom probes and features :-D But at least I could test the inter-wire jitter.

    What is interesting is that ... This circuit makes no sense !

    It's stupid to ACK an ACK, as it creates this "Larsen" (feedback) situation, self-oscillating... Here it helps create the throughput test.

    What makes sense is to ACK data

    1. when there is no data to send (but I have no PISO-SIPO yet) or
    2.  when the activity timer expires.

    This is not wired yet.

  • Basic sender and receiver

    Yann Guidon / YGDES, 08/26/2022 at 19:15, 0 comments

    There !

    This is a one-way link, I now have to couple two of them.

    short URL for those playing at home. Source code uploaded at send-rcv.cjs.txt

    I hope that the chronogram makes the basic protocol clearer: each time a trit is sent, one or two data lines change state.

  • Basic receiver circuit and jitter tolerance

    Yann Guidon / YGDES, 08/26/2022 at 13:21, 0 comments

    Thank you CircuitJS !

    Play at home with this link.

    I use the delay lines to simulate imperfection and transmission delays, and also implement sending 0s and 1s. The simultaneity is a very important aspect because a mismatch deteriorates the data. In SW there is little trouble, usually. In HW it's a bit more complex and I'll have to find a better way than the one above.

    It's basically a clock domain crossing issue, right ? At least I implemented a first "buffer" DFF to prevent metastability but this is not totally foolproof. Jitter during ACK can last a whole clock cycle but can't cross cycles, so it's only half perfect.

    If you change the delay of one delay line, the circuit will receive a 0 then a 1, or vice versa. I added a "polarity" selector in case the data lines are swapped. This can be implemented by a MUX or XOR with a DFF, itself initialised "as it should" and updated by a service message.

    Remember : the above circuit is only a proof of concept and does not address inter-signal jitter properly. But it shows the basic ideas, and I could start implementing a basic sender.

    .................

    This one should work better !

    The tolerance to inter-wire jitter is increased because the data is XORed with 2 clock cycles of delay. The first delay detects "a transition" with an anti-double-trigger loopback (in green).

    Normally we could increase the tolerance by XORing the data at the outputs of the Schmitt triggers, because the signal should not change once one transition is detected, but it's a detail to manage later.

    Here is the source code.

  • Service messages

    Yann Guidon / YGDES, 08/25/2022 at 19:13, 0 comments

    The low-level interface is very very lax and permissive and the frame-length filter helps a lot. Anybody could do anything but even incompatible circuits/peers should be able to discover they are not compatible... So let's agree on a "reserved message length" of 6 bits that requires very little complexity to support, and provide basic discovery and link status stuff.

    First, the 2 leading bits must be 0 then 1. Any 6-bit message starting with a different header will be rejected. WHY ? So the peer can detect that the link works properly (all-1s and all-0s would indicate stuck or open signals) and if the message is validated, then it ensures that the bits are not swapped. A DFF+XOR can then be configured to "virtually" swap the data polarity, just as if we swapped the bits.

    Then we have 4 bits or 16 combinations.

    • 0000 and 1111 are invalid. See previous paragraph to see why. This leaves 14 codes. Not all will be used yet.
    • INIT instructs the peer to reset its various com layers and start afresh/anew.
    • TINI says the peer has been initialised.
    • PING is ... well, a test just like on IP stacks.
    • PONG is the answer. Too bad we don't have enough room to include some arbitrary garbage but that does not matter at this point.
    • WHAT says that the last message is not recognised, for example its length is invalid or the request was invalid.
    • DESC requests the peer to send a frame of variable length with UTF8 encoded text that describes that peer.
    • IDTT requests a free-length frame containing the type+serial number or something like that.
    • CAPS requests a frame where each bit that is set to 1 corresponds to a valid filtered frame (it's free-length too)

    That's 8+2 combinations so there is room for 6 more. Nice. They are general and generic enough and if a peer implements very specific control messages (such as sleep/wake or such), a different length (longer) can be implemented without cluttering this simple protocol.
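The header and payload checks can be sketched as follows. The source does not assign numeric codes to INIT, PING, etc., so none are invented here. Two things are assumptions of this sketch: the 4 payload bits are read with the first received bit as the most significant, and an inverted header (1 then 0) is treated as a hint that the data wires are swapped. The function names are hypothetical.

```python
def parse_service_payload(bits):
    """Return the 4-bit payload of a valid 6-bit service frame, else None."""
    if len(bits) != 6:
        return None
    if (bits[0], bits[1]) != (0, 1):     # mandatory header: 0 then 1
        return None
    payload = (bits[2] << 3) | (bits[3] << 2) | (bits[4] << 1) | bits[5]
    if payload in (0b0000, 0b1111):      # reserved as invalid
        return None
    return payload

def looks_swapped(bits):
    """True if a 6-bit frame carries the inverted header (1 then 0).

    Swapped data wires turn every 0 into a 1 and vice versa, so the
    header 01 would arrive as 10.
    """
    return len(bits) == 6 and (bits[0], bits[1]) == (1, 0)

print(parse_service_payload([0, 1, 0, 0, 1, 0]))   # valid frame
print(parse_service_payload([0, 1, 1, 1, 1, 1]))   # payload 1111 is rejected
print(looks_swapped([1, 0, 1, 1, 0, 1]))           # inverted header
```

A receiver that sees `looks_swapped` could flip its polarity bit (the DFF+XOR mentioned above) and wait for the next service frame.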

  • Higher level

    Yann Guidon / YGDES, 08/25/2022 at 16:01, 0 comments

    The low level interface simply transmits and receives trits. It is very simple, a few DFF and XOR gates.

    The higher level can be much more complex, or still simple, depending on the performance, requirements and platform.

    The first thing to care about is the timeout and activity of the link. A slow timer (1Hz) sends an ACK trit (when no activity is detected) and if another trit is detected/received before sending another one (within 1s) then the link is deemed active. The higher levels of the protocol can then initiate transmission.
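A minimal sketch of this activity tracking, with timestamps passed in explicitly so it can be driven by any clock (for instance `time.monotonic()`). The class `LinkMonitor` and its method names are assumptions, not something defined by the project.

```python
class LinkMonitor:
    """Track link activity against a nominal 1 s timeout."""

    def __init__(self, timeout_s=1.0):
        self.timeout_s = timeout_s
        self.last_rx = None   # time of the last received trit
        self.last_tx = None   # time of the last sent trit

    def on_receive(self, now):
        self.last_rx = now

    def on_send(self, now):
        self.last_tx = now

    def link_active(self, now):
        """The link is deemed active if a trit arrived within the timeout."""
        return self.last_rx is not None and (now - self.last_rx) < self.timeout_s

    def keepalive_due(self, now):
        """True when nothing was sent or received for a whole timeout period."""
        stamps = [t for t in (self.last_rx, self.last_tx) if t is not None]
        return not stamps or (now - max(stamps)) >= self.timeout_s

monitor = LinkMonitor()
monitor.on_send(0.0)       # our periodic ACK
monitor.on_receive(0.6)    # the peer answered
print(monitor.link_active(1.0), monitor.keepalive_due(1.0))
print(monitor.link_active(2.0), monitor.keepalive_due(2.0))
```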

    The protocol transmits unbounded strings of 0s and 1s, and the ACK trit is used for framing : the strings are thus cut and distinguished from others, thus making frames or packets. There is no upper bound for the frame size but let's say 4096 bits (2^12) or 512 bytes is reasonable for now.

    A packet's minimal size is 1 bit. Give it the meaning you want.
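Framing by ACK can be sketched in a few lines. The function name `split_frames` is hypothetical; the trit values follow the encoding used throughout the project (01 = bit 0, 10 = bit 1, 11 = ACK).

```python
def split_frames(trits):
    """Group received trits into frames (lists of bits) delimited by ACKs."""
    frames, current = [], []
    for trit in trits:
        if trit == 0b11:                 # ACK: closes the frame, if any
            if current:
                frames.append(current)
                current = []
        elif trit in (0b01, 0b10):
            current.append(trit >> 1)    # 01 -> bit 0, 10 -> bit 1
        # 0b00 means "no change" and carries nothing
    return frames

stream = [0b01, 0b10, 0b11, 0b11, 0b10, 0b11]
frames = split_frames(stream)
print(frames)
```

An ACK that arrives with no pending bits is a plain acknowledgement and produces no frame, which is why two ACKs in a row do not create an empty packet.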

    So from a higher perspective, the protocol sends and receives strings of bits of variable lengths. The length in itself is useful to simplify protocols, in-band or out-of-band signaling, etc.

    For example, in a hypothetical protocol, data packets would be identified by a length of 256 bits, control packets would have any other length, and the length can filter which type of command is sent, with the data itself being the arguments. Very short packets are possible for IRQ, command acknowledge...
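Length-based filtering can be sketched as a dispatch table keyed by frame length. The 256-bit data length comes from the hypothetical protocol above and the 6-bit length from the service messages log; the handler functions and `make_dispatcher` are invented for the example.

```python
def make_dispatcher(handlers, default):
    """Return a function that routes a frame to a handler chosen by its length."""
    def dispatch(frame):
        return handlers.get(len(frame), default)(frame)
    return dispatch

def handle_data(frame):
    return ("data", len(frame))

def handle_service(frame):
    return ("service", len(frame))

def handle_control(frame):
    return ("control", len(frame))

dispatch = make_dispatcher({256: handle_data, 6: handle_service},
                           default=handle_control)

print(dispatch([0] * 256))
print(dispatch([0, 1, 0, 0, 1, 0]))
print(dispatch([1]))
```

The receiver never has to parse a header to find the packet type: counting bits between two ACKs is enough.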

  • Basics

    Yann Guidon / YGDES, 08/25/2022 at 15:37, 0 comments

    The basics of the protocol are simple.

    It is a symmetric, point-to-point full-duplex protocol with 2 binary signals per direction.

    Each peer needs 2 output pins and 2 input pins, no direction reversal, so each pair of wires can be dedicated and designed easily, with much fewer analog tricks.

    Signaling uses "transition" encoding, and since there are 2 wires, 4 symbols, only 3 values can be transmitted, let's call them "trits".

    Each trit is the XOR of the new line values with the previous ones; both lines of a pair are sampled at exactly the same time.

    •  For example, to send a trit to a pair of GPIOs, one takes the previous output state and XORs the trit before sending the new value to the GPIO.
    • To receive a trit, sample both GPIO pins at the same time, and XOR the new value with the previous value => you get a trit.

    Encoding is simple :

    00 => idle / no trit
    01 => encode bit 0
    10 => encode bit 1
    11 => ACK / end of frame
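The transition encoding can be tried out with a short sketch. The constant and function names are assumptions; the XOR scheme itself is the one described above.

```python
IDLE, BIT0, BIT1, ACK = 0b00, 0b01, 0b10, 0b11

def send_trit(prev_out, trit):
    """Return the new 2-bit output state after sending `trit`."""
    if trit not in (BIT0, BIT1, ACK):
        raise ValueError("a trit must toggle at least one line")
    return prev_out ^ trit

def receive_trit(prev_in, new_in):
    """Return the trit seen between two samples (IDLE if nothing changed)."""
    return (prev_in ^ new_in) & 0b11

# Send bit 1, bit 0, then ACK, starting from an arbitrary line state.
state = 0b10
seen = []
for trit in (BIT1, BIT0, ACK):
    new_state = send_trit(state, trit)
    seen.append(receive_trit(state, new_state))
    state = new_state
print([format(t, "02b") for t in seen], format(state, "02b"))
```

The receiver recovers each trit without knowing the absolute line levels, only the previous sample, which is why there is no dedicated idle level.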

    One should be careful to not swap the wires but this has the benefit that a "receive-only" peer can save one wire, by sending one bit only and the value is cloned/duplicated to emulate a full-duplex device. So the protocol can work with 3 wires in certain cases.

    Self-clocking works with the simple ping-pong idea : one peer sends a trit, and waits for the peer's trit in return. No crazy encoding is performed with base-3 values : the 3rd value is an "end of frame" that eases decoding and framing. This also doubles as an ACK signal when the peer has no data to send. Since ACK is encoded as 11, the 3-wire configuration is possible.

    Also, binary transmission consumes half the energy because there is only 1 transition instead of 2 for ACK. But ACK can be delayed during idle periods.

    Each peer announces its availability on the link by sending ACKs periodically, like 1 per second. The link is "established" if an ACK is received within 1 second or so, to let the peer "wake up" for example. So the slowest speed is 1 baud. Highest speeds might be in the MHz range, depending on the link's length, drive, signal edges/slopes, circuit/SW latency...


Discussions

Yann Guidon / YGDES wrote 09/18/2022 at 15:31 point

I looked again at the RPi's GPIO and... sigh.............

It only allows setting or clearing bits, not toggling, so the pins can't be changed simultaneously. This can create some problems...

I think there is also a parallel interface somehow/somewhere and that would be better.


Yann Guidon / YGDES wrote 08/27/2022 at 23:04 point

I'm looking for a better name for this project, please suggest !


Paul wrote 08/26/2022 at 02:59 point

A couple questions:

1. Is every bit acknowledged by the receiver? It seems like this is true, as part of the self-clocking scheme.

2. Is the following scenario correct?:
Device A is sending a bit, and device B is receiving a bit.
Initial line states are both idle (00)
Device A sets 10 or 01 on its output
Device B, when it sees the rising edge on either of its input lines, samples both lines, then uses the state of the lines to determine the bit transmitted, XORs the internal bit with the value indicated, and sets ACK (11) on its own output, which is the input to Device A
Device A sees the rising edge of either of the input lines and samples the input lines. It recognizes the ack and allows its output to return to idle (00).
When does Device B return to idle (00) output, from ACK output acknowledging A's transmission? should it detect the negative edge of A's return to idle, and return to idle then?


Yann Guidon / YGDES wrote 08/26/2022 at 11:27 point

Hello,

1) yes. It's dumb but effective, it's "reciprocal destination-sourced clock" instead of "source-clocked" as in SPI/RS232/I2C. Each "trit" has their own ACK which could be data as well.

2) ouch, I'll need some coffee to digest that... Or better, write another post with chronograms.

There is no "idle" binary state, idle is when there is no activity and the state could be anything.

I'll try something with Falstad then :-)


Yann Guidon / YGDES wrote 08/26/2022 at 19:54 point

Hi Paul !

I hope my latest logs explain a few things.

https://hackaday.io/project/187000/log/210204 has real chronograms, not just "intended function", it's the actual circuit.

Now I have to copy-paste it and make the automatic ACK generation, then a PISO-SIPO system...


Paul wrote 08/26/2022 at 20:27 point

I think I see the function much more clearly now, thank you. It's a very nice protocol!


Yann Guidon / YGDES wrote 08/26/2022 at 21:28 point

@Paul Thank you !

I don't use it, it's not well developed, I have no use case so it lingered in my brain's crypts for a long time. It's not patented so maybe it could become an open niche protocol, if somebody finds the killer application :-D


Yann Guidon / YGDES wrote 08/26/2022 at 20:22 point

Oh I see where your interpretation diverges :

"It recognizes the ack and allows its output to return to idle (00)."

ACK or even a data bit just say that the peer can send another trit. There is no return to "idle/00". This is possible with the XOR with the last state. It's a bit more efficient than what you have written :-)


Yann Guidon / YGDES wrote 08/28/2022 at 13:36 point

I believe the new project avatar makes it even clearer now :-)

Data and ACK are transitions in a Finite State Machine...


Yann Guidon / YGDES wrote 08/25/2022 at 16:10 point

That idea/protocol/project is so oooold.... at least 10 or 15 years now. I can't remember well. I did not have a good use case for a long time, and I implemented only some ideas for an emergency/last resort link in a GPIO-constrained situation where 2 RPis needed to be synchronised to a few milliseconds...

Today for raw debugging I use another protocol/link (see the #YGREC8 project) that reuses an integrated SPI serdes.


Yann Guidon / YGDES wrote 08/25/2022 at 15:09 point

https://hackaday.com/2022/08/25/i3c-no-typo-wants-to-be-your-serial-bus/#comment-6506321 damnit...

