A project log for YGREC32

because F-CPU, YASEP and YGREC8 are not enough and I want to get started with the FC1 but it's too ambitious yet.

Yann Guidon / YGDES, 08/17/2024 at 01:44

The Y32 architecture is mostly divided into four essential blocks, each with its dedicated memory buffer:

  1. Glob1 (with dCache1)
  2. Glob2 (with dCache2)
  3. eBTB (with iCache)
  4. control stack (with stack cache)

A 5th block (with no memory access or array) handles operations that are multi-cycle and/or need more than 2R1W (Mul/Div/Barrel shifter/...).

Yet the YGREC32 is a 3-issue superscalar shallow-pipeline processor. This directly dictates the format of the instructions:

The globules are very similar to the YGREC8: a simple ALU (with full features: boolean, even umin/umax/smin/smax), 16 registers with 8 for memory access, and a 32-bit byte shuffle unit (aka Insert/Extract unit, for alignment and byteswap: see the #YASEP Yet Another Small Embedded Processor's architecture).
Each globule must be small and fast: add/sub, boolean and I/E operations must take only one cycle.
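To make the single-cycle operations above more concrete, here is a minimal behavioural sketch in Python. It only models the results on 32-bit values, not the hardware. The function names, the `MASK32` constant and the way the byteswap is expressed are my own assumptions for illustration, not the actual YGREC32 opcodes or encoding.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as 32-bit patterns


def to_signed(x):
    """Reinterpret a 32-bit pattern as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def umin(a, b):
    """Unsigned minimum of two 32-bit patterns."""
    return min(a & MASK32, b & MASK32)


def umax(a, b):
    """Unsigned maximum of two 32-bit patterns."""
    return max(a & MASK32, b & MASK32)


def smin(a, b):
    """Signed minimum, returned as a 32-bit pattern."""
    return (a if to_signed(a) <= to_signed(b) else b) & MASK32


def smax(a, b):
    """Signed maximum, returned as a 32-bit pattern."""
    return (a if to_signed(a) >= to_signed(b) else b) & MASK32


def byteswap32(x):
    """One use of a byte shuffle unit: reverse the 4 bytes of a word."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")
```

The same bit pattern gives different answers depending on signedness: `umin(0xFFFFFFFF, 1)` is `1`, while `smin(0xFFFFFFFF, 1)` is `0xFFFFFFFF` (that is, -1).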

Complex operations (multiply, divide, barrel shift/rotate...) are performed by external units in the 5th, optional/auxiliary block. These units get read and write port access and read both register sets simultaneously, so they can perform 2R2W operations without bloating the main datapath. The pair of opcodes gets "fused" and can become atomic, but the two opcodes could also be executed separately.
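As an illustration of what a 2R2W fused operation could look like, here is a small Python sketch of an unsigned 32x32 multiply that reads one operand from each register set and writes one half of the 64-bit product to each glob. The function name `fused_umul`, the register indices and the choice of "low word to the first glob, high word to the second" are assumptions made for this example only; the log does not specify them.

```python
MASK32 = 0xFFFFFFFF


def fused_umul(glob1, glob2, src1, src2, dst1, dst2):
    """Hypothetical fused unsigned multiply (2 reads, 2 writes).

    glob1, glob2: lists of 16 register values, one per globule.
    Reads glob1[src1] and glob2[src2] at the same time, then writes
    the low 32 bits of the product to glob1[dst1] and the high
    32 bits to glob2[dst2] (this split is an assumption).
    """
    product = (glob1[src1] & MASK32) * (glob2[src2] & MASK32)
    glob1[dst1] = product & MASK32
    glob2[dst2] = (product >> 32) & MASK32


# Usage: multiply the two largest 32-bit values.
regs1 = [0] * 16
regs2 = [0] * 16
regs1[1] = 0xFFFFFFFF
regs2[2] = 0xFFFFFFFF
fused_umul(regs1, regs2, 1, 2, 3, 4)
# regs1[3] now holds the low word, regs2[4] the high word.
```

Each glob still only sees one read and one write of its own register set, which is the point of splitting the operation over both.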

Fused operations require appropriate signaling in the opcode encoding.

For context save/restore, it would be good to have another, complementary behaviour, where one opcode is sent to both globs (broadcast).

It is good that control operations can be developed independently from the processing operations, though sharing the fields (operands and destinations) is still desired.

Both globs are identical, symmetrical, so there is only one to design, then mirror&stitch together.

Each glob communicates with the other through the extra read port (the register sets are actually 3R1W) and is fed instructions from the eBTB. The only connection with the control stack is for SPILL and UNSPILL, using the existing read & write ports. Each glob provides a set of status flags, which are copies of the MSB and LSBs of the registers (plus a zero flag). Hence some of the possible/available conditions are

The multiple LSB conditions help with dynamic languages that encode type and other ancillary values in the pointers (including the #Aligned Strings format). MSB can be a proxy for carry.
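Here is a rough Python sketch of how such flags could be derived from a register value, and how the low bits can carry a type tag on an aligned pointer. The number of LSBs exposed (`TAG_BITS = 3`), the dictionary layout and the tag values are all assumptions for the example; the log only says that the MSB, several LSBs and a zero flag are available.

```python
MASK32 = 0xFFFFFFFF
TAG_BITS = 3  # assumption: how many LSBs are visible as flags


def status_flags(reg):
    """Return the flags a glob could derive from one register value."""
    reg &= MASK32
    return {
        "msb": reg >> 31,                      # sign bit of the register
        "lsbs": reg & ((1 << TAG_BITS) - 1),   # low bits, usable as a tag
        "zero": int(reg == 0),                 # 1 when all 32 bits are 0
    }


def tag_pointer(addr, tag):
    """Store a small tag in the unused low bits of an aligned address."""
    assert addr % (1 << TAG_BITS) == 0, "address must be aligned"
    assert 0 <= tag < (1 << TAG_BITS), "tag must fit in the low bits"
    return (addr | tag) & MASK32


# Usage: an 8-byte-aligned address tagged with a hypothetical type code 5.
ptr = tag_pointer(0x80001000, 5)
flags = status_flags(ptr)
```

A branch on `flags["lsbs"]` can then dispatch on the type without first masking or shifting the pointer in the ALU.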

Oh and yes, add with carry is a "fused"/broadcast operation that writes the carry in the secondary destination.
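A small Python sketch of that idea: the add produces two results, the 32-bit sum for the primary destination and the carry-out for the secondary one. The function names and the explicit `carry_in` argument are assumptions for illustration; the log does not say where a carry input would come from.

```python
MASK32 = 0xFFFFFFFF


def add_with_carry(a, b, carry_in=0):
    """Return (sum, carry_out) for a 32-bit add.

    sum goes to the primary destination, carry_out (0 or 1) is what
    would be written to the secondary destination.
    """
    total = (a & MASK32) + (b & MASK32) + (carry_in & 1)
    return total & MASK32, total >> 32


def add64(a_lo, a_hi, b_lo, b_hi):
    """Chain two 32-bit adds to get a 64-bit sum as (low, high) words."""
    lo, carry = add_with_carry(a_lo, b_lo)
    hi, _ = add_with_carry(a_hi, b_hi, carry)
    return lo, hi
```

For example, `add_with_carry(0xFFFFFFFF, 1)` wraps the sum to `0` and yields a carry of `1`, which `add64` then feeds into the high-word add.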
