After a long hiatus, I've found time to look at this again.
The first problem is still how to interface the builtin routines that do the right shifts that are too hard to do inline. The additional builtin routines expect inputs in particular register (pairs) and leave the output in a particular register (pair). However the code leading up to the call and after the call don't know this. It doesn't seem to be a matter of tellng the code generator we need those registers, it appears to be already decided at an earlier stage, as this dump of the code generated by the C statement
uchar0 = (uchar0<<1) | (uchar0>>7);
shows:
181 ;rotate4.c:28: uchar0 = (uchar0<<1) | (uchar0>>7);
182 ;ic:2: iTemp0 [k3 lr3:4 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{signed-char fixed}[a ] = (signed-char fixed)_uchar0 [k2 lr0:0 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{unsigned-char fixed}
183 ; genCast
184 ;fetchLitPair
000F 21r05r00 [10] 185 ld hl, #_uchar0
0012 7E [ 7] 186 ld a, (hl)
187 ;ic:3: iTemp1 [k4 lr4:6 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{signed-char fixed}[a ] = iTemp0 [k3 lr3:4 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{signed-char fixed}[a ] << 0x1 {signed-char literal}
188 ; genLeftShift
0013 87 [ 4] 189 add a, a
190 ;ic:4: iTemp2 [k5 lr5:6 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{unsigned-char fixed}[c ] = _uchar0 [k2 lr0:0 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{unsigned-char fixed} >> 0x7 {const-int literal}
191 ; genI80RightShift
192 ; Dump of Right: type AOP_LIT size 2
193 ; Dump of Result: type AOP_REG size 1
194 ; reg = c
195 ; Dump of Left: type AOP_HL size 1
196 ;fetchPairLong
197 ;fetchLitPair
0014 21r05r00 [10] 198 ld hl, #_uchar0
0017 6E [ 7] 199 ld l, (hl)
0018 26 00 [ 7] 200 ld h, #0x00
201 ;fetchPairLong
202 ;fetchLitPair
001A 01 07 00 [10] 203 ld bc, #0x0007
001D F5 [11] 204 push af
001E CDr00r00 [17] 205 call sru1
0021 F1 [10] 206 pop af
207 ;ic:5: _uchar0 [k2 lr0:0 so:0]{ ia1 a2p0 re0 rm0 nos0 ru0 dp0}{unsigned-char fixed} = iTemp1 [k4 lr4:6 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{signed-char fixed}[a ] | iTemp2 [k5 lr5:6 so:0]{ ia0 a2p0 re0 rm0 nos0 ru0 dp0}{unsigned-char fixed}[c ]
208 ; genOr
0022 B1 [ 4] 209 or a, c
Llines 187 and 190 show that the intermediate result of the left shift by 1 is left in A. My generated call to sru1() doesn't know any of this and leaves its result in HL, when the result is expected in C to be ORed with the previous A, which gets destroyed in the routine.
To understand how the SDCC register allocator works, I turned to the papers published by its author Philipp Klaus Krause. The first is Optimal Register Allocation in Polynomial Time. In Compiler Construction - 22nd International Conference, CC 2013, Volume 7791 of Lecture Notes in Computer Science, pages 1–20. Springer, 2013. Also of interest is Bytewise Register Allocation in SCOPES '15: Proceedings of the 18th International Workshop on Software and Compilers for Embedded Systems. They are quite heavy reading but the take away for me is that the register allocator has to know the costs and requirements of my right shift routines that fill deficiencies in the 8080 instruction set. One idea would be to look at how specialsed instructions like mult that have no alternatives and only work with particular registers are handled.
For some simple cases of shifting right by a literal, we can use the existing generation of inline assembler instructions. Then an issue is how to have different allocation behaviour between the cheaper and more expensive cases.
At the same time I've updated the sources not pertinent to the 8080 to the most recent SVN version so as not to fall too far behind.
Discussions
Become a Hackaday.io Member
Create an account to leave a comment. Already have an account? Log In.