-
Improving CPU issues
08/29/2020 at 13:31 • 0 comments
I was having intermittent CPU issues. They showed up as inconsistent results when uploading S-Records. Sometimes an identical build would work one time but not the next. Something was marginal in the design.
I hooked up a logic analyzer and added test points to the I/O connector to monitor some of the internal control lines of the CPU.
IO_PIN(48) <= w_cpuClock;
IO_PIN(47) <= n_WR;
IO_PIN(46) <= w_nLDS;
IO_PIN(45) <= w_nUDS;
IO_PIN(44) <= n_externalRam1CS;
IO_PIN(43) <= w_wait_cnt(3);
IO_PIN(42) <= w_n_RomCS;
IO_PIN(41) <= w_n_RamCS;
IO_PIN(40) <= w_busstate(0);
IO_PIN(39) <= w_busstate(1);
IO_PIN(38) <= cpuAddress(15);
IO_PIN(37) <= '0' when ((cpuAddress(23 downto 3) = x"00000"&'0')) else -- X000000-X000007 (VECTORS)
              '1';
The 68K CPU has two bus state lines which indicate the operation being performed. They are documented as follows:
busstate : out std_logic_vector(1 downto 0);
    -- 00 -> fetch code
    -- 10 -> read data
    -- 11 -> write data
    -- 01 -> no memaccess
Disabling accesses when busstate = 01 cleaned up the issues I was having. Also, making peripheral accesses active only when busstate(1) = 1 protects the ports a bit, since they then respond only to data reads and writes. RAM and ROM can hold either code or data, so they need to be enabled when busstate(1) = 1 or busstate(0) = 0, which is every state except 01.
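The decode rule can be written out as a small truth-table check. This is only a C model of the logic described above, not the VHDL from the design; the enum and function names are made up for illustration.

```c
#include <stdbool.h>

/* Bus state encoding, copied from the busstate comment in the CPU core. */
enum bus_state {
    BUS_FETCH_CODE = 0x0, /* 00 -> fetch code   */
    BUS_NO_ACCESS  = 0x1, /* 01 -> no memaccess */
    BUS_READ_DATA  = 0x2, /* 10 -> read data    */
    BUS_WRITE_DATA = 0x3  /* 11 -> write data   */
};

/* RAM and ROM: enabled for code or data,
   i.e. busstate(1) = 1 or busstate(0) = 0. */
bool memory_access_allowed(unsigned busstate)
{
    bool bit1 = ((busstate >> 1) & 1u) != 0;
    bool bit0 = (busstate & 1u) != 0;
    return bit1 || !bit0;
}

/* Peripherals: enabled only for data reads and writes,
   i.e. busstate(1) = 1. */
bool peripheral_access_allowed(unsigned busstate)
{
    return ((busstate >> 1) & 1u) != 0;
}
```

Calling memory_access_allowed(BUS_NO_ACCESS) returns false, while the other three states return true. peripheral_access_allowed() is true only for the two data states.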
Here's the timing of the CPU coming out of reset:
This fixed the CPU so that it runs reliably at 25 MHz including downloading S-Records and running code. It broke the External SRAM but that's OK since I wanted to work on the controller anyway.
-
Another GCC 68K Cross Compiler
08/27/2020 at 10:16 • 0 comments
I've loved the online Compiler Explorer (Godbolt) project for a while now. It lets you type code in one window and see it compiled to assembly language in another window.
There's a Compiler Explorer site which does 68k cross compiling. This also lets you play with compiler options like optimization. One thing that you learn doing embedded system software is the C language keyword volatile. Any hardware register which is updated externally to the 68K needs to have volatile added. For instance, the ports for the VDU and ACIA can be accessed as pointers with the following defines:
#define ACIASTAT (volatile unsigned char *) 0x010041
#define ACIADATA (volatile unsigned char *) 0x010043
#define VDUSTAT  (volatile unsigned char *) 0x010040
#define VDUDATA  (volatile unsigned char *) 0x010042
#define TXRDYBIT 0x2
To print a character to the ACIA:
void printCharToACIA(unsigned char charToPrint)
{
    while ((*ACIASTAT & TXRDYBIT) == 0x0);
    *ACIADATA = charToPrint;
}
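The same polling pattern extends naturally to strings. The sketch below is meant to run on a PC, so it swaps the fixed hardware addresses for two ordinary volatile variables (fakeAciaStat and fakeAciaData are stand-ins, not part of the real design), and printStringToACIA is a hypothetical helper that does not appear in the project code.

```c
/* Host-side stand-ins for the ACIA registers so the sketch can run on a PC.
   On the real board these would be the fixed addresses 0x010041 / 0x010043. */
static volatile unsigned char fakeAciaStat = 0x2; /* TX-ready bit already set */
static volatile unsigned char fakeAciaData;

#define ACIASTAT (&fakeAciaStat)
#define ACIADATA (&fakeAciaData)
#define TXRDYBIT 0x2

void printCharToACIA(unsigned char charToPrint)
{
    /* volatile forces a fresh read of the status register on every pass */
    while ((*ACIASTAT & TXRDYBIT) == 0x0)
        ;
    *ACIADATA = charToPrint;
}

/* Hypothetical helper: send a NUL-terminated string, return the byte count. */
unsigned printStringToACIA(const char *s)
{
    unsigned count = 0;
    while (*s != '\0') {
        printCharToACIA((unsigned char)*s++);
        count++;
    }
    return count;
}
```

Without volatile, an optimizing compiler would be free to read the status register once and spin forever on the cached value, which is exactly the kind of thing Compiler Explorer makes easy to spot.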
The 68K Compiler Explorer looks like:
A very nice and fast way to see what the compiler does. Setting the -O3 flag shows what the optimizer does to the code:
Nice job of optimization. If you click the green check box you can see the compiler options:
-g -o /tmp/compiler-explorer-compiler120727-1067-xxj1yf.t0dk/output.s -S -fdiagnostics-color=always -O3 /tmp/compiler-explorer-compiler120727-1067-xxj1yf.t0dk/example.cpp
-
GCC Cross Assembly Toolchain Workflow Video
08/22/2020 at 18:47 • 0 comments
A short video which shows the GCC Toolchain workflow.
The toolchain isn't making an S9 record so it's not terminating the load. Patching one at the end of the code works.
S9030000FC
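For reference, the bytes in that record follow the usual S-record layout: a byte count, a 16-bit start address, and a checksum that is the ones' complement of the low byte of the sum of the count and address bytes. A minimal C sketch that builds such a record (make_s9_record is an illustrative name, not part of the toolchain):

```c
#include <stdio.h>

/* Build an S9 termination record for a 16-bit start address.
   out must hold 11 bytes: "S9" + 8 hex digits + NUL. */
void make_s9_record(unsigned start_addr, char out[11])
{
    unsigned count = 3;                      /* 2 address bytes + 1 checksum byte */
    unsigned hi = (start_addr >> 8) & 0xFFu; /* address high byte */
    unsigned lo = start_addr & 0xFFu;        /* address low byte  */

    /* Checksum: ones' complement of the low byte of (count + address bytes). */
    unsigned checksum = ~(count + hi + lo) & 0xFFu;

    snprintf(out, 11, "S9%02X%02X%02X%02X", count, hi, lo, checksum);
}
```

With a start address of 0x0000 this produces S9030000FC, the same record that was patched onto the end of the file.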
-
Testing the External SRAM
08/22/2020 at 12:55 • 0 comments
Now that we have a working GCC toolchain, let's write a program to test the External SRAM. This needs the patch that fixes the timeout for S-record loading, so I am using this patched version of the srecord file.
I'm using a GitHub repo to transfer data back and forth to my PC.
Also, upgraded to Quartus version 20.1. It is very slow.
Test of External SRAM passes:
* Test External SRAM
* External SRAM on the RETRO-EP4CE15 card goes from 0x300000 to 0x3FFFFF (1 MB)
* External SRAM only supports 8-bit accesses
* TUTOR14 uses SRAM from 0x000000 to 0x000800

RAMSTART = 0x300000
RAMEND   = 0x3FFFFF
ACIASTAT = 0x010041
ACIADATA = 0x010043

* Code follows
        .ORG    0x001000
* CHECK FIRST LOCATION BY WRITING/READING 0x55/0xAA
STARTTEST:
        MOVE.L  #RAMSTART,%A0
        MOVE.B  #0x55,%D0
        MOVE.B  %D0,(%A0)
        NOP
        MOVE.B  (%A0),%D1
        CMP.B   %D0,%D1
        BNE     FAIL
        MOVE.B  #0xAA,%D0
        MOVE.B  %D0,(%A0)
        NOP
        MOVE.B  (%A0),%D1
        CMP.B   %D0,%D1
        BNE     FAIL
* WRITE INCREMENTING PATTERN
        MOVE.B  #0X00,%D0
        MOVE.L  #RAMSTART,%A0
        MOVE.L  #RAMEND+1,%A1
CHKBLKS:
        MOVE.B  %D0,(%A0)+
        CMP.L   %A0,%A1
        BEQ     DONEFILL
        ADDI.B  #0x01,%D0
        BRA     CHKBLKS
DONEFILL:
* READ BACK INCREMENTING PATTERN
        MOVE.B  #0X00,%D0
        MOVE.L  #RAMSTART,%A0
        MOVE.L  #RAMEND+1,%A1
LOOPCHK:
        MOVE.B  (%A0)+,%D1
        CMP.B   %D0,%D1
        BNE     FAIL
        CMP.L   %A0,%A1
        BEQ     DONECHK
        ADDI.B  #0x01,%D0
        BRA     LOOPCHK
DONECHK:
* PRINT 'Pass'
        MOVE.B  #0x0A,%D0
        JSR     OUTCHAR
        MOVE.B  #0x0D,%D0
        JSR     OUTCHAR
        MOVE.B  #'P',%D0
        JSR     OUTCHAR
        MOVE.B  #'a',%D0
        JSR     OUTCHAR
        MOVE.B  #'s',%D0
        JSR     OUTCHAR
        MOVE.B  #'s',%D0
        JSR     OUTCHAR
        RTS
FAIL:
* PRINT 'Fail'
        MOVE.B  #0x0A,%D0
        JSR     OUTCHAR
        MOVE.B  #0x0D,%D0
        JSR     OUTCHAR
        MOVE.B  #'F',%D0
        JSR     OUTCHAR
        MOVE.B  #'a',%D0
        JSR     OUTCHAR
        MOVE.B  #'i',%D0
        JSR     OUTCHAR
        MOVE.B  #'l',%D0
        JSR     OUTCHAR
        RTS
* OUTPUT A CHARACTER IN D0 TO THE ACIA
OUTCHAR:
        BSR     WAITRDY
        LEA     ACIADATA,%A1
        MOVE.B  %D0,(%A1)
        RTS
* WAIT FOR THE SERIAL PORT TO BE READY
WAITRDY:
        LEA     ACIASTAT,%A1
LOOPRDY:
        MOVE.B  (%A1),%D1
        ANDI.B  #0x2,%D1
        BEQ     LOOPRDY
        RTS
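For comparison, here is roughly the same test expressed in C. It is a sketch of the algorithm in the assembly listing (first-location check with 0x55/0xAA, then a wrapping incrementing byte pattern written and read back), not the improved version checked into GitHub. The function name and the pass/fail return convention are my own.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wide RAM test. Returns 1 on pass, 0 on fail. */
int test_ram_bytes(volatile uint8_t *base, size_t len)
{
    uint8_t pattern;
    size_t i;

    if (len == 0)
        return 0;

    /* Check the first location by writing/reading 0x55 and 0xAA. */
    base[0] = 0x55;
    if (base[0] != 0x55)
        return 0;
    base[0] = 0xAA;
    if (base[0] != 0xAA)
        return 0;

    /* Write an incrementing pattern; the byte wraps from 0xFF to 0x00. */
    pattern = 0;
    for (i = 0; i < len; i++)
        base[i] = pattern++;

    /* Read the pattern back and compare every byte. */
    pattern = 0;
    for (i = 0; i < len; i++) {
        if (base[i] != pattern++)
            return 0;
    }
    return 1;
}
```

On the target this would be called with the External SRAM base and size, for example test_ram_bytes((volatile uint8_t *)0x300000, 0x100000). On a PC it can be exercised against an ordinary array. Because the pattern repeats every 256 bytes, this test would not catch an address line fault that aliases locations a multiple of 256 bytes apart.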
Made significant improvements to above code and checked it into GitHub here.
-
Speeding up the CPU
08/22/2020 at 10:00 • 0 comments
The 68000 CPU IC had a pin called DTACK* (Data Transfer Acknowledge). When you grounded DTACK* the CPU ran at full speed. If you wanted to slow down the CPU for a slower external device you pulled the pin high until the device finished.
The FPGA Core for the 68000 CPU has a similar pin "clkena_in" which flips the sense of DTACK* and "stretches" the clock when it is low and enables the CPU clock when high. The pin was set to high in my design and there were no wait states. This didn't work for External SRAM so I lowered the clock speed to 16.7 MHz which allowed the CPU to access slower External SRAM correctly.
Adding Wait States to Speed up the CPU
I added a wait state counter for the clkena_in signal which is activated when the CPU tries to access External SRAM. This will come in handy if I want to get the external SDRAM working. Here's the code for the wait state counter plus part of the CPU instance.
-- Wait states for external SRAM
w_cpuclken <= '1' when n_externalRam1CS = '1' else
              '1' when ((n_externalRam1CS = '0') and (w_wait_cnt >= "0100")) else
              '0';

-- Wait states for external SRAM
process (i_CLOCK_50,n_externalRam1CS)
begin
    if rising_edge(i_CLOCK_50) then
        if n_externalRam1CS = '0' then
            w_wait_cnt <= w_wait_cnt + 1;
        else
            w_wait_cnt <= "0000";
        end if;
    end if;
end process;

CPU68K : entity work.TG68KdotC_Kernel
    port map (
        clk       => w_cpuClock,
        nReset    => w_resetLow,
        clkena_in => w_cpuclken,
As a result I was able to move the CPU speed back to 25 MHz with no wait state for any other accesses.
The External SRAM chip select is now 120 ns (3 CPU clocks of 40 ns each at 25 MHz). It could easily be made shorter since my External SRAM is 45 ns parts. It would still need additional time for propagation delays and setup times, so I will leave it as is for the moment.
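The behavior of the wait-state counter can be checked with a small software model. This is a simplified C sketch of the counter logic only: it assumes w_wait_cnt is 4 bits wide (as the "0000" and "0100" literals suggest) and does not model how the CPU core samples clkena_in. The struct and function names are invented for the example.

```c
/* Simplified model of the wait-state logic: the counter increments on each
   50 MHz rising edge while the External SRAM chip select is low, and the
   CPU clock enable goes high once the count reaches 4. */
struct wait_gen {
    unsigned cnt; /* models the 4-bit w_wait_cnt */
};

/* Combinational clock enable, evaluated from the current counter value. */
int cpu_clken(const struct wait_gen *w, int n_cs)
{
    if (n_cs)            /* chip select inactive: never stall */
        return 1;
    return w->cnt >= 4u; /* active: stall until the count reaches "0100" */
}

/* One rising edge of the 50 MHz clock. */
void wait_gen_tick(struct wait_gen *w, int n_cs)
{
    if (n_cs == 0)
        w->cnt = (w->cnt + 1u) & 0xFu; /* wraps like a 4-bit vector */
    else
        w->cnt = 0;
}

/* Count the 50 MHz edges that pass before clken goes high
   at the start of an External SRAM access. */
unsigned stalled_edges(void)
{
    struct wait_gen w = { 0 };
    unsigned edges = 0;
    while (!cpu_clken(&w, 0)) {
        wait_gen_tick(&w, 0);
        edges++;
    }
    return edges;
}
```

In this model stalled_edges() returns 4, which is 80 ns of stall at 50 MHz. Changing the threshold in cpu_clken() is the software equivalent of changing the "0100" constant to shorten or lengthen the access.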
-
Improvements for the Cyclone V FPGA
08/21/2020 at 20:37 • 0 comments
The Cyclone V FPGA (PN: 5CEFA2F23) has a lot more internal SRAM than the EP4CE15. I was able to add 96KB of internal SRAM. Due to the TS2 memory map the new memory could not be contiguous with the lower RAM. That's because the Tutor ROM is located at 0x008000-0x00FFFF. The new internal SRAM is at 0x200000-0x217FFF.
I also got the external SRAM working - well sorta working. It is only 8 bits wide and the 68000 CPU does not do dynamic bus sizing so it has to be accessed as bytes. But I tested some locations and they worked fine. The External SRAM is from 0x300000-0x3FFFFF.
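The address ranges mentioned in this log can be sanity-checked with a few lines of C. This sketch covers only the three regions named above; the enum names and helper functions are illustrative and the rest of the memory map is lumped into REGION_OTHER.

```c
#include <stdint.h>

enum region {
    REGION_OTHER,
    REGION_TUTOR_ROM,     /* 0x008000-0x00FFFF */
    REGION_INTERNAL_SRAM, /* 0x200000-0x217FFF */
    REGION_EXTERNAL_SRAM  /* 0x300000-0x3FFFFF */
};

/* Classify a CPU address into one of the regions named in this log. */
enum region classify_address(uint32_t addr)
{
    addr &= 0xFFFFFFu; /* the 68000 has a 24-bit address space */
    if (addr >= 0x008000u && addr <= 0x00FFFFu)
        return REGION_TUTOR_ROM;
    if (addr >= 0x200000u && addr <= 0x217FFFu)
        return REGION_INTERNAL_SRAM;
    if (addr >= 0x300000u && addr <= 0x3FFFFFu)
        return REGION_EXTERNAL_SRAM;
    return REGION_OTHER;
}

/* Size of an inclusive address range in KB. */
uint32_t range_size_kb(uint32_t first, uint32_t last)
{
    return (last - first + 1u) / 1024u;
}
```

range_size_kb(0x200000, 0x217FFF) gives 96, matching the 96KB of new internal SRAM, and range_size_kb(0x300000, 0x3FFFFF) gives 1024 for the 1 MB External SRAM.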
-
GCC Assembler Options
08/18/2020 at 10:56 • 0 comments
The tools are here:
Needs path:
Help:
Add list file:
Produces listing:
Nice!
Information about the ELF format.
Jeff Tranter's Makefile splits the code into two ROMs. I've started to alter the code to make a MIF file for Quartus II.
Build fails at objcopy.
From the objcopy page:
The GNU objcopy utility copies the contents of an object file to another. objcopy uses the GNU BFD Library to read and write the object files. objcopy can be used to generate S-records by using an output target of `srec' (e.g., use `-O srec').
The failing line in Makefile is:
Eliminated the -I coff-m68k and it got past the line.
Need to shift down the output by 0x8000 to load properly into Quartus. Added to Makefile:
Output hex file looks like:
File matches Jeff's original file:
-
Building TUTOR with GCC Toolchain
08/12/2020 at 18:26 • 0 comments
In the last log I got the toolchain working on my Raspberry Pi to cross compile code for the 68K CPU.
In this log, I will try and assemble the TUTOR 1.3 code using the toolchain. I am using FileZilla to transfer the file to the Raspberry Pi. Here is the command line to assemble the code (adapted version of Jeff Tranter's command line):
./m68k-coff-as -m68000 -alms -a=tutor13.lst -o tutor13.o tutor13.s
When I run it I get a lot of errors starting with:
./m68k-coff-as -m68000 -o tutor13.o tutor13.s
tutor13.s: Assembler messages:
tutor13.s:5812: Error: value out of range
tutor13.s:5812: Error: Value of -132 too large for field of 1 bytes at 0xf82bad
tutor13.s:6822: Error: value out of range
tutor13.s:6822: Error: Value of 142 too large for field of 1 bytes at 0xf83421
tutor13.s:6838: Error: value out of range
tutor13.s:6838: Error: Value of 162 too large for field of 1 bytes at 0xf83449
Line 5812 is a short branch instruction:
BSR.S COMMAS20
I thought short branches were 16 bits. If so, then a branch of -132 should not be too far away. Same for the rest of the list. I wonder if there's a missing option on the assemble line to specify the size of the BSR.S instruction?
./m68k-coff-as -help
Usage: ./m68k-coff-as [option...] [asmfile...]
Options:
  -a[sub-option...]       turn on listings
                          Sub-options [default hls]:
                          c      omit false conditionals
                          d      omit debugging directives
                          h      include high-level source
                          l      include assembly
                          m      include macro expansions
                          n      omit forms processing
                          s      include symbols
                          =FILE  list to FILE (must be last sub-option)
  --alternate             initially turn on alternate macro syntax
  -D                      produce assembler debugging messages
  --defsym SYM=VAL        define symbol SYM to given value
  -f                      skip whitespace and comment preprocessing
  -g --gen-debug          generate debugging information
  --gstabs                generate STABS debugging information
  --gstabs+               generate STABS debug info with GNU extensions
  --gdwarf-2              generate DWARF2 debugging information
  --help                  show this message and exit
  --target-help           show target specific options
  -I DIR                  add DIR to search list for .include directives
  -J                      don't warn about signed overflow
  -K                      warn when differences altered for long displacements
  -L,--keep-locals        keep local symbols (e.g. starting with `L')
  -M,--mri                assemble in MRI compatibility mode
  --MD FILE               write dependency information in FILE (default none)
  -nocpp                  ignored
  -o OBJFILE              name the object-file output OBJFILE (default a.out)
  -R                      fold data section into text section
  --statistics            print various measured statistics from execution
  --strip-local-absolute  strip local absolute symbols
  --traditional-format    Use same format as native assembler when possible
  --version               print assembler version number and exit
  -W --no-warn            suppress warnings
  --warn                  don't suppress warnings
  --fatal-warnings        treat warnings as errors
  --itbl INSTTBL          extend instruction set to include instructions
                          matching the specifications defined in file INSTTBL
  -w                      ignored
  -X                      ignored
  -Z                      generate object file even after errors
  --listing-lhs-width     set the width in words of the output data column of
                          the listing
  --listing-lhs-width2    set the width in words of the continuation lines
                          of the output data column; ignored if smaller than
                          the width of the first line
  --listing-rhs-width     set the max width in characters of the lines from
                          the source file
  --listing-cont-lines    set the maximum number of continuation lines used
                          for the output data column of the listing
680X0 options:
  -l                      use 1 word for refs to undefined symbols [default 2]
  -m68000 | -m68008 | -m68010 | -m68020 | -m68030 | -m68040 | -m68060 |
  -m68302 | -m68331 | -m68332 | -m68333 | -m68340 | -m68360 | -mcpu32 |
  -m5200 | -m5202 | -m5204 | -m5206 | -m5206e | -m521x | -m5249 | -m528x |
  -m5307 | -m5407 | -m547x | -m548x | -mcfv4 | -mcfv4e
                          specify variant of 680X0 architecture [default 68020]
  -m68881 | -m68882 | -mno-68881 | -mno-68882
                          target has/lacks floating-point coprocessor
                          [default yes for 68020, 68030, and cpu32]
  -m68851 | -mno-68851    target has/lacks memory-management unit coprocessor
                          [default yes for 68020 and up]
  -pic, -k                generate position independent code
  -S                      turn jbsr into jsr
  --pcrel                 never turn PC-relative branches into absolute jumps
  --register-prefix-optional
                          recognize register names without prefix character
  --bitwise-or            do not treat `|' as a comment character
  --base-size-default-16  base reg without size is 16 bits
  --base-size-default-32  base reg without size is 32 bits (default)
  --disp-size-default-16  displacement with unknown size is 16 bits
  --disp-size-default-32  displacement with unknown size is 32 bits (default)
Report bugs to
Looking at the difference between BSR and BSR.S in the listing shows that the .S branch is being encoded with a byte (8-bit) displacement instead of a 16-bit one.
5810 ???? 6100 F70A     BSR EAZ
5811
5812 ???? 617C          BSR.S COMMAS20
5813
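The assembler errors make sense once the .S displacement is treated as a signed byte. A minimal C sketch of the range check (the function names are mine, and it ignores special displacement encodings, looking only at the numeric range):

```c
#include <stdbool.h>

/* True if the displacement fits in a signed 8-bit field (.S branch). */
bool fits_short_branch(long displacement)
{
    return displacement >= -128 && displacement <= 127;
}

/* True if the displacement fits in a signed 16-bit field (word branch). */
bool fits_word_branch(long displacement)
{
    return displacement >= -32768 && displacement <= 32767;
}
```

All three values from the error messages (-132, 142 and 162) fail fits_short_branch() but pass fits_word_branch(), which is consistent with the "too large for field of 1 bytes" complaints.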
Toolchain Problem
It looks like the assembler did install but the C compiler is missing and it really would be worthwhile to have a C cross-compiler that lets me generate code under Windows. Trying a different way of making the gcc toolchain (Building the 68000 cross compiler - Automated).
Another Try at the GNU GCC Toolchain
I went onto the 68000 Assembly Language programming group on Facebook and got a pointer to Steve Moody's gcc for 68k GitHub page, which eventually got me running on the GCC toolchain.
The toolchain is running in VirtualBox under Linux.
Steve has a Makefile that does most of the work. The one thing I did have to change was to set the variable PWD to the install path. I got my son, who's a whiz at all things Linux, to help me figure out why it was crashing.
PWD = /home/doug/m68k-elf-toolchain
If you try it, your path will not be the same, and there are certainly better ways to set it. You can get the path with the pwd command, but you have to be in the right folder in the terminal.
I was rewarded with a GCC toolchain that assembled code correctly.
I then installed srecord with:
sudo apt install srecord
That installs the srecord program srec_cat. It installed version 1.58.D001 on my install.
Typing the line to assemble:
/opt/m68k-elf/bin/m68k-elf-as monitor.s -o monitor.o
Created code.
-
Install gcc cross compiler toolchain on a Raspberry Pi
08/12/2020 at 11:26 • 0 comments
I have a Raspberry Pi that I use as a local network router to isolate my lab from the rest of the network. It is a Raspberry Pi 3 with built-in wireless, connected to an Ethernet hub on the test bench in the lab. I'd like to use that as my compile machine. The machine has the latest versions of binutils and gcc but the instructions from here indicate that support for the 68K was discontinued in the past so it's better to install older versions of binutils and gcc.
Dave Shepperd reported that the last binutils version to support the m68k-coff target is 2.16.1. Using GCC >= 4.3 also seems to be a bit trickier as it requires additional software to be installed (e.g. GMP).
I followed these instructions to free up space on the Raspberry Pi (it had LibreOffice and other things I don't use on the machine). I have a bit under 1GB of free space left.
I am running Raspbian version:
pi@raspberrypi:~ $ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 9 (stretch)"
NAME="Raspbian GNU/Linux"
VERSION_ID="9"
VERSION="9 (stretch)"
VERSION_CODENAME=stretch
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"
My Pi was already set up for SFTP so I used FileZilla to connect to the Pi. Downloaded to:
/home/pi/Downloads
tar -xjf binutils-2.16.1a.tar.bz2
tar -xjf gcc-4.2.4.tar.bz2
rm *bz2
The binutils built fine but gcc had a build error. The assembler does look like it's present:
pi@raspberrypi:/opt/m68k/m68k-coff/bin $ ls -al
total 14528
drwxr-xr-x 2 root root    4096 Aug 12 08:15 .
drwxr-xr-x 4 root root    4096 Aug 12 08:15 ..
-rwxr-xr-x 2 root root 1597980 Aug 12 08:15 ar
-rwxr-xr-x 2 root root 2697432 Aug 12 08:15 as
-rwxr-xr-x 2 root root 2372272 Aug 12 08:15 ld
-rwxr-xr-x 2 root root 1699616 Aug 12 08:15 nm
-rwxr-xr-x 2 root root 2541904 Aug 12 08:15 objdump
-rwxr-xr-x 2 root root 1597988 Aug 12 08:15 ranlib
-rwxr-xr-x 2 root root 2343228 Aug 12 08:15 strip
pi@raspberrypi:/opt/m68k/m68k-coff/bin $ as -version
GNU assembler (GNU Binutils for Raspbian) 2.28
Copyright (C) 2017 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `arm-linux-gnueabihf'.
Note, this is the wrong assembler! Typing plain as ran the Pi's native ARM assembler (binutils 2.28), not the m68k one. The correct path is:
pi@raspberrypi:/opt/m68k/bin $ ls
m68k-coff-addr2line  m68k-coff-ld       m68k-coff-ranlib   m68k-coff-strip
m68k-coff-ar         m68k-coff-nm       m68k-coff-readelf
m68k-coff-as         m68k-coff-objcopy  m68k-coff-size
m68k-coff-c++filt    m68k-coff-objdump  m68k-coff-strings
pi@raspberrypi:/opt/m68k/bin $ ./m68k-coff-as -version
GNU assembler 2.16.1
Copyright 2005 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License. This program has absolutely no warranty.
This assembler was configured for a target of `m68k-coff'.
Create a simple file:
pi@raspberrypi:/opt/m68k/bin $ more test.asm
 nop
Assemble:
sudo ./m68k-coff-as test.asm
Produces a.out file.
pi@raspberrypi:/opt/m68k/bin $ ls -al
total 23472
drwxr-xr-x 2 root root 4096 Aug 12 13:12 .
drwxr-xr-x 8 root root 4096 Aug 12 08:15 ..
-rw-r--r-- 1 root root  286 Aug 12 13:12 a.out
Assembling with list option.
sudo ./m68k-coff-as test.asm -a=test.lst
List file is:
more test.lst
68K GAS test.asm page 1

   1 0000 4E71          nop
   2
^L68K GAS test.asm page 2

DEFINED SYMBOLS
    e0:00000000 .text
    e1:00000002 .data
    e2:00000002 .bss

NO UNDEFINED SYMBOLS
Looks right!
Still fits on the 8GB SD card.
pi@raspberrypi:/opt/m68k/bin $ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/root       7.2G  6.5G  407M  95% /
devtmpfs        465M     0  465M   0% /dev
tmpfs           469M     0  469M   0% /dev/shm
tmpfs           469M   19M  451M   4% /run
tmpfs           5.0M  4.0K  5.0M   1% /run/lock
tmpfs           469M     0  469M   0% /sys/fs/cgroup
/dev/mmcblk0p1   44M   23M   22M  52% /boot
tmpfs            94M     0   94M   0% /run/user/1000
Now, onto rebuilding TUTOR.
-
More SRAM than TS2
08/11/2020 at 20:44 • 0 comments
The resource utilization in the 5CEFA2F23 FPGA Card is:
There's more than enough left over block memory to have 128KB of SRAM. The only complication is that the TS2 memory map wasn't designed for more than 32KB. That is because it places the EPROM above the SRAM in the memory space.
The monitors TS2BUG and TUTOR could be re-assembled to move the ROM to higher memory.
The 5CEFA2F23 FPGA Card also has 32MB of DRAM which could be used as system memory. It's organized as 16M x 16 bits, which is ideal for the 68000's 16-bit data bus. The 24-bit address space of the 68000 only allows 16 MB to be addressed so not all of the DRAM could be mapped to the memory space. And it might require wait states for memory access.
Alternately, the fast SRAM in the FPGA could function as a cache memory but would require a much more complicated controller. DRAM does bursts quite nicely and it could fill a cache line pretty efficiently.
I think I will start with getting 128KB of SRAM working first.
Changes to Monitor Code
This ought to be as easy as changing the .ORG statement to point to the new ROM base address and changing the address decode in the FPGA. The assembly code to change the ROM base address is:
.ORG 0xF80000
Here are the instructions to make the ROM file.
Changes to memory map
The peripherals are currently located at 0x01______ addresses and should be moved up as well. I will put the ROM from 0xF80000 to 0xFFFFFF and the I/O to start at 0xF00000, which would allow up to 512KB of ROM.
The new ROM chip select VHDL code is:
w_n_RomCS <= '0' when (cpuAddress(23 downto 19) = x"F"&'1') else -- xF80000-xFFFFFF (MAIN EPROM)
             '0' when (cpuAddress(23 downto 3) = x"00000"&'0') else -- X000000-X000007 (VECTORS)
             '1';
The new RAM chip select VHDL code is:
w_n_RamCS <= '0' when ((w_n_RomCS = '1') and (cpuAddress(23 downto 17) = x"0"&"000")) else -- x000008-x01ffff (128KB)
             '1';
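To double-check the address ranges in the comments, the two chip selects can be modeled in C by comparing the same address bit fields. This is a software sketch of the decode only, with invented function names. The active-low VHDL signals are represented as booleans that are true when the device is selected.

```c
#include <stdbool.h>
#include <stdint.h>

/* ROM select: true for 0xF80000-0xFFFFFF and for the vectors at 0x000000-0x000007. */
bool rom_selected(uint32_t addr)
{
    addr &= 0xFFFFFFu;           /* 24-bit address bus */
    if ((addr >> 19) == 0x1Fu)   /* cpuAddress(23 downto 19) = "11111" */
        return true;
    if ((addr >> 3) == 0u)       /* cpuAddress(23 downto 3) all zero   */
        return true;
    return false;
}

/* RAM select: true for 0x000008-0x01FFFF (ROM takes priority on the vectors). */
bool ram_selected(uint32_t addr)
{
    addr &= 0xFFFFFFu;
    return !rom_selected(addr) && (addr >> 17) == 0u; /* (23 downto 17) all zero */
}
```

The boundary cases are the interesting ones: address 0x000007 selects ROM and 0x000008 selects RAM, 0x01FFFF is the last RAM byte, and 0xF7FFFF is just below the ROM window.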
I saved new versions of both the monitor source code and FPGA and am working with the new versions.
The I/O map changes to:
PDI1    = 0xF00000  | PARALLEL PORT ADDRESS
PITCDDR = 0xF00009  | PORT C DATA DIRECTION REGISTER
PITPCDR = 0xF00019  | PORT C DATA REGISTER
PITTCR  = 0xF00021  | TIMER CONTROL REGISTER
PSTATUS = 0xB       | PRINTER STATUS
PBDATA  = 3         | PRINTER CONTROL--BUSY,PAPER,SELECT
PDATA   = 1         | PRINTER DATA
SER1    = 0xF00040  | TERMINAL
SER2    = 0xF00041  | SERIAL PORT2 ADDRESS
I need to get the 68K assembler working. Jeff Tranter had a Linux build he used for his assembly. I will probably try and get it installed on a Raspberry Pi. Here are the installation instructions.