The time had come to make a new prototype, one with all the hardware needed to accomplish the goals of this project! The front end was well proven at this point, and just needed a slight shrink to fit under an off-the-shelf RF shield. The ADC had always behaved well during my tests, but it needed a new (and untested) clock generator since the one I had prototyped with wasn't suited for it. Most disturbing of all, I needed to design with an Artix-7 FPGA and DDR3 RAM in BGA packages for the first time.
Tackling that last point first, I saw way too much risk in putting these BGA parts down on a board that I hand-stencil and reflow-solder on a hot plate. Not just that, but I only had three months until I had to submit this project to graduate from my electrical engineering program, and I had no experience working with DDR3 or even large BGA packages. I committed to learning these skills for the next revision, but had to find something to tide me over in a hurry.
Enter, the TE0712-02 FPGA module. This bad boy had two DDR3 ICs, the second largest Artix-7 part, and only needed a 3.3V rail to operate. As my favorite circuits professor put it, "Simplicity itself".
The top level of the schematic serves as a good block diagram for the entire prototype. The four front end outputs go to the ADC to be digitized, and the data and clock lines carrying the sample data are connected to the FPGA module, which also takes in all the control signals for the board. The FPGA module also connects to a standard PCIe x4 edge connector, which I used a project template to generate along with the specific board shape needed to comply with the PCIe mechanical specifications. I included the PCIe mounting bracket (SHLD3) as a BOM (bill of materials) only part (not soldered on the board) so I don't forget to order it! I included a 74 series logic buffer (U15) to drive the probe compensation terminals (which I couldn't find the proper part for, so I just used right angle 0.1" headers) from the FPGA. PCIe provides two power rails: 3.3V and 12V. The 3.3V rail is limited to 3A, so I decided to use the 12V rail to power the front end in case it drew more current than the 3.3V rail alone could supply. To do this, I used a buck converter (U1) to step the voltage down to 5.5V, which was regulated to 5V with a linear regulator to reduce switching noise in the front end's power rail. I placed shunt resistors and test points on every regulator's input and output, so I can easily measure the current draw on each rail and the efficiency of the regulators.
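As an aside, here's the arithmetic those shunts and test points enable. The current through each shunt follows from Ohm's law, and a regulator's efficiency is just output power over input power:

$$I = \frac{V_\text{shunt}}{R_\text{shunt}}, \qquad \eta = \frac{P_\text{out}}{P_\text{in}} = \frac{V_\text{out} \cdot I_\text{out}}{V_\text{in} \cdot I_\text{in}}$$

Measure the voltage across the shunts on a regulator's input and output, convert to currents, and the two power products give the efficiency directly.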
Moving on to the front end block, we see four front end channels, which will lead us deeper into this hierarchical rabbit hole. On the top of this sheet, we have the voltage regulation for the front end, both positive (U2, an LDO) and negative (U3, a charge pump with integrated LDO) 5V rails. A -2.5V reference for every channel's DC bias circuits is generated from this negative 5V rail using a divider (R14, R17) and buffered with an opamp (U4); this replaces the per-channel divider from the previous design. A quad channel DAC (U5) is used to trim the DC bias values for each channel. This DAC is powered by (and referenced off of) the 5V rail, so it needed a level-shifting circuit (Q1, Q2, and pull-up resistors) to match the 3.3V level of the I2C bus going to the FPGA. I also added test points on every signal going to the FPGA; better to be safe than sorry!
Last level of hierarchy before we get to the juicy stuff, I promise! On the left is the BNC input to the channel (J3); this is where the rubber meets the road (or where the scope meets the probe, ha). The RF shield I mentioned earlier comes in two parts, a frame (SHLD1) which is soldered onto the board, and a cover (SHLD2) which is press fit onto the frame. As with the PCIe bracket, I included the cover in the schematics and therefore the BOM so I don't forget to order it along with the rest of the parts. In order to test the front end channels, I included some UFL connectors (J1, J2) to bring the PGA output out as two 50Ω signals. When these connectors are used, the resistors (R28, R29) are removed to disconnect the output from the rest of the board to avoid doubly terminating the signal.
The input circuit was slightly simplified, removing the 6pF input cap right after the gas discharge tube and rolling it into the attenuator to get 15pF of input capacitance (provided by C32) when the relay is on. A variable capacitor (C31) was added to allow the capacitance on the non-attenuated branch to be changed to better match the attenuated branch. I also reduced the number of resistors in the attenuator, since it made no sense to worry about hitting an exact value when the resistors themselves have a 1% tolerance!
Continuing the trend of reducing the part count of this design, I replaced the two resistors that it took to get 900kΩ with a single 910kΩ, and replaced the 100kΩ that made up the other half of these dividers with 90.9kΩ to get the 1MΩ total (999.9kΩ, close enough!) needed. I also moved the protection diode over to the gate of the JFET (Q3) so that it wouldn't clamp on a DC over-voltage condition that the input would otherwise be able to handle (due to the divider on the opamp input, that node can handle up to 50VDC). I changed the value of R35 from 3kΩ to 1kΩ and C19 from 330pF to 1uF, using simulations to find replacement values that were already used elsewhere in the design. I did this to reduce BOM line items, which can reduce the cost of assembly when going to a contract manufacturer. As a result of these changes, C20 also changed from 100nF to 1uF. I also split the 50Ω resistance at the emitter of Q4 into two 100Ω resistors (R37, R38) to better handle the 40mA maximum current flowing through it. Aside from using a buffered -2.5V reference, I added a trimmer potentiometer to the DC offset feedback divider to better tune the DC gain to match the AC gain of the circuit.
After that laundry list of a change log, I am pleased to tell you that nothing has changed in the PGA sheet. Better yet, we've completed our dive into the front end block. Now to resurface and explore the ADC and its clock generator, which fit in just one page!
The hierarchy may be shallow, but this page wasn't smooth sailing! Luckily enough, not much has changed with the ADC or its voltage regulation. The one major change was swapping the datasheet-recommended input network for 100Ω resistors to terminate the PGA outputs properly.
The PLL I had already prototyped with would provide very marginal performance as the ADC's clock, so I looked for something that would really make the most of the ADC. It needed to have low jitter, with a target to beat of 760 fs RMS measured from the previous circuit. To drive the ADC optimally, it also needed to have an LVDS or LVPECL differential digital output. I also wanted the simplest solution I could find to minimize the chances of this failing, since I likely wouldn't have time to make a new revision of this board by the capstone project deadline (so I was minimizing the chances of ME failing as well!).
In came TI to my rescue (a sentence rarely uttered since the start of the component shortages) with a wonder-chip. The LMK61E2 (U13) includes its own reference oscillator and loop filter, can output LVPECL, is configurable via I2C, and has a typical RMS jitter of 90 fs! This amazing chip cost almost twice as much as the previous solution, but it was a small price to pay for something that was almost sure to work. The "almost" in that sentence is why I included UFL connectors to pipe in a clock from somewhere else, or more optimistically, measure the clock generated by this chip. The LVPECL output requires a specific termination scheme; I used the convenient LVDS-like termination (R71, R74, R75) described on page 9 of this Renesas app note. I also added local regulation (U12) for this part to ensure a noise-free 3.3V rail, as opposed to the main 3.3V rail, which is taken directly from the host PC (or Thunderbolt enclosure!) through the PCIe connector.
All roads lead to Rome and all connections lead to the FPGA! The FPGA module has three Samtec LSHM connectors which bring out the FPGA's IO banks and their voltage rails. All the bank IO voltages are set to 3.3V except bank 16, which must be 2.5V (regulated by U14, an LDO) to enable on-chip termination for LVDS. I brought the JTAG lines out to a header (J6) that matches the pinout of my programmer and included four LEDs (on a separate page to save space) for debugging.
Here's how all of that looks on a board! I decided to get fancy with it and digitized my group members' initials to include in an ENIG gold finish. And speaking of my group, what kind of software magic did the rest of them get up to? Stay tuned for the next posts, which will be all about the software that makes this thing tick!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
Now that I knew that the throughput to the PC could match the ADC’s rated sample rate of 1 GS/s, I had to make a circuit that clocked the ADC at that rate as well. This circuit needed to output at 1 GHz with very low jitter, as any jitter on the ADC sample clock will turn into noise during the conversion process.
The heart of the clock generation circuit is the phase locked loop (PLL). Without getting into too much detail, the PLL compares the phase of a low-frequency reference (generally from a crystal oscillator) with a divided-down copy of the high frequency generated by a voltage controlled oscillator (VCO), which it tunes until the two match. By changing the division settings, any frequency can be synthesized, with the accuracy and jitter characteristics of the reference conferred onto the output.
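In equation form, with R as the reference divider and N as the feedback divider (the numbers here are just one illustrative division plan for hitting 1 GHz, not the actual register settings):

$$f_\text{out} = \frac{f_\text{ref}}{R} \times N, \qquad \text{e.g.}\ \frac{16\,\text{MHz}}{16} \times 1000 = 1\,\text{GHz}$$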
Looking at the other scopes that use the same ADC, I found that many also used the ADF4360-7 in their clock generation circuit. I did some research on the part and it seemed to be the cheapest solution that would give me the 1 GHz output I needed. This chip had an integrated VCO, so the only other parts I needed were the reference oscillator and some passives. Saving me loads of digging into the datasheet, Analog Devices had a tool for calculating all the values of the passives as well as the register values to program for a given output frequency.
That sticky note yellow colour... The navy blue connections... That's not KiCad! It's true, it was at this point that I was offered an Altium license through my school. With the size and scope of the next board already in mind, and a year of internships spent working with Altium, I decided to switch over. As for the design, I chose to use two 50Ω resistors (R5, R6) to bias the output as opposed to a more complicated matched network. The reference oscillator (Y1) was a 16 MHz crystal oscillator, which came temperature compensated for added frequency stability, and the LDO (U2) was a low noise part to avoid noise on the power rails affecting the performance of the circuit. Decoupling cap values were copied from the part's evaluation board and the rest of the passive values were taken from the design tool.
Pictured here, a 1 GHz postage stamp! I didn't have any decent way to test it on its own, so I hooked the SPI bus up to the rest of the oscilloscope prototype and updated the software to set all the registers on the chip at boot.
First I connected the RF output to a balun on a scrap ADC board to generate a single ended output that I could test on my spectrum analyzer. I then verified that it output at 1 GHz and used KE5FX's excellent GPIB toolkit to measure its phase noise performance against the simulation values from the tool as well as calculate total RMS jitter.
Here it is against my RF signal generator (in pink). The 100 Hz range was off, but the other ranges matched the simulations pretty well. The RMS jitter from 1.00kHz to 1.00MHz (didn't have a screenshot of this range, so the numbers are different here) was 760 fs vs. a simulated value of 580 fs. All of this looked promising, so I moved on to functional testing.
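For reference, the RMS jitter figure comes from integrating the measured phase noise L(f) (in dBc/Hz) across the offset band and scaling by the carrier frequency f_c, which here is 1 GHz:

$$\sigma_\text{RMS} = \frac{1}{2\pi f_c}\sqrt{2\int_{f_1}^{f_2} 10^{L(f)/10}\,df}$$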
I hooked up the RF output into the ADC board through the two UFL connectors I included for differential inputs and updated the FPGA code to reflect the new clock rate. I then ran a quick capture to a CSV file, and the script hung! That was odd, so I started debugging. Eventually, I found that the ADC wasn't outputting a clock at all! I looked through the clocking section of the ADC datasheet and this line jumped out at me:
"For differential sine wave clock input the amplitude must be at least ± 0.8 Vpp."
A quick trip to the dBm conversion table later, I found that I needed at least 2 dBm of output power. I had about -5 dBm! The matched output network I mentioned earlier would net me an output of -2 dBm according to the datasheet, which is still not up to spec.
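If you'd rather skip the table, here's the math for a sine wave into a 50Ω load:

$$P = \frac{V_\text{rms}^2}{R} = \frac{\left(0.8\,\text{V}/(2\sqrt{2})\right)^2}{50\,\Omega} \approx 1.6\,\text{mW} \approx 2\,\text{dBm}$$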
My conclusion is that the circuit would probably work, given it's what the other manufacturers use, but it would have to be very marginal. The goal of this project is to make a better oscilloscope at the same price point by offloading so many costly aspects of a benchtop oscilloscope to the user's PC. This allows me to spend more on components to get the performance I want out of this design. With this in mind, I decided not to bother squeezing out every dBm just to reach the bare minimum the ADC would function on, and use a different clock generator that would make the most of the ADC.
Unfortunately, I was running out of time in my final term to get this project done. I had to go straight to the final design, a x4 PCIe card that incorporated all the other blocks I've written about, as well as a new (and untested) clock generator! Follow this project for the whopper of a project log that's coming up, as well as some posts about all the software work the rest of my group was doing as I was designing the hardware!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
Now that the front end was in a satisfactory state, it was time to revisit the architecture of the digital interface. At this point it had been over a year since I designed that board. I chose a USB 3 Gen 1 interface capable of 400 MB/s (which proved to be 370 MB/s in practice) as a stopgap to develop on until a USB 3 Gen 2 chip was released that could match the 1 GB/s throughput of the raw ADC data. Unfortunately, the FX3G2 on Cypress's USB product roadmap failed to materialize, leaving me with few options.
I considered using the Cyclone 10 GX (which is the cheapest FPGA with the needed 10 Gb/s transceivers) with USB 3 Gen 2 IP, but even this couldn't reach 1 GB/s, topping out at 905 MB/s according to the vendor's product sheet. I considered PCIe, which is super common on FPGAs, with free IP and loads of vendor support! However, that would seem to limit this to desktops, since most people don't have PCIe slots on their laptops.
They did have the next best thing though! Thunderbolt 3 (and now USB 4 and Thunderbolt 4) supports up to four lanes of PCIe Gen 3 at a maximum throughput of 40 Gb/s. Perfect! Unfortunately, though the chips themselves are freely available on Mouser, the datasheets are not. I didn't worry about that yet, as I could prototype the system as if it was just a PCIe card by using an external GPU enclosure. This review and teardown really showcased how simple the extra Thunderbolt 3 circuitry was, so I didn't feel like it was a big stretch to incorporate it once the PCIe design was tried and true. I bought the enclosure and got to work finding a new FPGA to do all the PCIe magic.
I used this list of FPGA development boards to find the most affordable way to start prototyping with PCIe. This turned out to be the Litefury, an Artix-7 development board which appears to be a rebadged SQRL Acorn CLE-215+ (an FPGA cryptomining board). Although this board had the four lanes of PCIe I needed, it came in an M.2 form factor so it needed an adaptor. It didn't have a built in programmer either, so I used this one, which was the cheapest one that worked directly with Vivado (Xilinx's IDE for their FPGAs).
Shown above is the Vivado block diagram of the Litefury example design, which allows DMA access from the PC to the onboard DDR3 memory and vice versa. I would use this to compare the transfer speeds when connected directly to a desktop PC against those through Thunderbolt when the board was installed in the enclosure. I installed the XDMA drivers (which I had to enable test mode in Windows for, since the driver is unsigned) and ran a basic transfer with the maximum transfer size of 8 MB.
It took 7.072 milliseconds to receive 8 MB, which is just over 1.1 GB/s! Best of all, this number didn't budge when I tested it over Thunderbolt!
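The back-of-the-envelope math, assuming the 8 MB transfer is really 8 MiB (and that GB here means GiB):

$$\frac{8 \times 2^{20}\,\text{bytes}}{7.072\,\text{ms}} \approx 1.19 \times 10^9\,\text{B/s} \approx 1.10\,\text{GiB/s}$$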
This inspired me to finally give this project its name: ThunderScope!
Follow this project to catch my next post on designing a 1 GHz PLL to take advantage of this blazing fast transfer rate, and then promptly learning my lesson about cribbing off the other oscilloscope manufacturers!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
It was time to see if the third time really was the charm and test the newest revision of the front end! The first task was to test the front of the front end (FFE) - the coupling circuit, attenuators and input buffer.
Look ma no probes! I started off by verifying the DC bias voltage at the output, which was just about the 2.5V I expected. The exact value of the bias voltage isn't important as it will be matched by the trimmer DAC once the channel is calibrated. I tested the AC coupling by adding a DC component to the signal, which caused no change to the DC voltage at the output. Next, I enabled DC coupling and confirmed that this DC component was now added to the bias voltage at the output. I then measured the DC gain, which was just under unity. After the coupling tests, I switched on the attenuator and was greeted with a flat output - no oscillations this time! I cranked my function generator to the highest voltage it could do, and lo and behold I could see the signal again, now attenuated by a factor of 100.
I then connected the FFE to the PGA and used the front end tester board to test the frequency response of the whole front end. I did this to avoid loading down the FFE’s buffer circuit with the high input capacitance (13 pF) of an oscilloscope input.
The frequency response certainly looked more promising than the previous attempts! The bandwidth was about 230 MHz, out of the 350 MHz promised by the simulations. This alone wouldn't be too much of an issue if I scaled back the bandwidth requirement to 200 MHz. The real issue here is the flatness of the response, which is over ±0.5 dB when it should ideally be ±0.1 dB. That means that on a scope with this front end, a 100 MHz clock would look 10% larger than a 32 MHz clock!
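To put response flatness in amplitude terms, the ratio between two points on the response is:

$$\frac{A_2}{A_1} = 10^{\Delta_\text{dB}/20}, \qquad 10^{1/20} \approx 1.12$$

So two frequencies sitting at opposite extremes of a ±0.5 dB ripple (1 dB apart) differ by about 12% in displayed amplitude, which is where the roughly-10% figure above comes from.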
These peaks and valleys in the frequency response could have been caused by parasitics (unwanted inductance and capacitance) in the layouts of the two boards and in the connection between them. To reduce these parasitics and improve the bandwidth and flatness of the frequency response, I combined both FFE and PGA into one front end board, moving all the parts closer together to shrink the layout.
This new board improved the bandwidth to 260 MHz and the flatness to 0.25 dB. This was clearly a step in the right direction, but also showed that the likely culprits were the components on the board. I resolved to tweak the component values to improve the response later, but was satisfied enough to keep this design and continue on to a very exciting new development in this project - breaking the 1 GB/s barrier!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
At this point, there was one big issue with the front end. The attenuators could not be switched in without causing the whole circuit to oscillate! This issue was compounded by the maximum 0.7 V output of the PGA as well as the massive cost of the design (three relays and an unobtainium opamp don't come cheap). Since I already had to use digital gain to boost the output of the PGA, I decided to remove the opamp gain stage present in the current front of front end (FFE) board and replace it with a unity gain (x1) buffer. Using a unity gain buffer would allow me to remove one of the attenuators, as it would not need to scale the input voltage just to gain it up anyway. I would also need to use an active level shifting circuit instead of the resistive divider to avoid losing half the signal shifting it up to a DC level of 2.5V. Below is the spreadsheet I used to plan out the attenuation and gain needed for all the voltage division settings.
Let's take a look at the schematic, starting from the input coupling and attenuation block. I chose to remove the 50Ω termination relay to lower cost per channel since this wasn't a feature often used or provided on entry level scopes like this one. The move to one attenuator also saved another relay's worth of materials cost, and I replaced the mechanical relay used for the coupling cap with a solid state relay (U2) to further reduce cost. The input coupling cap and its relay were moved from behind the attenuator to in front of it. This maintains consistent input impedance behavior in AC-coupled mode regardless of the attenuator state, as before it would go from infinite resistance at DC to the 1 MΩ impedance of the attenuator when the attenuator was switched on.
Taking inspiration from the example oscilloscope circuit on page 34 of the LMH6518 datasheet, I used a JFET (Q1) as an AC-coupled input buffer alongside an opamp (U1) to handle the DC portion of the signal while adding the 2.5V offset needed for the PGA input. A JFET was a great choice for a front end buffer since they have very high input impedance and contribute very little noise to the signal. I used a clever circuit from page 34 of Jim Williams' AN47 application note to automatically bias the JFET at IDSS. This point is defined as the current at which the voltage between the gate and source is zero, resulting in a gain of exactly one - great news for our buffer! The circuit works by having the opamp (U3) adjust the current through the JFET using the BJT (Q2) until the filtered DC voltage at the output is equal to the DC component of the input (generated by U1), which by the definition above results in IDSS!
Hopefully this mashup of two interesting circuits makes for a working front end! Join me in the next project log where I go through the testing and results for this board and talk about the next steps I took to perfect this design.
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
Now that the FPGA code was done, I could finally assemble and test the whole system. There were many untested blocks at this point, so each block was tested incrementally to pinpoint any issues. Once these incremental tests were done, the final test would be hooking up a signal to the front end and getting the sampled signal data back to the host PC.
The first of the incremental tests I did on the system was to turn on a relay in the front end. This would confirm that the FT2232 chip as well as the FT2 Read interface, FIFO and I2C FPGA blocks were working correctly. I figured out which bytes to send based on the IO expander IC's datasheet and made a quick Python script using pyserial to send the data (this interface on the FT2232 looks like a serial port to the PC). I executed the script and heard the clack of the relay on the front end board. It worked!
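A minimal sketch of that script is below. The port name and command bytes are placeholders (the real values come from the IO expander's datasheet and the command framing my FPGA design expects), but the overall shape is this simple:

```python
import serial

# All values below are illustrative placeholders, not the real protocol.
PORT = "COM4"  # the FT2232 channel enumerates as an ordinary serial port

relay_on_cmd = bytes([
    0x01,  # protocol select byte: route to the I2C block (hypothetical)
    0x40,  # IO expander I2C address (hypothetical)
    0x01,  # output port register (hypothetical)
    0x01,  # bit pattern that energizes relay 1 (hypothetical)
])

with serial.Serial(PORT, timeout=1) as ft2232:
    ft2232.write(relay_on_cmd)  # the FPGA forwards this over I2C
```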
Next up, I would send a SPI command to the ADC to bring it out of power down mode. The ADC clock starts running when it goes into active mode, so I programmed the FPGA to blink the LEDs if it gets a clock from the ADC. This would confirm that the SPI FPGA block and ADC board worked. Some more datasheet searching and a new line of Python later, I was greeted with a well-deserved light show from the (too-bright) LEDs on the digital interface board.
I tested the maximum transfer rate next. To do this, I lowered the clock generator's frequency from 400 MHz (the theoretical maximum throughput of the FT601) down until the FIFO full flag (which I tied to an LED for this test) was not set while running transfers using FTDI's Data Streamer application. This resulted in a consistent data throughput of 370 MB/s. This also verified that the FT6 Write block was initiating transfers correctly when the requests came in from the host PC.
Up to this point, I didn't check the actual data coming in, only that the transfers were happening. I enlisted the help of a more software-savvy classmate (this scope would become our capstone project in a later term) to modify the data streamer code to dump a CSV file from the data received. I then set the ADC to output a ramp test pattern. Since this pattern was generated inside the ADC, it would test only the FPGA blocks and not the front end. I captured the data and got what I expected: a count up from 0 to 255 and back to 0, over and over again. I did a basic check through the file and found no missing counts, which meant the transfers were completing smoothly with no interruptions in the FIFO or in the USB interface.
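The missing-count check is simple enough to sketch here, assuming a capture file with one 8-bit sample per row and a ramp that wraps from 255 back to 0:

```python
import csv

# Load one 8-bit sample per row from the dumped capture (assumed layout).
with open("capture.csv", newline="") as f:
    samples = [int(row[0]) for row in csv.reader(f)]

# A clean ramp increments by exactly 1 each sample, modulo 256.
gaps = [i for i, (a, b) in enumerate(zip(samples, samples[1:]))
        if (a + 1) % 256 != b]

print("no missing counts!" if not gaps else f"first gap at sample {gaps[0]}")
```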
Finally, I hooked up my function generator to the front end, got together the set of commands needed to start sampling and sent them to the ADC. This would be the final test, a real signal in and sampled data out.
WE HAVE A PULSE! IT LIVESSSS! I was very happy to see the whole system working, but it had a long way to go to meet the goal of this project. First of all, the front end still only supported a select few voltage ranges since the attenuators didn’t work. Secondly, the ADC’s sample rate was limited to 370 MS/s (of the 1 GS/s it was capable of) by the FT601’s maximum sustained transfer rate of 370 MB/s. And of course, software needed to be made to stream, process and display the data in real time. In my next blog post, I’ll recount how I fixed the front end issues and lowered the system’s materials cost with a new architecture!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
Learning to write code for an FPGA requires a change in mindset compared to programming for microcontrollers. You are no longer writing code to execute in order, but are describing hardware that the FPGA can reconfigure itself to be. This is why FPGA code is called HDL (Hardware Description Language), of which Verilog and VHDL are the two most common. I decided to use Verilog on this project and used these great tutorials from Alchitry to get started with it.
I decided to approach the FPGA design like I did with the rest of the project and broke it up into small parts that could be tested on their own before connecting them as a system. And just like the hardware for this project, I started off by making a block diagram to plan the design out.
High bandwidth data flows in only one direction, from the ADC to the FT601 USB 3 interface. This data comes in at a clock rate too fast to process in normal FPGA fabric (fabric being the sea of logic elements that makes up the bulk of the FPGA) and must be "geared down" to a slower clock rate. This was done with a built-in peripheral called a SERDES that takes in 8 bits of data at the higher clock rate (LCLK) and outputs them in parallel at 1/8th the clock rate (DIVCLK). These 8 bits may not all belong to the same byte, so the ADC provides a frame clock (FCLK) to synchronize the SERDES with the byte boundaries. Since the ADC has 8 lanes of serial data, each lane has a SERDES and all of the parallel outputs are combined into 64 bits of parallel data.
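Here's the bookkeeping at the ADC's full 1 GS/s rate (everything scales down proportionally at the slower sample clocks this prototype actually ran at):

$$\underbrace{8\ \text{lanes} \times 1\ \text{Gbps}}_{\text{serial, DDR on a 500 MHz LCLK}} = \underbrace{64\ \text{bits} \times 125\ \text{MHz}}_{\text{parallel, on DIVCLK}} = 1\ \text{GB/s}$$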
This data is then written to a FIFO, which does a few important things in this design. First, it changes the data width to the FT601's native 32 bits. Second, it provides that 32 bit data at the clock of the FT601, which is double the speed of, but asynchronous to, the divided ADC clock of the input data (this is a clock-domain crossing, which is a big deal in FPGA land!). And third, it buffers the data during the microseconds that the host PC takes between reads, which would otherwise result in lost sample data. Luckily enough, I could automatically generate a suitable FIFO in Xilinx's ISE tools. I only had a few weeks of Verilog under my belt at that point, and I did not want to risk messing up such an important part of the design by writing it myself.
I wrote the FT6 Write block to work with the 245 Synchronous FIFO mode on the FT601 to write data to it. This was the simplest interface the FT601 offered, with only clock, data, and three control signals. When the PC requests data from the chip, TXE_N goes low (The host PC must poll the chip constantly to get a stream of data). The FT6 Write logic must then wait two cycles before taking WR_N low and sending the parallel data. The byte enable signal BE is always high since all the bytes need to be sent for maximum throughput.
Moving on to the low-speed path, the module I started with was FT2 Read. The FT2232 offered a very similar interface to the FT601, with the option to forego a clock and operate asynchronously! I chose to use this asynchronous mode assuming it would be even simpler than the synchronous mode I used on the FT601. In a way I was right: all the FT2 Read block did was drop RD# low when RXF# was low, read a byte of data on the parallel pins, and pull RD# high again. But then I encountered all sorts of timing issues, since nothing was synchronized with the clock I was using (32 MHz, fed to the FPGA from an oscillator on the board). I didn't break out the pins needed to switch to the synchronous mode on the digital interface board, so I had to bandage together a fix on the FPGA. To do this, I ran the control signals through a synchronizer and clocked in the data based on the now clock-synchronized signals. This worked, but still wasn't the most correct approach. I would recommend sticking to synchronous interfaces, and more importantly, routing out all the pins you can on the board so you don't lock yourself into a bad design decision.
The second FIFO in this design was a simple one. The input and output data widths were the same, and the read and write clocks were the same (making this a synchronous FIFO). I tried my hand at writing this one myself, which I did manage to do while searching madly through my old digital logic course notes! This FIFO allowed commands to be stored and then fed to the I2C/SPI transmit blocks in chunks of up to 256 bytes at a time, instead of the byte-by-byte flow of the interface from the FT2232.
Before starting to work on the SPI and I2C blocks, I made a breakout board for the digital interface and attached it to a logic analyzer so I could verify that I was sending data correctly. I used state machines to handle the logic behind these protocols as well as counters to divide the main 32 MHz clock into 8 MHz for SPI and 400 kHz for I2C. The first byte sent from the PC selects which protocol the rest of the data will be sent through. If SPI is selected, it configures a multiplexer for the SPI transmit block to send commands to either the front end or the ADC.
With all the FPGA code written, it was time for the moment of truth! In the next post I will explain how I tested the system as a whole, touch on the software needed to get the first bits of data out, and cover the bugs I uncovered once I got the data flowing!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
To stream the sample data as it comes in, it first needs to be converted into a format that the USB interface IC understands. USB interfaces generally come with specific applications in mind (e.g. USB to Ethernet, USB to SATA, USB to Camera Interface). If your application doesn't fit one of those niches, you must use one of the more general parts, which typically use a 32-bit wide parallel interface. This is great for transferring any kind of data, regardless of protocol or type. However, our ADC does not have a matching 32-bit wide output. This means we need a device in between the ADC and USB interface to perform this conversion. A microcontroller could only take in data once per clock cycle, which would require a blazing fast 1 GHz clock speed! A better choice for this task is an FPGA. These normally have a built-in peripheral called a SERDES, which can take in very fast serial data and turn it into slower parallel data, acting essentially as a high speed shift register. FPGAs also have block memory which can be used to buffer the data so that small changes in transfer speed (caused by varying latency on the host PC) won't cause missing data. The FPGA can also take in commands from the USB interface and convert them to SPI and I2C commands to control the rest of the system.
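To build some intuition for what the SERDES does, here's a toy model in Python (purely illustrative; the real peripheral is dedicated silicon, and the frame clock alignment is handled in the FPGA):

```python
def deserialize(bits, fclk_offset=0):
    """Toy 8:1 SERDES model: regroup a serial bit stream into bytes.

    fclk_offset stands in for the frame clock (FCLK) alignment; get it
    wrong and every byte straddles a boundary and comes out garbled.
    """
    bits = bits[fclk_offset:]
    return [int("".join(map(str, bits[i:i + 8])), 2)
            for i in range(0, len(bits) - 7, 8)]

# One lane carrying the bytes 0xA5 then 0x3C, MSB first:
lane = [1,0,1,0,0,1,0,1, 0,0,1,1,1,1,0,0]
assert deserialize(lane) == [0xA5, 0x3C]
assert deserialize(lane, fclk_offset=1) != [0xA5, 0x3C]  # misaligned
```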
The main page of the schematic serves as a good block diagram for this board. The Spartan 6 LX4 FPGA from Xilinx was selected as it was the cheapest part that could handle the 1 Gbps LVDS inputs from the ADC. It also came in an easy to solder TQFP package and had just enough logic to do the basic tasks required of it, making it a perfect fit! I was going to use the Cypress FX3 line of USB interfaces, but I ended up switching over to the FT601 since it was a good deal cheaper and came in a QFN instead of a BGA package. This made it possible to inspect and fix soldering issues. I also read this excellent little whitepaper from Numato Labs which inspired me to use a separate USB 2 interface for side-band communication (like issuing serial commands to the rest of the board) and programming the FPGA. This whitepaper also had a neat trick up its sleeve: directly connecting the USB 3 RX/TX lines, while using a hub on the USB 2 connection! This saved me from using a USB 3 hub, lowering the cost and complexity of this board.
The numbers Mason! What do they mean? This was the first board that I added reference designators to, which are the numbers that visually link parts on the board with their corresponding symbols on the schematic. This makes assembling and testing the board easier, as do the labels on the test points and LEDs. This was also the first board I built for this project that I had an extra set of eyes on, which helped me catch some mistakes and improve the design as a whole. Huge thanks to my awesome co-workers at the time who reviewed the board with me!
Unfortunately some issues always slip through, so don't put too much pressure on yourself to make Rev. 1 perfect! One such issue on this board was a classic pin swap - TMS & TDO on the JTAG interface to the FPGA were flipped. This made the FPGA impossible to program! It didn't even show up in OpenOCD, the software I was using to program it. I eventually fixed this by lifting both pins and soldering them to the opposite pads, a fix made possible by the TQFP package of the FPGA.
This didn't fix all of my programming woes, as for some reason the FPGA wouldn't load a bitstream from SPI flash. This meant I had to program it through JTAG every time it lost power, which made for an annoying development process. After searching OpenOCD documentation, reinstalling drivers, and pulling half my hair out, I finally looked back through my schematics and found the issue! I had accidentally given the same pin number to both a regular pin (which I was using to receive ADC data) and one of the boot mode select pins. KiCad had given the connection to the ADC priority over its direct connection to ground for setting the boot mode, leaving the pin floating during boot. I soldered on a pulldown to set the boot mode correctly, solving this issue.
The configuration block of the FPGA - basically the only pins that you can't swap!
Now that I could program the FPGA, I needed to learn Verilog to get it to do something useful. In my next post, I'll cover every block inside the FPGA, the things I learned starting to code in Verilog, and the mistakes I made along the way!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer. I'd also like to thank Hackaday and the Hackaday Prize judges for selecting this project as a finalist for the Hackaday Prize. I'm very excited to continue to write about this project as I work to improve it further and hopefully make an awesome open source oscilloscope for everyone on their electronics journey!
Before designing the ADC board, I made a separate adaptor board to take in four front end channels and bring them out to only one connector. This saved board area (and a lot of cost) by cutting down on the ADC board's width, giving the two boards a T-shaped profile.
Each connection to the front ends terminated the unused AUX output of the PGA with 100Ω and set the AUX bias to mid-rail. The main outputs and bias input were routed to the ADC board connector (J1). As on the front end tester, the USB voltage was stepped up to 5.5V by a boost converter (U2) and both voltages were fed to each front end. The SPI and I2C interfaces were bused together (with separate chip selects for SPI) and routed to the connector for the ADC board as well as a debug connector (J2). An I2C GPIO expander (U1) was used to cut down on the number of connections to the rest of the system and a linear regulator (U3) was used to power it.
Live long and prosper, board. The poor thing almost didn't make it, since it was bigger than the hot plate I normally used to reflow solder boards with! But with enough hot air, the job was done.
ADC Board
The HMCAD1511 ADC is an amazing little chip! I initially selected it since it was the cheapest 1 GSPS ADC I could find through normal distributors. Turns out, it has lots of tricks up its sleeve, like digital gain. This chip interleaves eight internal ADCs to sample one, two, or four channels at 1 GSPS, 500 MSPS or 250 MSPS respectively. Taking Nyquist into account (with some wiggle room for filtering), this allows for 350 MHz bandwidth on one channel, 200 MHz on two channels, and 100 MHz on four channels. Sampled data is output on eight DDR LVDS lanes, with a bit clock and a frame clock for synchronization. Since no line operates faster than 500 MHz, this type of output requires no special high speed transceivers, making it easy to interface with a low cost FPGA. All of these factors explain why this ADC is also used in almost every low-cost oscilloscope on the market!
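The bandwidth figures follow from the Nyquist limit of half the sample rate, with some margin left for a realizable anti-aliasing filter:

$$f_\text{usable} < \frac{f_s}{2}: \quad \frac{1\,\text{GSPS}}{2} = 500\,\text{MHz} \rightarrow 350\,\text{MHz}, \qquad \frac{500\,\text{MSPS}}{2} = 250\,\text{MHz} \rightarrow 200\,\text{MHz}$$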
A step-down (buck) switching regulator (U1) was used to provide the 3.3V low-speed digital IO voltage and feed the linear regulator (U2) supplying the more sensitive analog and digital 1.8V rails. To prevent digital noise in the analog rail, two ferrite beads were used on each rail and connected to the output of the regulator at only one point (a star point). For the decoupling caps, I generally try to meet or exceed whatever the evaluation board uses. I made another common mistake here and put a pull-down (R8) on the active low chip select line, instead of a pull-up. This meant the chip was always selected and listening for commands on the SPI bus... doh! Another mistake I made here was using the datasheet-recommended input termination (R9, C26, R10) instead of the 100Ω that the PGA expects, resulting in a weird frequency response when I tested the system as a whole (but I'm jumping too far ahead here!). I added headers (J5, J6) and RF input connectors (J3, J4) to accommodate a clock generation module that I would design later. I chose to do this to avoid designing one circuit for the ~350 MHz that the first prototype would need (at USB 3 Gen 1 speeds) and then another at 1 GHz for the final prototype (at USB 3 Gen 2 speeds). To be able to use an external clock generator in the meantime, I added an SMA input (J7) that fed a balun (XFMR1) to provide the ADC with the differential clock input it needed. This input was DC-biased to mid-rail by the components on the CLK_VCM net.
Loads of pretty squiggles on this one! They're pretty functional too, as they strategically add extra length on some lines so that every signal gets to its destination at the same time, preventing potential timing issues. Aside from that, that big connector on the right connects to the last board in the chain (this project's grand finale?... not even close!), the digital interface. This board can't be tested until the digital interface is built, so fingers crossed it all works!
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!
To evaluate the front end as a whole, I made a tester board for it. This board brought out the analog outputs from the PGA to SMA connectors and included a socket for a Teensy (a microcontroller dev board, similar to an Arduino Nano) to interface with the PGA, DAC and control signals.
Aside from those main tasks, this board also stepped up the USB voltage to 5.5V, which was regulated back down to 5V for the PGA. This was done because USB VBUS voltage can drop as low as 4.75V, which is the minimum operating voltage for the PGA. When a linear regulator is added to "clean up" the voltage rail, the regulator's dropout voltage could cause the output to fall below 4.75V, which the PGA isn't specified for. The common mode voltage for the PGA is also set on this board; for some reason I set this to 1.25V instead of the 0.9V needed for the ADC, but luckily enough I added an option to use the Teensy's DAC to set the voltage instead. In a classic blunder, I reversed the order of the GPIOs on the connector, but this could be flipped back in code.
Did I specify the Teensy just because I could get it in matching OSHpark purple? Yes, yes I did.
With all the boards connected, the tests could begin! As expected, the PGA (with its 900 MHz of bandwidth!) didn't impact the frequency response much. However, I noticed a flaw in my plans for the overall system when testing the values from my spreadsheet.
Although the PGA was supposed to drive the ADC directly with about 1.8V full scale, it wasn't able to drive more than 0.9V at higher gains!
This was, of course, in the datasheet! Tempted to repeat the mistake I made with Rev. 2 of the FFE board, I frantically scrambled to think of a solution. I looked for an ADC driver with enough gain to let the PGA output the 0.7V it was actually designed for while giving the ADC the 1.8V it needed. But having learned my lesson at that point, I took some time to really read over the ADC's datasheet. I found out it had built-in digital gain that could be used for this task! The ADC actually samples at a higher than 8-bit resolution internally, so at low gains (in this case, ~2.5x) this feature had very little cost to overall performance, as the resolution with the gain applied was still greater than 8 bits. This enabled an entirely new architecture for the FFE! For the time being, I elected to continue designing the rest of the system before going back to make the new front end. Next up, the ADC board!
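A rough way to see why ~2.5x of digital gain is nearly free: gain applied after sampling consumes headroom at a rate of log2 of the gain factor,

$$\log_2(2.5) \approx 1.3\ \text{bits}$$

so as long as the internal conversion resolution exceeds 8 + 1.3 ≈ 9.3 bits, the gained-up output still carries a full 8 bits of real information, which lines up with the resolution observation above.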
Thanks for giving this post a read, and feel free to write a comment if anything was unclear or explained poorly, so I can edit and improve the post to make things clearer!