The first version of my open-source OpenCV-compatible FPGA Stereo Correspondence Core is now available!
(have a look at my previous FPGA Stereo Vision Project post for some more context)
It’s written purely in synthesizable Verilog, and uses device-agnostic inference for all FPGA primitives (though the current implementation is more optimized for Xilinx devices). I’m releasing it under a standard 3-clause BSD license.
The design is heavily pipelined. Under realistic conditions (in a highly-utilized, slowest-speed-grade part without any floor-planning), it can run at around 150 MHz in Spartan-3E and Spartan-6 parts, and at around 300 MHz in Virtex-6 parts. Much higher speeds (50% or more) are possible under unrealistic (ideal) conditions.
The design is fully parameterized and highly scalable; some example implementations include:
The core has been verified in simulation using Verilator with SystemC testbenches. Post-synthesis results (from Xilinx’s XST tool) have been verified using a simplified Verilog testbench and Xilinx’s own ISim simulator.
Now, before everyone runs off and tries to build their own open-source Kinect, I must stress that this isn’t a complete solution just yet; here’s a block diagram of what I have implemented:
Stereo Vision Core block diagram
If we now refer back to the high-level block diagram that I presented before:
High-level system logic diagram
..we can see that this core implements all of the “Stereo Correspondence” block, some/all of the “Post-Processing” block and (had I actually included it on the original diagram) the “Pre-Filtering” block. While “Image Rectification” is the only significant missing image pipeline component, there’s still a lot of other system level infrastructure to develop (external interfaces, buses, etc.) before I can call the project “complete.”
That being said, the correspondence core easily represents the most critical, most resource-intensive and highest-performance component of the entire system. Completing it is a major milestone in the project.
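For anyone unfamiliar with what a stereo correspondence stage actually computes, here is a minimal software sketch. It assumes a sum-of-absolute-differences (SAD) block matcher working along a single scanline, in the spirit of OpenCV's block-matching approach. The function name, window size and disparity range are illustrative only and are not taken from the core itself, which is a pipelined hardware design rather than a loop like this.

```python
def sad_disparity(left, right, window=3, max_disp=4):
    """Estimate per-pixel disparity for one rectified scanline pair.

    Hypothetical helper for illustration: `left` and `right` are lists of
    pixel intensities. For each pixel in the left line, the window around it
    is compared against windows in the right line shifted by 0..max_disp
    pixels, and the shift with the lowest SAD cost wins.
    """
    half = window // 2
    width = len(left)
    disparities = [0] * width  # border pixels are left at 0

    # Only evaluate pixels where every candidate window fits in both lines.
    for x in range(half + max_disp, width - half):
        best_cost, best_d = None, 0
        for d in range(max_disp + 1):
            cost = sum(
                abs(left[x + k] - right[x + k - d])
                for k in range(-half, half + 1)
            )
            # Strict comparison keeps the smallest disparity on ties.
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities
```

A hardware implementation would evaluate the candidate disparities in parallel rather than in nested loops, but the cost function and the "pick the minimum" step are the same idea.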
As alluded to in a few of my other posts, I’m working on developing an open-source FPGA-accelerated vision platform. This post is a detailed overview of the project’s architecture and general development methodology. Future (and past) posts will elaborate on specific pieces of the project, as they’re implemented.
Stereo vision is the main objective for the project – but once the general framework is in place, an obvious next step would be the offloading of additional vision algorithms onto an FPGA.
(Update: 2011-06-10: I’ve now released the first version of my Open-source FPGA Stereo Vision core!)
This post is rather long, even by my standards. Here’s an index:
Stereo vision system diagram
Above is an approximate diagram of how the initial implementation will be organized. Key hardware components include:
- Multiple image sensors (provided by my MT9V032 LVDS camera boards)
- FPGA with PCIe interface and external high-speed memory (provided by a Xilinx SP605 Spartan-6 FPGA development board)
- Interface board to connect the image sensors to the FPGA board (provided by my FMC-LPC to SATA adapter board).
- Host computer (a generic PCIe-capable x86/AMD64 PC running Linux)
Posted in FPGAs, Robots, Technical
Tagged FPGA, Linux, MT9V032, OpenCV, PCIe, ROS, Spartan-6, Verilog, vision, Xilinx
Eventually, when my FPGA stereo-vision project nears its terminus, I’m going to want to produce a refined sensor board that combines the image sensors and FPGA onto a single board. In preparation for that, this board is a test vehicle to investigate what it takes to design and assemble a compact PCB with multiple BGA packages using only tools and services that are within the reach of a well-equipped hobbyist.
Spartan-6 BGA test board - top components
Major features include:
- Spartan-6 FPGA in FT256 package (up to an XC6SLX25)
- 64MB 800Mbps x8 DDR2 SDRAM
- 18 high-speed LVDS pairs for FPGA expansion (across 9 “SATA” connectors)
- 16 low-speed 3.3V signals for FPGA expansion (on a Gadget Factory “Wing” style header)
- 100 MHz oscillator
- ATmega32U2 USB microcontroller (responsible for configuring the FPGA via JTAG)
- 2MB SPI flash for non-volatile bitstream and data storage
- Single 5V supply (onboard regulators for 3.3V/2.5V/1.8V/1.2V rails)
- Single JTAG port selectable between AVR and FPGA
- 85mm x 50mm 4-layer PCB (3.35″ x 1.97″)
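To put the memory figures in the list above into perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: the helper names are made up for illustration, "64MB" is assumed to mean 64 MiB, and the frame size assumes an 8-bit 640x480 (VGA) image, which is an assumption rather than a figure from the board design. The bandwidth is the theoretical peak, before any refresh or bus-turnaround overhead.

```python
def ddr_peak_bandwidth_mb_s(rate_mbps_per_pin, bus_width_bits):
    """Theoretical peak bandwidth in MB/s: per-pin rate times pins, over 8 bits per byte."""
    return rate_mbps_per_pin * bus_width_bits / 8


def frames_that_fit(capacity_bytes, width, height, bytes_per_pixel=1):
    """Number of whole frames that fit in the given capacity (assumed frame format)."""
    return capacity_bytes // (width * height * bytes_per_pixel)


# 800 Mbps per pin on a x8 device
peak = ddr_peak_bandwidth_mb_s(800, 8)

# 64 MiB of storage, assuming 8-bit VGA frames
frames = frames_that_fit(64 * 1024 * 1024, 640, 480)

print(f"Peak bandwidth: {peak:.0f} MB/s, frame capacity: {frames} frames")
```

Real sustained throughput will be noticeably lower than the peak figure, so treat these numbers as upper bounds.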
Looking at the top two items on that list – an FPGA in a 256-ball 1.0mm BGA package, and a memory device in a 60-ball 0.8mm BGA package – one can easily imagine that assembly is going to be the trickiest part of this project.. but this post isn’t about the assembly of the board, seeing as I’ve only just sent it off to be fabricated (this time by Laen’s 4-layer PCB service). I’ll make a follow-up post once the board is back and assembled.
Instead, this post is entirely about the design and layout of the board.
Reflow soldering is not new. The electronics industry has been using it forever. Hobbyists have been flocking to it in droves. Many use toaster ovens. A growing contingent use skillets. A few do it open-loop. Some use integrated PID controllers. A handful reflow both sides. Quite a lot use stencils. Others forgo that luxury and manually apply paste. Just a few reflow BGAs (and the exceptionally skilled reflow BGAs by hand).
This is my addition:
Reflow Toaster Oven
Posted in Microcontrollers, Technical, Tools
Tagged AVR, GUI, MAX6675, microcontroller, Python, Qt, reflow oven, soldering, solid-state relay, thermocouple
Another piece of my ongoing FPGA stereo-vision project. This board is, as the name suggests, a breakout board for Aptina’s excellent MT9V032 1/3″ VGA image sensor.
MT9V032 camera board - assembled
The board’s main purpose in life is to connect the LVDS output of the MT9V032 sensor to my FMC-LPC to SATA adapter board, which would then route the LVDS data into one of Xilinx’s Spartan-6 FPGA development boards. Multiple camera boards would be connected to support stereo vision.
There’s more to the board than simple signal breakout, however.
Here’s a board I designed back in 2008. It’s an excessively feature-packed brushless DC motor (BLDC) controller.
Brushless motor controller
..And all of that fits onto a single low-spec (BatchPCB compatible) 2.5×1.5″ 2-layer PCB.
Posted in Microcontrollers, PCBs, Robots, Technical
Tagged ARM, BLDC, EAGLE, H-bridge, microcontroller, MOSFET, motor controller, PCB
I recently bought a shiny new Xilinx Spartan-6 FPGA SP605 Evaluation Kit:
FMC-LPC to SATA adapter board - installed on Xilinx SP605 board
It’s an excellent board (aside from the ridiculous ~10W idle power consumption and the corresponding supply heating/temperature) – PCI-Express, gigabit ethernet, DVI, 1.6 GB/s 128 MB DDR3, and a large (43,661 equivalent logic cells and 2,088 Kbits of block RAM) XC6SLX45T Spartan-6 FPGA? Yes, please.
At first glance, it has one major failing: a complete lack of user-friendly I/O expansion. Xilinx has put all of their eggs in one high-density, surface-mount basket: the board has ~70 FPGA I/Os brought out to a single high-speed FMC connector.
FMC-LPC to SATA adapter board - bottom
Not the most friendly looking footprint, right? While it’s no 100-mil pin header, it’s remarkably easy to work with – even on a simple 2-layer PCB.
Yes, it’s actually possible! – in Verilog and VHDL, even.
I’m a big fan of inference, especially as it applies to writing synthesizable Verilog code for FPGAs. Properly coded, a module that infers technology-dependent blocks (e.g. block RAMs) should: be portable between devices from a particular vendor (e.g. Spartan 3E to Virtex 6), be portable between devices from different vendors (e.g. Spartan 6 to Cyclone III), and even be portable to vendor-independent environments (e.g. simulation in Icarus Verilog).
The trick is that little “properly coded” clause. Figuring out exactly the right sort of Verilog required to get a particular tool to infer the block you want isn’t always straightforward. Figuring out exactly the right sort of Verilog to get multiple tools to infer the block you want can be even trickier. Which is, perhaps, a little bit silly – considering that the whole point behind this little exercise is to be able to write code that isn’t tied to any particular tool, device or vendor!
Using current synthesis tools from Xilinx (ISE WebPack 12.2) and Altera (Quartus II Web Edition 10.0 SP1), it’s now practical to write synthesizable device and vendor-independent Verilog code (or VHDL, if that’s your thing) that properly infers true dual-port (TDP), dual-clock block RAMs in each vendor’s respective FPGAs. There are, of course, a variety of limitations and caveats that come along with that statement.
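As a reference for the behaviour being inferred, here is a small behavioural model of a true dual-port RAM. It is written in Python rather than Verilog and is only a sketch under stated assumptions: the class and method names are hypothetical, each call to `clock()` stands in for one rising edge on that port's own clock, and a port that writes is assumed to see the new data on its output (write-first). Real devices differ in their read-during-write modes, and simultaneous accesses to the same address from both ports are one of the caveats this model deliberately ignores.

```python
class TrueDualPortRam:
    """Behavioural sketch of a true dual-port RAM (hypothetical model).

    Both ports share one memory array; each port can read or write on its
    own clock edge, modelled here as independent calls to clock().
    """

    def __init__(self, addr_bits, data_bits):
        self.mask = (1 << data_bits) - 1
        self.mem = [0] * (1 << addr_bits)
        self.dout = {"a": 0, "b": 0}  # registered output of each port

    def clock(self, port, addr, din=0, we=False):
        """Apply one clock edge to `port` ('a' or 'b') and return its output."""
        if we:
            # Truncate to the data width, as a hardware bus would.
            self.mem[addr] = din & self.mask
        # Assumed write-first: the output reflects the value just written.
        self.dout[port] = self.mem[addr]
        return self.dout[port]


ram = TrueDualPortRam(addr_bits=4, data_bits=8)
ram.clock("a", addr=3, din=0x1AB, we=True)  # port A writes (truncated to 0xAB)
value = ram.clock("b", addr=3)              # port B reads the same location
```

A model like this is handy as a golden reference in a testbench, since the inferred RAM should match it wherever the two ports do not collide.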
It has now been just over 6 years since I launched the last incarnation of my website. Finally, it has now been supplanted by this – a site that can safely be regarded as superior in virtually every way (if for no other reason than its tentative embrace of the buzzword-laden Web 2.0 – yes, it is now possible to comment on my content. How novel).
Many of my past projects are now well-documented here (of particular interest: Eddie, a Mars-rover inspired autonomous robot; and Elysium, a presentable solid-state Tesla Coil). Some others are only glorified photo-galleries at the moment; these will be expanded upon in due time.
Speaking of photos, I’ve picked up another hobby in the past year: photography. As a result, expect many more and better photos of my past (those that I still possess) and ongoing projects. These will make their way to my Flickr photostream before venturing here (and often, due to sheer volume, in lieu of ever appearing here).
Anyone who is familiar with my previous attempts at websites will be skeptical of my ability to make updates in a timely fashion (that is to say, more frequently than once per leap-year). I’m hoping that this blog format will encourage me to make more frequent postings about whatever it is that I’m currently working on (be it a project, some photography tidbit, or even just a hike I’ve been on). We shall see.
In the meantime, enjoy what I’ve posted thus far: