What’s cooking?

Walking into a customer’s lab recently I detected the tell-tale odor of insulation cooking.  One of the engineers I was with said “It always smells like that.  You get used to it.”  It shouldn’t.  Not a good sign.  High speed switches, fast network interfaces, and dense processing nodes all dissipate a lot of power as heat.  FPGAs are notoriously difficult when it comes to estimating power dissipation.  When heat is not adequately removed from a system, the system will fail at an accelerated rate.  It is very important to consider thermal issues up front in any new design and to have a plan for managing them.

Heat can be transferred via three mechanisms: conduction, convection, and radiation.  Conduction occurs in solids, where heat is transferred via the motion of molecules.  Convection is the transfer of heat via a gas or fluid, so by definition it is not possible in a solid.  Radiation is the transfer of heat via electromagnetic waves; the sun warms us via radiation.  In a packaged chip we get conduction from the die to the surface of the package and then convection from the package surface to the surrounding air.  The amount of heat transferred from the case to the air via convection is dependent on airflow.  It is important to note that, for both conduction and convection, the temperature rise is directly proportional to the power dissipation and inversely proportional to the area the heat is flowing through.

In the picture below I’ve depicted a BGA package: the balls are the little circles on the bottom, the chip is the clear rectangle inside the package (the crosshatched area), and a heat sink is stuck on the top.  A model of the thermal impedance is drawn to the right.


Simply put, thermodynamics tells us that:

                        Rja = Rsa + Rcs + Rjc = (Tj - Ta)/Q

Where Tj is the junction temperature and Ta is the ambient temperature, both in degrees Celsius.  Q is the power dissipation in Watts.  Rja is the total thermal impedance from the chip junction to the ambient air.  Rjc is the impedance from the junction to the case, Rcs is the impedance from the case to the heat sink, and Rsa is the impedance from the heat sink surface to the ambient air.

Typically, for telecom type systems, the max Ta is 50 degrees C plus a 10 degree C rise inside the box.  A typical maximum semiconductor Tj is 115 degrees C.  Substituting in these values allows us to calculate the max Rja for different levels of power dissipation.  You can see that as the thermal impedance Rja increases, the allowable power dissipation drops quickly, assuming constant Tj and Ta.
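This power budget is easy to sketch in code.  A minimal sketch follows, using the 115 C junction and 60 C effective ambient figures from the text; the Rja values in the loop are illustrative assumptions, not figures for any particular package:

```python
# Max allowable power dissipation for a given junction-to-ambient
# thermal impedance: Q_max = (Tj_max - Ta) / Rja
def max_power(tj_max_c, ta_c, rja_c_per_w):
    """Return the maximum power (W) that keeps the junction at or below tj_max_c."""
    return (tj_max_c - ta_c) / rja_c_per_w

tj_max = 115.0      # typical max junction temperature, deg C
ta = 50.0 + 10.0    # 50 C ambient plus a 10 C rise inside the box

for rja in (2.0, 4.0, 8.0, 16.0):   # illustrative impedances, C/W
    print(f"Rja = {rja:5.1f} C/W -> max Q = {max_power(tj_max, ta, rja):6.2f} W")
```

Doubling Rja halves the allowable power, which is why a high-impedance package runs out of thermal headroom so quickly.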

I recommend measuring case temperatures right on the chip package and then using the chip manufacturer’s specified Θjc (I use the notation Rjc here) to calculate the junction temperature.  A good DMM with a temperature probe can give reasonably accurate case temperatures in a pinch.  There are also thermal cameras and handheld infrared heat detectors with laser sighting for those with a bigger budget.

As an example, a typical Rjc for a BGA package is 0.18 C/Watt.  Let’s look at the case temperature and see what we can conclude about the junction temperature inside.  We know that:

Rjc = (Tj-Tcase)/Q

0.18 = (Tj - Tcase)/Q

Tj = Tcase + 0.18Q

You can see that the junction temperature will track the case temperature closely (within a degree) for power dissipation of 5W and under.  Above 5W it gets difficult to get the heat out of this package, so the die temperature will rise and the failure rate will increase.  Working through and managing these issues up front in a design cycle will save you from a lot of headaches and a smelly lab.
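The calculation above can be written out directly; the 85 C case temperature below is just an illustrative measurement, not from any real board:

```python
# Junction temperature from a measured case temperature, using the
# manufacturer's Rjc (0.18 C/W for the BGA example in the text).
def junction_temp(tcase_c, power_w, rjc_c_per_w=0.18):
    """Tj = Tcase + Rjc * Q"""
    return tcase_c + rjc_c_per_w * power_w

# At 5 W the junction runs less than a degree above the case:
print(junction_temp(85.0, 5.0))   # 85.9
```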

Rapid Development with HW Building Blocks: System on Chip

First in a series of posts……

When discussing rapid development of complex embedded systems the issue of using third party hardware and software building blocks inevitably comes up.  There are a lot of options and issues when it comes to using embedded system building blocks.  Nobody wants to re-invent the wheel but picking the wrong wheel can overturn your chariot!  In this series of posts we will look at some options and issues for speeding up hardware development using different building blocks.  In a later series we will look at some of the software issues.

Semiconductor vendors offer an amazing range of System-On-Chip (SoC) devices.  Choosing the right SoC can really shorten development time if most of what you need is already in the silicon.  A critical area that is sometimes missed though is an analysis of how your system’s desired performance running your application compares to the SoC’s likely performance running your application in your system.  What parts of an SoC you use and how your traffic flows affects performance.  How your system is physically partitioned and how you interface the SoC to your system can also have dramatic effects on performance.  The thermal environment in your system can impact how much performance you can squeeze out of an SoC.

If you develop an FPGA based SoC for your system you can use some pretty powerful, low-cost, vendor-provided design tools and no-charge FPGA vendor cores (nothing is totally free, my friend!).  These cores can be a big time saver in chip design.  The FPGA vendor has deep knowledge of its cores, tools, and silicon, so it can help with development problems at a deeper level than a third party core vendor.  Also, your interests and the FPGA vendor’s interests are aligned: they want you to get to market as fast as possible so they can make money too.  Be aware that FPGA vendor core licenses often specify you can only use the cores in the vendor’s chips.  This can present problems if you want to port your design later.  It’s best to study the license details and be up-front with the FPGA vendor about what you plan to do.

In our next post we will look at the use of hardware reference designs.

Tradecraft: Source Synchronous Bus Timing Problem

Interfacing to a source synchronous bus with an FPGA can be a bit tricky.  Today’s FPGA tools provide lots of resources to help achieve timing closure inside the chip.  Sometimes though, the FPGA needs a little help external to the chip to meet timing.

I recently ran into a case where an FPGA connected to an Ethernet PHY over a GMII bus.  The GMII bus uses simplex point to point connections for transmit and receive.  The source end of the connection drives the clock.  Both the clock and data for the GMII “Transmit” connection were sourced from the FPGA.  Close inspection of the FPGA timing report revealed that the data could come out 118 picoseconds before the clock came out. It is important that both the setup and hold time requirements of the PHY be met.

It was necessary to ensure the clock arrived at the PHY before the data lines changed (zero hold time) and that the PHY’s setup time requirements were met.   The easiest way to do this was to delay the data lines a small fixed amount by running them as longer nets than the clock line.   This ensures most of the clock period for setup time at the PHY and still maintains a 0 ns hold time.    The exact board propagation delay number is based on the dielectric constant of the circuit board and can be calculated by a circuit board vendor given the details of a board.   My general rule of thumb for delay in a stripline trace is about 165 ps/inch.  Sometimes a little propagation delay is a good thing!
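A rough sketch of the length calculation, using the 165 ps/inch rule of thumb from the text.  The 50 ps margin below is an assumption for illustration; a real board would use the fabricator’s calculated propagation delay for the actual stackup:

```python
# Extra trace length needed to delay the data lines relative to the clock.
# Rule of thumb: ~165 ps of delay per inch of stripline.
PS_PER_INCH = 165.0

def extra_length_inches(skew_ps, margin_ps=50.0):
    """Length of additional data-trace routing needed to absorb skew_ps of
    data-before-clock skew, plus a safety margin (margin_ps is an assumption)."""
    return (skew_ps + margin_ps) / PS_PER_INCH

# 118 ps of data-before-clock skew from the FPGA timing report:
print(f"{extra_length_inches(118.0):.2f} inches")   # ~1.02 inches
```

About an inch of extra serpentine routing on the data nets guarantees the clock edge reaches the PHY before the data changes.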

Why It’s Best to Use a Bootloader for Your Product

Xilinx has a technology called System ACE that allows you to boot Linux directly from a CompactFlash device.  People see this and ask us: “Why use a bootloader in my project?”  The short answer is that, while it’s quick and easy to boot directly from CF for development and demo purposes, for production there are other things you should consider.

Key issues for production hardware include:

  1. How will you set permanent information such as per board Ethernet MAC addresses and serial numbers?  This needs to be quick and simple for a manufacturing environment.  Clumsy scripts and hokey workarounds bring no end of headaches here.
  2. How will you store persistent information, such as static IP addresses, that may change occasionally but must be recalled after a power cycle or reset?
  3. How will you upgrade your operating system?  Do you need to do this in the field?
  4. Do you need diagnostics?  How will whoever manufactures your device test it?  Typically, manufacturing diagnostics are required to support debug of faulty units for any processor based system.  Also, some devices need to run a power-on-self-test (POST) before becoming operational for safety or other reasons.
  5. Will you ever need to change system behavior at boot time?  For example, would it be handy to easily switch between an NFS mounted and a RAM based root file system to debug certain problems?

A well-developed, flexible bootloader, such as U-Boot, provides ways to bring these important features to a product.
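As a sketch of items 1 and 2, U-Boot keeps this kind of per-board information in its environment, which is saved to non-volatile storage.  `setenv`, `saveenv`, `ethaddr`, and `ipaddr` are standard U-Boot commands and variables; the serial-number variable name here is illustrative:

```
# At manufacturing time, set permanent per-board identity:
setenv ethaddr 00:11:22:33:44:55    # this board's MAC address
setenv serial# BB-0001              # 'serial#' naming is illustrative

# Persistent settings such as a static IP survive power cycles:
setenv ipaddr 192.168.1.50

saveenv                             # commit the environment to flash
```

A manufacturing script driving the U-Boot console can do this in seconds per board, with no hokey workarounds.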

The Advantages of FPGA Based SoCs

In previous posts we’ve talked about how an FPGA based SoC in your system allows hardware changes to be made late in the design cycle, or even in the field, thus helping reduce time-to-market and increase time-in-market.  In today’s post we talk about some of the advantages an FPGA based SoC provides the system designer.

Most embedded electronic systems are composed of processors, peripherals, memory, and lots of software.  Today’s system-on-chip (SoC) FPGA based systems provide significant advantages in processing and in how peripherals are used.

Today’s SoC FPGAs offer many options to customize processing so you can optimize performance, power dissipation, and cost in your system.  The first decision usually confronted is the choice between a hard-core versus soft-core processor.   Performance is often the key issue but cost is always a consideration of course.  Most hard-core processors are targeted at higher-end, pricier, FPGAs.  Soft-core processors can be used in less expensive FPGAs as well as the higher performance/higher priced silicon. Hard-cores provide higher performance but are tied to a particular chip or chip family that may be discontinued by its manufacturer some day.  For long-lived systems this can be a huge problem.  For some applications the lower performance of a soft-core processor is preferable because it can be placed in a new chip when an old one goes end-of-life.

Selecting an SoC FPGA based processor is not as simple as just comparing MIPS between two processors though.  SoC FPGAs are capable of implementing a variety of distributed processing architectures.   Processors embedded in FPGAs today can communicate via high performance parallel or serial busses.  Soft-core processor performance can often be enhanced with the use of co-processors to perform critical tasks.  Most packet processing today is done with multiple specialized processors running in parallel for example.  At Black Brook Design, we’ve implemented different types of co-processors from simple state machines to complex micro-programmed sequencers.  The options available to system designers today were only dreamed of ten years ago.

For many signal and image processing applications the ability to perform complex algorithms in high speed hardware is essential.  Such applications can often use a lower performance “control plane processor” to manage the system while running the critical algorithms in FPGA gates at high speed.

Optimizing and customizing the processor’s peripherals is another important area of concern.  System designers can save cost by integrating as many peripherals  as possible in the SoC FPGA.   In most cases it’s relatively easy to change the numbers and types of peripherals as well as modify peripherals to allow things like custom test functions, loopback logic, packet snooping, etc.  And perhaps most important, is the ability to add custom peripherals.

All this capability comes at the cost of increased complexity and risk though.  It’s our old friend the customization/complexity coin from an earlier post.  With so many options in hardware and software it’s easy for development schedules to go “off the rails”.  A soft appliance based solution solves this problem by providing an off-the-shelf  FPGA bitstream with the processor(s), many common peripherals, and the software pieces needed to  boot and run Linux– all integrated together.  Whether it’s based on a hard or soft core, getting a control plane processor to boot up and communicate is a great way to jump start development.  Your development resources can be focused on the real value add in your system– application development– instead of working to solve low level hardware and software issues.

Donald Rumsfeld On The Need For FPGA Based Soft Systems…

“There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know. “  -Donald Rumsfeld

OK, I took a bit of liberty here.  He wasn’t really talking about soft systems but his quote is perfectly applicable to any complex undertaking– such as designing an electronic system.  I read an article recently by Xilinx CEO Moshe Gavrielov in Electronics Weekly and it made me think of Don Rumsfeld’s famous quote.  Moshe wrote about how the roll-out of 28 nm FPGAs in 2012 will accelerate the trend toward FPGA based soft systems.   His main point is that economic and technological trends are leading to “the programmable imperative”.   To me though, what matters most  is that chip-level  integration plus programmability means we can put a system on a chip and we can change it when needed.  It’s really a new level of flexibility in the product development process.  That’s what got me thinking about Don Rumsfeld’s quote.  When you hit an unknown unknown during product development the flexibility of a soft system can save the day.

The Lessons US Robotics Can Teach Us

US Robotics’ rise in the modem market during the 1990s provides an interesting case study in the benefits of building your product around a programmable technology.  At the time of the company’s founding the major modem providers developed custom chips to perform the signal processing required in the latest version of the ITU modem standard.  The standard changed periodically to increase the baud rate as new signal processing algorithms were developed.  Changes to the standard kicked off a race to deliver new custom chips to accommodate the changes.  This was an expensive and time consuming process.

The folks at US Robotics took a different road.  They built their modem around a relatively inexpensive DSP chip from Texas Instruments.   They programmed the modem algorithm in firmware.  This freed them from the costs of developing, testing,  and stocking custom modem chips and allowed them to quickly respond to changes in the modem standard. It proved to be a brilliant idea.

By the mid-1990s it was commonly expected that US Robotics would be first to market each time the modem standard changed.  Being first to market with a new higher speed modem was a huge advantage because everyone wanted higher speed.

Because they used a programmable technology in their product, US Robotics could offer their customers a firmware upgrade to the higher speed as opposed to swapping out hardware.  This proved to be very important to their largest customer, AOL.  AOL bought modems by the truck-load and spread them all over the US so customers could dial in to their service.  To AOL, changing hardware was a logistical nightmare but changing a programming file was relatively easy.  US Robotics assured AOL there would be no wholesale hardware replacement when the standard changed, and they locked in the business.  Being first to market and having the longest time in the market are significant business advantages.  US Robotics exploited these advantages brilliantly.

Today we have FPGA-based soft systems on a chip that provide businesses these same advantages.  What’s even better is that they offer the advantage of changing hardware by re-programming the chip.  Hardware has become softer.  Standards still change, customer requirements change, and engineers still create bugs.  Basing your product on a soft system is the best way to deal with the inevitable changes that will arise and gain a strategic business advantage.

Customization and Complexity

In a world where the latest technology is available to everyone how do you differentiate your product? The answer is customization. Provide the features and performance that your customers need faster and at a better price than your competitors and you will win.  Customization is the promise of today’s FPGA technologies.

Hardware used to be fixed by the chips chosen for your circuit boards.  Changing the hardware was expensive and time consuming.  As chip technology advanced, integration reduced the number of items on the circuit board from a microprocessor with peripherals on a board to a microprocessor plus some fixed peripherals on a chip.  Custom logic was implemented either in an ASIC or a companion FPGA chip.  Analog circuitry was often relegated to a separate circuit board.  Today we have high capacity FPGAs that can be configured to contain one or more microprocessors, your choice of peripherals, and custom logic for your application.  Not only are the numbers and types of peripherals customizable but the peripheral cores themselves can be modified in some cases.  You can mix and match hard and soft processors to meet your performance needs.  System designers today can also take advantage of chips such as the SmartFusion line from Microsemi and the PSoC family from Cypress Semiconductor that contain a microprocessor, your choice of peripherals, custom logic, and your choice of analog circuitry.  The trend is clear: the number of items on the bill of materials is decreasing due to advances in semiconductor process technology.  This trend reduces product cost, simplifies manufacturing, and reduces product size (real estate reduction) as well as inventory costs (fewer chips to stock).

So what’s the problem?  Things sound great: you can customize one or more of today’s chips to provide exactly what you need for your application and reduce costs.  The problem is complexity.  Assembling and configuring various blocks into a complete chip design opens up a myriad of low-level details such as bus protocols, bus widths, clocking issues, reset issues, software/firmware issues, interrupts, and timing constraints.  Some have dubbed this the blank slate problem.  Designing a system on a chip from scratch is not for the faint of heart.

Programmable logic vendors have invested huge amounts of money in making their tools easier to use, with limited success.  The problem is a difficult one.  How do you make the customization capability accessible but keep the design process simple?  Customization and complexity are two sides of the same coin.  T.J. Rodgers from Cypress Semiconductor once described the process of creating his PSoC product family as “solving problems you didn’t know existed for people you haven’t met yet”.  Imagine the difficulty of creating the tools to do that.  Finding bugs in an FPGA vendor’s tool is annoying, but I try to remember the customization/complexity coin and take a moment to reflect on the big picture.  It helps.  We have the opportunity to create amazing custom technology today thanks to the efforts of these folks.

T.J. Rodgers and the Rise of System-On-Chip

Cypress Semiconductor announced that they have sold over a billion of their PSoC (Programmable System-on-Chip) family devices since 2002.  When you’ve sold a billion devices it’s not a niche product anymore.  Kudos to T.J. Rodgers and the folks at Cypress Semiconductor for delivering a great product.  We’ve been using their stuff for a wireless temperature sensing application in our lab and it works great.

Cypress’ PSoC family is one example of how programmable hardware devices are changing the way embedded systems are designed.  There are a variety of semiconductor vendors offering devices that illustrate this trend, each targeting different parts of the embedded electronics market.  Offerings range from lower-end processors combined with a relatively small number of digital logic cells and analog functionality (PSoC) to devices with huge numbers of digital logic cells combined with high-end processors capable of running Linux (Xilinx’s Zynq).

The value proposition of programmable system on a chip devices encompasses reducing a product’s bill of materials through greater integration and enabling product differentiation through cost effective customization.  For many years products in the embedded industry used ASICs to reduce component count and allow customization.  As ASIC process nodes have shrunk, the costs of developing an ASIC have skyrocketed and the number of new ASIC design starts has fallen steadily.  The reason is economics.  ASIC design at the leading edge of semiconductor process technology is very expensive.  ASIC development costs can run upwards of $40 million.  If we use a typical business model where around 20% of revenue is spent on R&D, a chip costing $40 million must bring in around $200 million in revenue.  If we are lucky enough to own 50% of the market for our device, that means the total market must be worth $400 million.  Outside of products like PCs, cell phones, and game consoles there aren’t a lot of markets that big.  Contrast that to a programmable chip, which is expensive to develop, but once developed can be programmed to serve many different markets.  The economics mean that programmable technologies will continue to see increasing adoption while fixed function devices, such as ASICs and Application Specific Standard Products (ASSPs), will see less.  In our next post we’ll talk a bit about product customization and managing complexity.
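The back-of-the-envelope market math above can be written out as a sketch; the 20% R&D fraction and 50% market share are the assumptions stated in the text:

```python
# Minimum total market size needed to justify an ASIC development cost,
# given the share of revenue spent on R&D and our expected market share.
def required_market(dev_cost_m, rd_percent=20, market_share_percent=50):
    """All dollar figures in $M.  Percentages as whole numbers."""
    revenue_needed = dev_cost_m * 100.0 / rd_percent        # $40M / 20% = $200M
    return revenue_needed * 100.0 / market_share_percent    # $200M / 50% = $400M

print(required_market(40.0))   # 400.0
```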