Hot Chips talks all about chip stacking, good and bad

Hot Chips 2012: 2.5D, 3D, and 3D+ are all not quite here yet

Sep 5, 2012 by Charlie Demerjian

Hot Chips 24 had their usual tutorials on the first day, and this year the second topic, chip stacking, was of particular interest to SemiAccurate. There were five companies, AMD, Amkor, Qualcomm, UMC, and Xilinx giving seven presentations which we will summarize and analyze.

The idea behind chip stacking is an easy one, take multiple small dies that are easy to make and combine them in to one unit. You can stack chips from different and incompatible processes, have massive die to die bandwidth, save power, save cost, and do things that are flat out not possible with a large monolithic die. In short, it may not be silicon nirvana, but it is as close as we will come in the near future.

Unfortunately, the problems are as severe as the benefits are positive, if not worse. In fact, the most important problem, we can’t actually make a 3D package with high wattage silicon, is quite the show stopper. From there, the near complete lack of design tools, nonexistent standards, and no industry knowledge base are nearly as crushing. That said, innovation and progress are moving things along, and last year, Xilinx came out with the first real high wattage 2.5D device, a four die FPGA on an interposer. This might be the reason the company had three talks at Hot Chips, they are doing it, others are talking.

Types of stacks from Qualcomm’s presentations

One of the first problems that you run in to when talking about stacking is simple, terminology. In recent months, that has become much less of an issue, interposers are now considered 2.5D, die on die is generally thought of as 3D. As you can see from the picture above, it can get complex from there with multiple types in a single package. In the end, those two definitions are the ones that really matter.

So, why do we need stacking? What does it do that a single monolithic die can’t do better? There are two main trends coming about because of Moore’s Law, I/O density and power use. Couple that with the economic reality that bigger chips are harder to make, and you are headed toward a very bleak future. Transistor density may double every 18 months, but we are at the point that they don’t actually do anything useful. Why? Lets take a look.

The first problem is I/O density, commonly referred to as pins or bumps, has to increase in proportion to the transistor count. The latter doubles every 18 months, the bumps go up at a much slower pace. Bumps are tricky to design, they are electrical contacts, thermal conduits, and mechanical attachment points, all in a single chunk of solder far thinner than a human hair. If you engineer them right, all is well. If you don’t, you have catastrophic failures, just ask Nvidia.

I/Os per 1000 logic cells for Xilinx FPGAs

As you can see from the above slide presented at the Xilinx talk, the I/O count per 1000 logic cells has gone from 364:1000 to 2:1000 over the last 15 years. While each I/O may have gotten faster during that time, so has each logic cell. Modern chips, be they DSPs, SoCs, CPUs, or GPUs, are starved for I/O.

Even if you can physically fit that many bumps on the bottom of the die, you have to route them all through the package, the board, and to things around the socket. The longer you make the traces, the slower they go, the more noise they generate, and the more power they consume. If you can make things short, you win. More traces to route means more length, board layers, cost, and headaches.

That brings us to the next problem, the power consumed by the I/Os obviously goes up with their count, all other things being equal. If you can slap 10x the memory pins down, you will use 10x the I/O power for memory. The power of each I/O is also proportional to the RC (Resistance Capacitance) of the link and it’s speed, so faster and more complex routing equals more power. After a point, more speed simply burns more power than adding another pin.

Then the is the problem of simple chip building economics coming to bear. Yield goes down roughly in proportion to the square of the die area, so a 100mm^2 die will have 4x lower yields than a 50mm^2 die. There are lots of mitigating factors, but the basic idea that bigger is more than linearly worse than smaller is unavoidable. It isn’t hard to see how four smaller dies could make more sense than one bigger die.S|A

Note: This is Part 1 of 2.

Bio
Latest Posts

Charlie Demerjian

Roving engine of chaos and snide remarks at SemiAccurate

Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate.com. SemiAccurate.com is a technology news site; addressing hardware design, software selection, customization, securing and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also available through Guidepoint and Mosaic. FullyAccurate

Latest posts by Charlie Demerjian (see all)

Intel Bumps Xeon Memory Speeds - Jul 20, 2026
Qualcomm’s 2026 Investor Day Has Hidden Goodies - Jul 15, 2026
Phison Phudges The Numbers… In A Good Way - Jun 24, 2026
Who Are The Qualcomm Enterprise Customers? - Jun 23, 2026
Innodisk Adds 10GbE To Useless M.2 Slots - Jun 19, 2026

Thank you, Subscribers!

Thank you to our Subscribers, past and present.

You are appreciated.

You are what keeps SemiAccurate going, what allows us to maintain our journalism, what keeps us ad-free, what allows us to tell it like it is, it is still just you. You, the reader and subscriber, we thank you.

If you want to know more about subscriptions, both free and paid, the information can be found here.

For more on our track record of leading edge journalism see Fully Accurate.
Our Writers

Charlie Demerjian is the founder of Stone Arch Networking Services and S|A.

SemiAccurate.com is a technology news site; addressing hardware design, software selection, customization, security and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture.

As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends.

Thomas Ryan is a GIS Programmer and freelance technology writer from Seattle, WA. You can find his work on SemiAccurate and PCWorld.

Semiaccurate

On Target Technology News

Hot Article AMD to differentiate cores

Hot Article Intel foundry customer bails out

Hot Article Coffee Lake is going to impact Intel’s margins

Hot Article SemiAccurate digs up Intel Coffee Lake specs

Hot Chips talks all about chip stacking, good and bad

Hot Chips 2012: 2.5D, 3D, and 3D+ are all not quite here yet

Charlie Demerjian

Latest posts by Charlie Demerjian (see all)

Hot Article AMD to differentiate cores

Hot Article Intel foundry customer bails out

Hot Article Coffee Lake is going to impact Intel’s margins

Hot Article SemiAccurate digs up Intel Coffee Lake specs

Hot Chips talks all about chip stacking, good and bad

Hot Chips 2012: 2.5D, 3D, and 3D+ are all not quite here yet

Charlie Demerjian

Latest posts by Charlie Demerjian (see all)

Share this: