Save power by aggregating I/O

Intel bunches up NICs

WITH EFFICIENCY GAINS at the CPU level becoming harder and harder to find, a lot of work has turned to the platform to squeeze out a few more watts here and there. During Research At Intel Day (R@ID), Intel was showing how to bunch I/Os intelligently for large system power savings.

The idea is simple: take the CPU power-saving strategy of “hurry up and go to sleep” and apply it to NICs, either wired or wireless. On a CPU, the idea is to do as much work as possible as fast as possible, then go to sleep for as long as possible.

The more work you can do in a shorter time period, the longer you can sleep. Sleep saves power, so you can theoretically save net power by burning a bit more to get the work done, then sleeping longer.
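To put rough numbers on that trade, here is a minimal sketch of the arithmetic. The function name `energy_joules` and every wattage in it are made up for illustration; none of these are figures from Intel's demo.

```python
def energy_joules(active_watts, active_seconds, sleep_watts, window_seconds):
    """Energy over a fixed window: active for a while, asleep for the rest."""
    if active_seconds > window_seconds:
        raise ValueError("work does not fit in the window")
    sleep_seconds = window_seconds - active_seconds
    return active_watts * active_seconds + sleep_watts * sleep_seconds


# Illustrative numbers only (assumptions, not measurements).
# Slow and steady: 10 W for the whole second, no time left to sleep.
slow = energy_joules(active_watts=10.0, active_seconds=1.0,
                     sleep_watts=1.0, window_seconds=1.0)
# Hurry up and sleep: 15 W for half a second, then 1 W asleep.
fast = energy_joules(active_watts=15.0, active_seconds=0.5,
                     sleep_watts=1.0, window_seconds=1.0)

print(slow, fast)  # 10.0 8.0
```

With these assumed numbers, burning 50% more power while active still comes out ahead, because the work finishes in half the time and the rest of the window is spent asleep.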

On a network card, the idea is the same, but the implementation is a little trickier. The NIC is only one part of the equation; there is another end of the wire too. This isn’t to say it’s impossible or that the designers have no tricks, just that there are more barriers to work around.

Batching up I/O

It looks like this

Intel’s implementation is simple: you gather up as many packets as you can, then fire them all off at once. As soon as you are done, the NIC goes to sleep for as long as it can, saving power. This saves a little power on the NIC side, but it is not the major gain.
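The gather-then-fire behavior described above might look something like the sketch below. This is not Intel's code: the `PacketBatcher` class, its `max_batch` and `max_delay_ms` knobs, and the rule of flushing when either limit is hit are all assumptions made for illustration.

```python
class PacketBatcher:
    """Hypothetical batcher: hold packets, release them as one burst."""

    def __init__(self, max_batch, max_delay_ms):
        self.max_batch = max_batch        # flush once this many are queued
        self.max_delay_ms = max_delay_ms  # or once the oldest has waited this long
        self.queue = []
        self.oldest_ms = None             # arrival time of the oldest queued packet

    def add(self, packet, now_ms):
        """Queue a packet; return a burst if a limit was reached, else []."""
        if not self.queue:
            self.oldest_ms = now_ms
        self.queue.append(packet)
        return self.poll(now_ms)

    def poll(self, now_ms):
        """Return the queued packets as a burst if it is time to flush."""
        if not self.queue:
            return []
        full = len(self.queue) >= self.max_batch
        expired = now_ms - self.oldest_ms >= self.max_delay_ms
        if full or expired:
            burst, self.queue = self.queue, []
            self.oldest_ms = None
            return burst
        return []


batcher = PacketBatcher(max_batch=3, max_delay_ms=5)
bursts = [
    batcher.add("p1", now_ms=0),
    batcher.add("p2", now_ms=1),
    batcher.add("p3", now_ms=2),  # third packet fills the batch
    batcher.add("p4", now_ms=3),
    batcher.poll(now_ms=8),       # p4 has waited 5 ms, so it goes out alone
]
print(bursts)  # [[], [], ['p1', 'p2', 'p3'], [], ['p4']]
```

The delay cap is what keeps a lone packet from sitting in the queue forever. Between bursts there is nothing to do, which is when the NIC and everything downstream of it could sleep.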

Normally, the NIC sends packets to the system as they arrive, and the CPU has to wake up, process them, and go back to sleep. Since the CPU needs time to go to sleep and then wake up, regular I/O results in a sawtooth power use chart.

Normal loading

Blue is high power, green low, red is highest

As you can see above, when a system is getting packets regularly, it spends most of its time in a high-power state, burning power. The blue line is a fairly high power mode, the green is a low power sleep mode. The PC doing this demo is playing a streaming video in a loop, fairly typical laptop behavior.

Batched I/O loading

Bunching I/Os means more sleep

When the I/Os are bunched up, the resultant mix of sleep vs awake is almost flipped: the CPU spends about 70% of its time in the green sleep state. The blue and red power states make up about 20% of the power used with the I/Os bunched vs 75% or so without any intelligence on the I/O side.
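A toy model shows why bunching flips the mix. It assumes that every wakeup costs a fixed transition time on top of the actual packet processing. The function `sleep_fraction` and all the numbers fed to it are hypothetical, chosen only to show the shape of the effect, and they are not taken from the charts.

```python
def sleep_fraction(packets, batch_size, window_ms, per_packet_ms, transition_ms):
    """Fraction of the window the CPU can sleep, under a toy model.

    Assumes one wakeup per batch, a fixed sleep/wake transition cost per
    wakeup, and a fixed processing cost per packet.
    """
    wakeups = -(-packets // batch_size)  # ceiling division
    awake_ms = wakeups * transition_ms + packets * per_packet_ms
    return max(0.0, 1.0 - awake_ms / window_ms)


# Illustrative assumptions: 100 packets per second, 1 ms of work each,
# 6 ms lost to going to sleep and waking up again per wakeup.
unbatched = sleep_fraction(packets=100, batch_size=1, window_ms=1000,
                           per_packet_ms=1.0, transition_ms=6.0)
batched = sleep_fraction(packets=100, batch_size=10, window_ms=1000,
                         per_packet_ms=1.0, transition_ms=6.0)

print(round(unbatched, 2), round(batched, 2))  # 0.3 0.84
```

The work done is identical in both cases. The only thing that changes is how many times the transition cost is paid, and in this model that alone takes the CPU from mostly awake to mostly asleep.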

The current implementation is all in software: the NIC simply waits as long as it can to wake up and send data, and when it does, it blasts out as much as it can as fast as it can. This ‘bursty’ behavior allows it to sleep more, and the CPU can do the same.

This can be accomplished at the NIC level simply by watching timing and packet sources. You don’t need to peek into the packets, just track their frequency, and make a few assumptions based on how often they come in. For now, this can all be done in software or firmware running on the NIC itself.
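One way that kind of tracking could work is sketched below. The article does not say how Intel actually does it, so the `ArrivalTracker` class, the moving average, and the hold-time rule are all guesses made for illustration. Note that it only looks at where a packet came from and when, never at its contents.

```python
class ArrivalTracker:
    """Hypothetical per-source tracker of how often packets arrive."""

    def __init__(self, alpha=0.25, max_hold_ms=5.0):
        self.alpha = alpha              # weight given to the newest gap
        self.max_hold_ms = max_hold_ms  # never delay a packet longer than this
        self.last_seen_ms = {}
        self.avg_gap_ms = {}

    def observe(self, source, now_ms):
        """Record an arrival and update the source's average gap."""
        last = self.last_seen_ms.get(source)
        self.last_seen_ms[source] = now_ms
        if last is None:
            return  # first packet from this source, no gap to measure yet
        gap = now_ms - last
        prev = self.avg_gap_ms.get(source)
        if prev is None:
            self.avg_gap_ms[source] = gap
        else:
            # Exponentially weighted moving average of the gaps.
            self.avg_gap_ms[source] = (1 - self.alpha) * prev + self.alpha * gap

    def hold_ms(self, source, packets_wanted=4):
        """Guess how long to wait to collect a few more packets."""
        avg = self.avg_gap_ms.get(source)
        if avg is None:
            return 0.0  # nothing known about this source, so don't delay
        return min(self.max_hold_ms, avg * packets_wanted)


tracker = ArrivalTracker(alpha=0.5, max_hold_ms=5.0)
for t in (0, 2, 4):                # a stream sending a packet every 2 ms
    tracker.observe("video", t)

print(tracker.hold_ms("video", packets_wanted=2))  # 4.0
print(tracker.hold_ms("video", packets_wanted=4))  # 5.0, capped
print(tracker.hold_ms("unknown"))                  # 0.0
```

A steady stream like the looping video in the demo would settle to a predictable gap, which is exactly the case where holding packets for a few milliseconds is a safe bet.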

Expanding on this idea, you could modify the hardware itself to be more amenable to the whole concept of bursting transmissions. Things like deeper buffers and packet classification hardware could lead to even more power saved downstream for a small number of additional transistors used on the NIC. Hardware changes can take it from good to better.

For now, it is pretty clear that the concept has merit. Delaying a packet by milliseconds will be unnoticeable to the end user, but may result in some fairly substantial power savings. Given that the additional cost is very low or zero, we can see this technology spreading fast. Hopefully, the next 802.11 variant will have this or a similar technology as part of the spec when it arrives.

Charlie Demerjian

Roving engine of chaos and snide remarks at SemiAccurate
Charlie Demerjian is the founder of Stone Arch Networking Services and SemiAccurate.com. SemiAccurate.com is a technology news site addressing hardware design, software selection, customization, securing, and maintenance, with over one million views per month. He is a technologist and analyst specializing in semiconductors, system and network architecture. As head writer of SemiAccurate.com, he regularly advises writers, analysts, and industry executives on technical matters and long lead industry trends. Charlie is also available through Guidepoint and Mosaic.