Netronome is finally announcing their first chip family built on Intel’s 22nm process, the NFP-6xxx line. This is the spiritual successor to both Intel’s old IXP chips and Netronome’s earlier NFP-32xx parts.
We told you a bit about the NFP-6xxx chips last April, but the specs and name were not released at that time. Both were revealed last week along with many more details. These parts look very interesting, and Netronome is quite serious about using the process advantage they have to do what the competition hopefully can’t.
If you recall, the 32xx generation of Netronome CPUs was built on TSMC’s 65nm process and can support 8 million flows per chip. The new generation part aims to support over 10x that, so it needs to support over 200Gbps of full duplex data flows. By any measure that is a lot of data to pass along, much less to move and twiddle bits at the same time without hammering latency. If you look at the older reference architecture, you can see that there are more busses and internal connections than just about anything else.
The old architecture
You may notice that the reference architecture is closely coupled with Intel CPUs, which is no big surprise because Netronome was spun off from Intel a few years ago. While the new 6xxx line can still be coupled with Intel CPUs for added processing power, the chips are now targeted at fully standalone operation. What the customers do with them is their call, but systems with and without an x86 CPU are equally feasible now.
One feature that makes standalone operation easier is that the NFP-6xxx line is KR compliant, so the chips can plug directly into a backplane. The chips also support direct optical converters, so there is no need for external hardware other than that necessary to change the format. All told, a single 6xxx part can support 48 10GbE ports directly connected to the device.
When talking about companies like Netronome that make networking or other low level chips, the clearest differentiator between those with cool hardware and those with cool hardware that succeeds is software. Netronome was quite adamant that they have some of the best software out there, from low level tools like a C compiler and a full profiler to higher level packages. Making a widget from a Netronome chip isn’t a turnkey experience, but the company says it is pretty darn close. Given the complexity of the chip, providing robust software is a must, and it sounds like that is what you actually get from Netronome.
So what does the NFP-6xxx do? Software Defined Networking (SDN), the paradigm behind OpenFlow, is the quick way to describe it. The idea is simple: you get a box that can not only pass packets around quickly, but also runs software that can tweak them however you like. The hardware is based on some open standards, so you are not locked in to parts from one specific vendor. The value is in the software; the hardware is generic. Intel loves this standard. Cisco officially loves it too, but its smiles are significantly less believable than a career politician’s.
What do you do with an SDN platform? Next generation firewalls, packet sniffers, intrusion detection, and pure evil like the packet twiddlers/throttlers that Comcast and other ISPs so love. Security is an ever changing world, so what you need one day may be completely insufficient the next. If you bake ‘security’ into the hardware like some companies do, you are, umm, stupid. With SDN, you can change just about anything you want on the fly and, with luck, adapt to whatever threatens you.
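To make the "value is in the software" idea a bit more concrete, here is a minimal Python sketch of a match-action flow table, the basic concept behind OpenFlow style forwarding. Everything in it (the FlowKey fields, the action strings, the FlowTable class) is made up for illustration; it is not Netronome's SDK or the OpenFlow wire protocol, just a toy showing that behavior changes when software rewrites a rule, not when hardware is swapped.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class FlowKey:
    """Hypothetical 5-tuple that identifies one flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str


class FlowTable:
    """Toy exact-match table: software installs rules, the data path looks them up."""

    def __init__(self, default_action: str = "send_to_controller") -> None:
        self._rules: Dict[FlowKey, str] = {}
        self._default = default_action

    def install(self, key: FlowKey, action: str) -> None:
        # Installing over an existing key replaces the old action.
        self._rules[key] = action

    def remove(self, key: FlowKey) -> None:
        self._rules.pop(key, None)

    def lookup(self, key: FlowKey) -> str:
        # Unknown flows fall through to the default action.
        return self._rules.get(key, self._default)


table = FlowTable()
web = FlowKey("10.0.0.1", "10.0.0.2", 40000, 443, "tcp")

table.install(web, "forward:port2")
print(table.lookup(web))

# Policy changed? Rewrite the rule on the fly; the "hardware" is untouched.
table.install(web, "drop")
print(table.lookup(web))
```

A real device would match on wildcards and priorities and do it in hardware tables, but the division of labor is the same: generic lookup machinery underneath, policy living in software on top.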
How does Netronome support twiddling packets at over 200Gbps? There are 120 processors that can each support 8 threads, basically 960 separate flows of L2-L7 tweaking on the fly, plus 96 more that only do L2-L3 oriented work. These 216 CPUs are ‘complex’ and can be programmed to do whatever you really want them to, within certain limits to keep latency down.
Far less generic are a claimed 100 more processors dedicated to specific tasks, be it bulk crypto, pattern matching, or TCP/IP acceleration. These are obviously far less adaptable, but also far faster at what they do. If Netronome picked them carefully, and they definitely have the knowledge base to do so, these should be more than enough to support the basic functions needed at very high rates of speed. Fixed function blocks for very common items are a must to achieve the levels of throughput this chip is capable of; latency and power targets would be next to impossible without them.
Netronome also claims that they can do ‘security processing’ at over 50Gbps, at wire speed. This means traffic protected by bulk crypto and well known algorithms can be decrypted, checked, re-encrypted if necessary, and then passed along. Since checking VPN and SSL’d packets is one of the largest concerns for any modern corporation, this level of throughput should make most Next Generation Firewall vendors smile.
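For a rough feel of what the core and thread counts above have to cope with, the following Python sketch works out the worst-case packet rate at 200Gbps and the time each thread gets per packet. The core counts and line rate are from Netronome's claims; the rest is assumption: all traffic is minimum-size 64-byte Ethernet frames (plus the standard 20 bytes of preamble and inter-frame gap), and packets spread perfectly evenly over the 960 L2-L7 threads. Clock speeds were not part of what is quoted here, so this stays in nanoseconds, not cycles.

```python
LINE_RATE_BPS = 200e9        # claimed throughput, one direction
MIN_FRAME_BYTES = 64         # smallest Ethernet frame
WIRE_OVERHEAD_BYTES = 20     # 8 bytes preamble/SFD + 12 bytes inter-frame gap

L2_L7_CORES = 120            # from the article
THREADS_PER_CORE = 8         # from the article
L2_L3_CORES = 96             # from the article


def packets_per_second(line_rate_bps: float, frame_bytes: int) -> float:
    """Packet rate when every frame is frame_bytes long."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire


threads = L2_L7_CORES * THREADS_PER_CORE
programmable_cores = L2_L7_CORES + L2_L3_CORES

pps = packets_per_second(LINE_RATE_BPS, MIN_FRAME_BYTES)
ns_between_packets = 1e9 / pps
# Assumption: ideal load balancing, one packet per thread at a time.
ns_budget_per_thread = ns_between_packets * threads

print(f"L2-L7 threads:          {threads}")
print(f"programmable cores:     {programmable_cores}")
print(f"worst-case packet rate: {pps / 1e6:.1f} Mpps")
print(f"gap between packets:    {ns_between_packets:.2f} ns")
print(f"budget per thread:      {ns_budget_per_thread:.0f} ns")
```

Under those assumptions a new packet shows up every 3.36ns, and even with 960 threads sharing the load each one has only a few microseconds per packet, which is a good illustration of why the fixed function blocks matter.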
It is all fine and dandy to have all those units on board, but how do you feed them, and what do you do for memory access? The 6xxx line has over 30MB of cache on board, 20MB of which Netronome calls Proximity Memory. This is basically working memory strategically placed close to the units that need it; think of it as big caches for the hundreds of little units scattered all over the chip. Backing this up are six channels of DDR3, with speeds up to 2133MHz supported. Should you be interested in overclocking an NFP-6xxx part, Corsair has DDR3/3000MHz for sale, so it may be time to start a HWBot leaderboard for Netronome CPUs. NetFlowMark2012 anyone?
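How much can six channels of DDR3 actually feed? A quick back-of-the-envelope calculation in Python, with two assumptions that Netronome did not spell out: each channel is a standard 64-bit (8 byte) DDR3 channel, and "2133MHz" means 2133 million transfers per second. The result is a theoretical peak, not something you would ever sustain.

```python
CHANNELS = 6                  # from the article
TRANSFERS_PER_SEC = 2133e6    # DDR3-2133, assumed to mean 2133 MT/s
BYTES_PER_TRANSFER = 8        # assumption: standard 64-bit channel

LINE_RATE_BPS = 200e9         # claimed packet throughput, one direction


def peak_dram_gbytes_per_sec(channels: int) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (10^9 bytes per second)."""
    return channels * TRANSFERS_PER_SEC * BYTES_PER_TRANSFER / 1e9


peak = peak_dram_gbytes_per_sec(CHANNELS)
line_rate_gbytes = LINE_RATE_BPS / 8 / 1e9
# How many times each byte of traffic could touch DRAM at peak bandwidth.
touches_per_byte = peak / line_rate_gbytes

print(f"peak DRAM bandwidth:   {peak:.1f} GB/s")
print(f"packet data, one way:  {line_rate_gbytes:.1f} GB/s")
print(f"DRAM touches per byte: {touches_per_byte:.1f}")
```

That comes to roughly 102GB/s of peak DRAM bandwidth against 25GB/s of packet data in one direction, so every byte can only afford to visit external memory a handful of times. This is presumably why so much on-chip memory sits next to the processing units.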
If all of this isn’t enough, the chips support four PCIe3 x8 links, so you can put NFP-6xxx chips on a card and plug them into an Intel server, or is it the other way around? Either way, you can add quite a bit of complex processing power, but that adds lots of latency too. This is probably why Netronome is targeting standalone operation: latency is king in the markets they target, and traversing a PCIe bus is an eternity for a 10GbE packet.
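To put numbers on both halves of that, here is a small Python sketch. The PCIe 3.0 figures (8GT/s per lane, 128b/130b encoding) are from the PCIe spec and give raw link bandwidth before any packet overhead. The round-trip latency, on the other hand, is a placeholder picked purely for illustration, not a measurement of any Netronome or Intel part; plug in your own number.

```python
PCIE3_TRANSFERS_PER_LANE = 8e9     # PCIe 3.0: 8 GT/s per lane
PCIE3_ENCODING = 128 / 130         # 128b/130b line encoding
LANES_PER_LINK = 8                 # x8, from the article
LINKS = 4                          # from the article

# Placeholder only: assumed host round trip over PCIe, in nanoseconds.
ASSUMED_PCIE_ROUND_TRIP_NS = 1000.0


def pcie3_gbps(lanes: int, links: int = 1) -> float:
    """Raw PCIe 3.0 bandwidth per direction in Gbps, ignoring TLP overhead."""
    return PCIE3_TRANSFERS_PER_LANE * PCIE3_ENCODING * lanes * links / 1e9


def frame_time_ns(frame_bytes: int, line_rate_bps: float) -> float:
    """Wire time of one Ethernet frame, including 20 bytes preamble + gap."""
    return (frame_bytes + 20) * 8 / line_rate_bps * 1e9


per_link = pcie3_gbps(LANES_PER_LINK)
total = pcie3_gbps(LANES_PER_LINK, LINKS)
min_frame_ns = frame_time_ns(64, 10e9)
frames_per_round_trip = ASSUMED_PCIE_ROUND_TRIP_NS / min_frame_ns

print(f"one x8 link:   {per_link:.1f} Gbps per direction")
print(f"four x8 links: {total:.1f} Gbps per direction")
print(f"64B frame on 10GbE: {min_frame_ns:.1f} ns")
print(f"frames arriving during one assumed round trip: {frames_per_round_trip:.1f}")
```

Bandwidth is not the problem: four x8 links add up to about 252Gbps each way. The problem is that a minimum-size frame occupies a 10GbE port for only 67.2ns, so any trip to the host that takes hundreds of nanoseconds or more means many frames pile up per port while you wait.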
Given the lifespan of these chips, the initial levels of throughput offered are close to overkill for most uses, so for the time being, Netronome’s NFP-6xxx chips should be more than enough to do most SDN tasks. If not, you can always put more than one in: SDN work is, if nothing else, extremely threadable, and flows are usually discrete.
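Because flows are discrete, spreading work over more than one chip can be as simple as hashing each flow to a chip. The Python sketch below shows one common way to do that; the function name and the choice of CRC32 are illustrative assumptions, not how Netronome or any particular vendor does it. The two endpoints are sorted before hashing so both directions of a connection land on the same chip, which matters for anything stateful like a firewall.

```python
import zlib


def chip_for_flow(src_ip: str, src_port: int,
                  dst_ip: str, dst_port: int,
                  proto: str, num_chips: int) -> int:
    """Pick a chip index for a flow (hypothetical helper).

    Endpoints are sorted first so A->B and B->A hash identically.
    """
    if num_chips < 1:
        raise ValueError("num_chips must be at least 1")
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{first[0]}:{first[1]}|{second[0]}:{second[1]}|{proto}".encode()
    # CRC32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(key) % num_chips


forward = chip_for_flow("10.0.0.1", 40000, "10.0.0.2", 443, "tcp", num_chips=4)
reverse = chip_for_flow("10.0.0.2", 443, "10.0.0.1", 40000, "tcp", num_chips=4)
print(forward, reverse)  # the two indices are always equal
```

The catch with a plain modulo is that changing the number of chips reshuffles most flows, so a box that hot-adds capacity would want something smarter, such as consistent hashing.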
In the end, Netronome is trying to make the ultimate network processing engine. They are using Intel’s 22nm process, effectively three generations ahead of the TSMC 65nm process used for their last part, to cram more processing power into the same space as the competition, hopefully while using less energy. On paper, they have a powerful part that has the bandwidth and flexibility to fulfill the promises of SDN at high throughput rates. It will be interesting to see the devices that are made with NFP-6xxx chips; there is a lot of potential in the OpenFlow paradigm. S|A