By Mayank Bawa in Blogroll, TCO on August 3, 2009

Netezza pre-announced last week that they will be moving to a new architecture - one based on IBM blades (Linux + Intel + RAM) with commodity SAS disks, RAID controllers, and NICs. The product will continue to rely on an FPGA, but one that now sits much farther from the disks & RAID controller - beyond the RAM, adjacent to the Intel CPU - in contrast to their previous product line.

In assembling a new hardware stack, Netezza characterizes this re-architecture as a change that is not really a change - the FPGA will continue to offload data compression/decompression, selection, and projection from the Intel CPU; the Intel CPU will take on the joins and group-bys pushed down to it; and the RAM will be used for caching (thus helping improve mixed-workload performance).
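To make that division of labor concrete, here is a minimal sketch - illustrative Python with assumed names, not Netezza's code - of a scan-side stage doing decompression, selection, and projection, feeding a CPU-side group-by:

```python
# Illustrative sketch only (assumed names, not Netezza's code): the scan-side
# stage does what the announcement assigns to the FPGA (decompress, select,
# project); the CPU-side stage runs the group-by over the thinned stream.
import pickle
import zlib
from collections import defaultdict

def fpga_stage(compressed_pages, predicate, columns):
    """Scan-side: decompress pages, apply the WHERE predicate (selection),
    and keep only the needed columns (projection)."""
    for page in compressed_pages:
        for row in pickle.loads(zlib.decompress(page)):
            if predicate(row):
                yield {c: row[c] for c in columns}

def cpu_stage(rows, group_key, agg_col):
    """CPU-side: the group-by / aggregation runs over far fewer rows."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[agg_col]
    return dict(totals)

# SELECT region, SUM(sales) FROM t WHERE sales > 10 GROUP BY region
pages = [zlib.compress(pickle.dumps([
    {"region": "east", "sales": 12.0},
    {"region": "west", "sales": 5.0},
]))]
rows = fpga_stage(pages, lambda r: r["sales"] > 10, ["region", "sales"])
print(cpu_stage(rows, "region", "sales"))  # {'east': 12.0}
```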

I think this is a pretty significant change for Netezza.

Clearly, Netezza would not have invested in this change - assembling and shipping a new hardware stack, and sharing revenue with IBM rather than a 3rd-party hardware assembler - if Netezza’s old FPGA-dominant hardware were not being out-priced and out-performed by our Intel-based commodity hardware.

It was a matter of time before the market realized that FPGAs had reached end-of-life status in the data warehousing market. Seeing the writing on the wall, and responding to it early, Netezza has made a bold decision to change - and yet clung to the warm familiarity of an FPGA as a “side car”.

Netezza, and the rest of the market, will soon become aware that a change in hardware stack is not a free lunch. The richness of CPU and RAM resources in an IBM commodity blade comes at a cost that a resource-starved FPGA-based architecture never had to account for.

In 2009, after having engineered its software for an FPGA over the last 9 years, Netezza will need to come to terms with commodity hardware in production systems and demonstrate that they can:

- Manage the processes and memory that a single query spawns across 100s of blade servers

- Maintain consistent caches across 100s of blade servers - after all, it is Oracle’s Cache Fusion technology that is the bane of scaling Oracle RAC beyond 8 blade servers

- Tolerate the higher frequency of failures that a commodity Linux + RAID controller/driver + network driver stack incurs when put under rigorous data movement (e.g., memory allocation/de-allocation contributing to memory leaks)

- Add a new IBM blade and ensure incremental scaling of their appliance

- Upgrade the software stack in place - unlike an FPGA-based hardware stack, which customers are willing to floor-sweep during an upgrade

- Contain run-away queries before they allocate the abundant CPU and RAM resources and starve other concurrent queries in the workload (a sketch of one containment approach follows this list)

- Reduce network traffic for a blade with 2 NICs that is managing 8 disks, vs. a PowerPC/FPGA unit that had 1 NIC for 1 disk

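On the run-away query point above: here is a minimal sketch, in illustrative Python and assuming one worker process per plan fragment (not how Netezza or Aster actually implement workload management), of how commodity Linux lets you cap a fragment’s memory and CPU before it starves its neighbors:

```python
# Minimal sketch (assumed design: one worker process per plan fragment; not
# Netezza's or Aster's implementation) of containing a run-away query with
# kernel-enforced resource limits on commodity Linux.
import resource
from multiprocessing import Process

def run_worker(fragment_fn, mem_bytes, cpu_seconds):
    # Set limits in the child before any query work begins; the kernel then
    # fails or kills this fragment alone instead of letting it starve every
    # other query on the blade.
    resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))
    resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
    fragment_fn()

def launch(fragment_fn, mem_bytes=2 * 1024**3, cpu_seconds=600):
    # The caps are per-query knobs a workload manager could set per queue.
    p = Process(target=run_worker, args=(fragment_fn, mem_bytes, cpu_seconds))
    p.start()
    return p

def hog():
    # A run-away fragment: tries to grab ~4 GB against a 2 GB cap.
    bytearray(4 * 1024**3)

if __name__ == "__main__":
    w = launch(hog)
    w.join()
    print("worker exit code:", w.exitcode)  # non-zero: the cap was enforced
```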

If you take a quick pulse of the market, apart from our known installations of 100+ servers, no other vendor - mature or new-age - has demonstrated that 100s of commodity servers can be made to work together to run a single database.

And I believe there is a fundamental reason for this lack of proof-points, even a decade after Linux matured and commodity servers became commonplace in computing: software not built from the ground up to leverage the richness, and contain the limitations, of commodity hardware is incapable of scaling. Aster nCluster has been built from the ground up to have these capabilities on a commodity stack.

Netezza’s software, written for proprietary hardware, cannot simply be retrofitted to work on commodity hardware (else Netezza would have taken the FPGAs out completely, now that they have powerful CPUs!). Netezza has its work cut out: they have taken a dramatic shift that has the potential to bring the company and its production customers to their knees. And therein lies Netezza’s challenge - they must succeed in supporting their current customers on the FPGA-based platform even as they move resources to build out a commodity-based platform.

And we have not even touched upon the extension of SQL with MapReduce to power big data manipulation using arbitrary user-written procedures.
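For flavor, here is a rough analogue in plain Python - not Aster’s actual SQL/MR API - of the kind of user-written procedure such an extension embeds in a query: a sessionize step that runs independently over each user’s partition of a clickstream:

```python
# Rough analogue in plain Python (not Aster's actual SQL/MR API) of a
# user-written per-partition procedure: sessionize a clickstream, starting a
# new session after a 30-minute gap, independently for each user_id.
from itertools import groupby
from operator import itemgetter

def sessionize(clicks, timeout=1800):
    """clicks: (user_id, ts_seconds) tuples -> (user_id, ts, session_no)."""
    for user, rows in groupby(sorted(clicks), key=itemgetter(0)):
        session, last_ts = 0, None
        for _, ts in rows:
            if last_ts is not None and ts - last_ts > timeout:
                session += 1  # gap too long: start a new session
            yield (user, ts, session)
            last_ts = ts

print(list(sessionize([("u1", 0), ("u1", 100), ("u1", 5000)])))
# [('u1', 0, 0), ('u1', 100, 0), ('u1', 5000, 1)]
```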

If a system is not fundamentally designed to leverage commodity servers, it will only ever be a band-aid over bursting seams. Overall, we will watch with curiosity how long it takes Netezza to eliminate their FPGAs completely and move to a true commodity stack, so that customers have the freedom to choose their own hardware rather than being locked into Netezza-supplied custom hardware.


Comments:
Joe Harris on August 4th, 2009 at 5:36 am #

First let me say that I’m a fan of Aster’s approach to scalability and fault tolerance.

However, I think it’s a stretch to say that “Netezza would not have invested in this change… if Netezza’s old FPGA-dominant hardware were not being out-priced and out-performed by our Intel-based commodity hardware.”

They have not removed the FPGAs from their architecture (they in fact added more FPGAs), and it’s difficult to read much into the FPGA’s new position in the pipeline without knowing NZ’s software architecture in detail - which no one does.

It seems more likely that they needed to move away from PowerPC CPUs (which are clearly dying) and took the opportunity to simplify their hardware design for future upgrades.

When I spoke to Phil Francisco last spring (re compression), he said he felt that Netezza still had considerable runway to improve their performance, and this announcement bears that out. They have not indicated any additional moves toward a pure software / commodity hardware approach.

I would suggest that Aster and Netezza are selling to somewhat different groups, and in Netezza’s case that group 1) wants the absolute best possible query performance, 2) for a large but not infinite amount of data, and 3) doesn’t give a stuff about commodity hardware.

Just my opinion of course.

Mayank Bawa on August 4th, 2009 at 9:46 am #

Joe - I appreciate the discussion and your comments.

I’ve not met Phil Francisco, but the performance improvement in this announcement has come about via a hardware re-architecture. This is not simply an upgrade from PowerPC to Intel - it is a change in the hardware dataflow pipeline, and that change will imply changes in the software architecture.

We (i.e., the market) have visibility into the Netezza hardware stack via their blog posts and white papers.

In the old architecture, the data flow used to be: 1 Disk -> FPGA -> PowerPC -> 1 GB RAM.

In the new architecture, the data flow will be: 8 Disks -> RAID Controller -> 16 GB RAM -> FPGA -> Intel CPU. The Intel CPU could have received data at the same line speed as the FPGA; instead, the FPGA sits in the path to offload processing from the Intel CPU.

Notice that the old architecture was justified by saying that higher performance is obtained by bringing processing right next to the disk (i.e., Disk -> FPGA). Netezza’s internal benchmarks now claim higher performance for the new architecture (i.e., Disk -> Controller -> RAM caching -> FPGA) - clearly, custom processing right next to the disk was not adding much value.
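Some back-of-envelope arithmetic, using assumed-but-typical 2009 figures, shows why this new balance matters - local scan bandwidth now far exceeds network bandwidth, so the software must filter aggressively before redistributing rows:

```python
# Assumed, typical-for-2009 figures: ~100 MB/s sequential per SAS disk,
# ~125 MB/s per GbE NIC. The point: scans off 8 local disks outrun what
# 2 NICs can redistribute, so rows must be filtered before they hit the wire.
DISK_MBPS, NIC_MBPS = 100, 125

scan_bw = 8 * DISK_MBPS      # 800 MB/s of local scan bandwidth
net_bw = 2 * NIC_MBPS        # 250 MB/s of network bandwidth
print(scan_bw / net_bw)      # 3.2x mismatch
```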

Netezza and Aster are selling solutions to the same groups, and addressing the same wants, (1) and (2).

The use of commodity hardware is a means to an end. The end user may or may not care about commodity hardware per se - but s/he certainly cares about its implications (lower cost, incremental scaling, replacement of parts without replacing the whole, faster year-on-year performance improvements, …) and its limitations (higher failure rates, higher error margins, inefficient network & disk accesses).
