Amazon announced today the availability of special EC2 cloud clusters optimized for low-latency network operations. This is useful for applications in the so-called High-Performance Computing (HPC) space, where servers need to request and exchange data very quickly. Examples of HPC applications range from nuclear simulations in government labs to computer chess.
I find this development interesting, not only because it makes scientific applications in the cloud a real possibility, but also because it indicates where cloud infrastructure is heading.
In the early days, Amazon EC2 was very simple: if you wanted 5 “instances” (that is, 5 virtual machines), that’s what you got. However, the instances offered little memory and little disk capacity. Over time, more and more configurations were added, and today one can choose from a variety of instance types with different disk and memory characteristics, up to 15GB of memory and 2TB of disk per instance. The network, however, was always a problem, regardless of instance size. (According to rumors, EC2 made things worse by spreading instances as far apart as possible in the datacenter to increase reliability; as a result, network latency suffered.) Now the network problem is being addressed by these special “Cluster Compute Instances,” which provide guaranteed, non-blocking access to a 10GbE network infrastructure.
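To make this concrete, here is a minimal sketch of how one might request a batch of Cluster Compute Instances today, using the parameter shapes of the boto3 Python SDK (which postdates this announcement). The key detail is the placement group, which asks EC2 to co-locate the instances for low-latency networking. The AMI ID and group name below are illustrative placeholders, not real values.

```python
def cluster_request_params(ami_id, count, placement_group):
    """Build the keyword arguments one would pass to boto3's
    ec2_client.run_instances() to launch a tightly coupled cluster.

    All identifiers here are hypothetical examples.
    """
    return {
        "ImageId": ami_id,                    # an HVM AMI, as cluster instances require
        "InstanceType": "cc1.4xlarge",        # the original Cluster Compute instance type
        "MinCount": count,                    # launch all-or-nothing: ask for the
        "MaxCount": count,                    # same min and max
        # The placement group is what requests co-location on the
        # non-blocking 10GbE fabric.
        "Placement": {"GroupName": placement_group},
    }

# Example: parameters for an 8-node cluster in a group named "hpc-cluster".
params = cluster_request_params("ami-12345678", 8, "hpc-cluster")
print(params["InstanceType"])
```

With real credentials, these parameters would be expanded into a call such as `ec2_client.run_instances(**params)`; the sketch only builds the request so it stays runnable without an AWS account.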
Overall, this move represents a departure from the super-simple black-box model that EC2 started from. Amazon wisely realizes that accommodating more applications requires transparency about, and guarantees for, the underlying infrastructure. Guaranteeing network latency is just the beginning: Amazon has the opportunity to add many more options and guarantees around I/O performance, quality of service, SSDs versus hard drives, fail-over behavior, and so on. The more options and guarantees Amazon offers, the closer we’ll get to the promise of the cloud, at least for resource-intensive IT applications.