
The 42nd fastest supercomputer on earth doesn’t exist.
This fall, Amazon built a virtual supercomputer atop its Elastic
Compute Cloud — a web service that spins up virtual servers whenever you
want them — and this nonexistent mega-machine
outraced all but 41 of the world’s real supercomputers.
Yes, beneath Amazon’s virtual supercomputer, there’s real hardware.
When all is said and done, it’s a cluster of machines, like any other
supercomputer. But that virtual layer means something. This isn’t a
supercomputer that Amazon uses for its own purposes. It’s a
supercomputer that can be used by anyone.
Amazon is the poster child for the age of cloud computing. Alongside
their massive e-tail business, Jeff Bezos and company have built a worldwide network of data centers
that gives anyone instant access to computing resources, including not
only virtual servers but virtual storage and all sorts of other services
that can be accessed from any machine on the net. This global
infrastructure is so large, it can run one of the fastest supercomputers
on earth — even as it’s running thousands upon thousands of other
virtual servers for the world’s businesses and developers.
This not only shows the breadth of Amazon’s service. It shows that in
the internet age, just about anyone can run a supercomputer-sized
application without actually building a supercomputer. “If you wanted to
spin up a ten or twenty thousand [processor] core cluster, you could do
it with a single mouse click,” says Jason Stowe, the CEO of
Cycle Computing,
an outfit that helps researchers and businesses run supercomputing
applications atop EC2. “Fluid dynamics simulations. Molecular dynamics
simulations. Financial analysis. Risk analysis. DNA sequencing. All of
those things can run exceptionally well atop the [Amazon EC2
infrastructure].”
And you could do it for a pittance — at least compared to the cost of
erecting your own supercomputer. This fall, Cycle Computing set up a
virtual supercomputer for an unnamed pharmaceutical giant that spans
30,000 processor cores, and it cost
$1,279 an hour.
Stowe — who has spent more than two decades in the supercomputing game,
working with supercomputers at Carnegie Mellon University and Cornell —
says there’s still a need for dedicated supercomputers you install in
your own data center, but things are changing.
“I’ve been doing this kind of stuff for a while,” he says, “and I
think that five or 10 years from now, researchers won’t be worrying
about administering their own clusters. They’ll be spinning up the
infrastructure they need [from services like EC2] to answer the question
they have. The days of having your own internal cluster are numbered.”
To Cloud or Not to Cloud
The old guard does not agree. Last month, during a round table
discussion at the Four Seasons hotel in San Francisco, many of the
companies that help build the world’s supercomputers — including Cray
and Penguin Computing — insisted that cloud services can’t match what
you get from a dedicated cluster when it comes to “high-performance
computing,” or HPC. “Cloud for HPC is still hype,” said Charlie
Wuischpard, the CEO of Penguin Computing. “You can do some wacky
experiments to show you could use HPC in that environment, but it’s
really not something you would use today.”
But it is being used today. And Amazon’s climb up the Top 500
supercomputer list shows that EC2 has the capacity to compete with at
least the supercomputers that are built with ordinary microprocessors
and other commodity hardware parts. “Rather than building your own
cluster,” says Jack Dongarra, the University of Tennessee professor who
oversees the annual list of the
Top 500 supercomputers, “Amazon is an option.”
Amazon’s virtual supercomputer wasn’t nearly as powerful as the
massive computing clusters sitting at the peak of the Top 500. It could
handle about 240 trillion calculations a second — aka 240 teraflops —
while the machine at the top of the list, Japan’s K Computer,
reaches more than 10 quadrillion calculations a second, or 10.51 petaflops. As
Dongarra points out, clusters like the K Computer use specialized
hardware you won’t find at Amazon — or in other supercomputers below, say,
the top 25 on earth. “The top 25 are rather specialized machines,”
Dongarra says. “They’re designed in some sense for a subset of very
specialized applications.”
But according to Dongarra, you could still run these specialized
applications atop Amazon. They just wouldn’t be quite as fast. And
though some researchers and businesses are looking for petaflops,
others will do just fine with teraflops.
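The gap can be put in rough numbers. A minimal sketch using only the figures quoted above (240 teraflops for the EC2 cluster, 10.51 petaflops for the K Computer):

```python
# Back-of-the-envelope comparison of the two machines, using the
# figures quoted in the article. One teraflop is a trillion
# floating-point operations per second; one petaflop is a quadrillion.
TERA = 10**12
PETA = 10**15

ec2_flops = 240 * TERA     # Amazon's virtual cluster (benchmarked)
k_flops = 10.51 * PETA     # Japan's K Computer, top of the Top 500

ratio = k_flops / ec2_flops
print(f"K Computer is roughly {ratio:.0f}x faster than the EC2 cluster")
```

In other words, the top machine is on the order of forty-some times faster, which matters for a small set of workloads and not at all for many others.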
Clouds Meet PODs
The irony is that Charlie Wuischpard and Penguin Computing actually
offer their own online supercomputing service. They call it
Penguin-On-Demand. But this is a little different from Amazon EC2. In
essence, Penguin is offering remote access to a specific set of machines
running in one of its data centers, whereas Amazon offers access to a
virtual infrastructure that is shared among everyone using the service.
“[POD] is not a virtualized resource,” Wuischpard tells us. “It’s
specially built for high-performance computing workloads. Amazon is now
trying to add this sort of thing to their toolkit, if you will, but I
still think we have a leg up on them.”
The distinction between the two is rather difficult to get at.
Ultimately, it comes down to two things: Penguin can tell you exactly
where your application is running, and it has a long history with
supercomputing. “There is a lot of difficulty in getting your
application to run in the cloud,” Wuischpard says. “There’s network
drivers and compilers and other stuff. You could figure out a lot of
that on your own, but part of our aim with POD is to provide our
expertise in building and running these machines to help our customers
get on board and start using it.” According to Chuck Moore, a corporate
fellow and technology group CTO at chip-designer Advanced Micro Devices,
applications will require a significant rewrite if you’re moving them
from an old school supercomputer to a service like Amazon.
Some operations do prefer Penguin’s service to Amazon’s. Earthtime — a
company that offers 3-D maps of the world much like Google Street View
offers 2-D images — uses POD to generate these 3-D models, and company
founder and chief technology officer John Ristevski cites Penguin’s
support as a reason his company doesn’t use Amazon. “You need a certain
level of support, help with things like loading data off our disks and
tweaking the performance of the cluster to suit our needs,” he tells
Wired. “That’s not something we’ll ever get from Amazon. Amazon is never
going to manage the distribution of the jobs or the processing itself,
which is something that Penguin does.”
But with Amazon, a company like Cycle Computing can provide this sort
of help, and even Penguin CEO Charlie Wuischpard acknowledges that the
gap between Amazon and dedicated supercomputers is shrinking. Amazon
built its virtual supercomputer for the Top 500 list as a way of
announcing a
new type of virtual server
instance on EC2 that’s specifically designed for HPC applications. It’s
unclear how Amazon ran its benchmark tests for the Top 500 list — the
company did not respond to multiple requests for comment — but it looks
like they ran the tests on a new cluster of physical machines before
they were actually added to Amazon’s public service. Amazon previously
offered instances for HPC applications, but these new CC2 instances are
even beefier.
Spin Up, Spin Down
The point is that Amazon is an option. And it’s a rather convenient
option. For Jason Stowe, the CEO of Cycle Computing, the idea of
building a 30,000-core supercomputer with no hardware that costs just
$1,279 an hour to run is something that can’t be ignored. “It’s just
absurd,” he says. “If you created a 30,000-core cluster in a data
center, that would cost you $5 million, $10 million, and you’d have to
pick a vendor, buy all the hardware, wait for it to come, rack it, stack
it, cable it, and actually get it working. You’d have to wait six
months, 12 months before you got it running.”
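Stowe’s arithmetic is easy to check. A rough sketch using the numbers he quotes — $1,279 an hour for 30,000 cores, versus $5 million to $10 million to build your own — with the caveat that power, staff, and other operating costs of an owned cluster are ignored here, which if anything favors the owned cluster:

```python
# Break-even arithmetic for the figures Stowe quotes in the article.
cores = 30_000
rate_per_hour = 1_279            # dollars per hour, whole cluster

# Renting works out to only a few cents per core-hour
per_core_hour = rate_per_hour / cores
print(f"about ${per_core_hour:.3f} per core-hour")

# Hours of continuous use before rent equals the up-front build cost
for build_cost in (5_000_000, 10_000_000):
    hours = build_cost / rate_per_hour
    print(f"${build_cost:,} build: break-even after ~{hours:,.0f} hours "
          f"({hours / 24:.0f} days of nonstop use)")
```

By this estimate, you would have to run the rented cluster flat-out for months on end before renting cost as much as building — roughly the same span of time Stowe says you would spend just waiting for your own hardware to arrive and come online.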
And by that time, he says, your application may have changed. “Your
question may have evolved since you first provisioned your
infrastructure,” Stowe says. “You may need more than 30,000 cores.” The
added twist is that after you spin up 30,000 machines on Amazon, you can
just as easily spin them down when you don’t need them.
Stowe agrees that Amazon isn’t for everyone. He acknowledges that
Amazon’s virtualization layer may put a real drag on certain
applications — a dedicated supercomputer runs without virtualization —
but he says there are far more applications that will run just fine on a
cloud service. And any drag will be much less than the six to 12 months
it would take to build a supercomputer — not to mention the expense.
“Your application may run 5 percent slower,” he says. “But you’re still
getting access to world-class compute power.”