Sharing Infrastructure: Can Hadoop Play Well With Others?

Published: http://www.datanami.com
March 28, 2013
By: Isaac Lopez

Many big data and Hadoop implementations are swimming against the current of what recent history has taught about large-scale computing, and the result is a significant amount of waste.

So says Univa President and CEO Gary Tyreman, who claims that established HPC tools are more than capable of driving value where big data workloads are concerned.

“At the highest levels, I see today a lot of people doing proof of concept and moving a point solution into production – but that requires a stand-alone environment, a silo, and I think that kind of grinds against the popular methods of the last 10 years,” hypothesized Tyreman.

“If you think about VMware teaching us how to share through consolidation and virtualization, and the last 2-4 years that we’ve seen with cloud teaching us how to distribute the workload and pick-up and create mobility. So today, building a silo or a stand-alone environment to do anything just doesn’t really ring true, right?”

Most enterprises, says Tyreman, don’t have the 24/7 capacity needs for Hadoop that a social media company does. While the Facebooks and Twitters of the world push the Hadoop envelope, many enterprises will want to roll out Hadoop to solve particular pain points.

This is where Univa’s Grid Engine resource management software adds value for enterprises wading into the Hadoop pool, says Tyreman. His company is seeing new revenue (approximately 6% of its 2012 total, he claims) from enabling customers to run Hadoop in a shared infrastructure model, which he says ultimately saves them significant infrastructure and support costs.

Grid Engine, for those who don’t know, gained fame in its early days as a resource manager widely used in academic HPC installations. Univa acquired the tool in January 2011, and says it has made over 500 updates to the workload scheduler since taking over its management. The company now claims that the tool is primarily used by enterprise and commercial customers, who make up 80% of its users.

One area where the Univa chief says the company is seeing a notable uptick in shared-architecture Hadoop implementations is life sciences. One such customer is healthcare modeling and analytics organization Archimedes Inc., which has developed a sophisticated predictive model of human physiology.

Using the Archimedes Model, simulations can be run to predict health outcomes for patients or populations using an array of clinically relevant variables against a broad data set. The model is able to calculate effects of clinical interventions, including such factors as biological outcomes, health outcomes, utilization, quality of life, and financial costs. Using the tool, physicians can make adjustments, compare variables, and use data-driven modeling to set priorities for patients.

The Archimedes Model runs on a dedicated cluster of approximately fifty multi-core, rack-mounted servers, using Univa’s Grid MP to manage the resources. The company says that as it began to productize its offering, it started experiencing painful bottlenecks that impacted the cost-effectiveness of its solution.

“We had done a lot of work to speed up the simulator, so the bottleneck was aggregating and loading the data that came out of the simulator into the analysis tool,” says Katrina Montinola, VP of Engineering with Archimedes. “When the data comes out of the simulator, you have to aggregate it in different ways. There are about 100 biomarkers, outcomes, and treatments per patient that need to be aggregated for each trial arm and each subpopulation. A simple load would take many, many hours, so we quickly realized we needed a better solution.”

That solution was Hadoop running in a shared architecture with their current environment, says Tyreman. “They wanted to be able to support multiple simulations within minutes and so going to Hadoop, they could cut that time down,” he explains.

“Hadoop allowed them to reduce the time, and what Univa brought to the party is that we enabled them to put it into the same cluster that they already had, which meant that they could now share so that they didn’t have to buy a new set of hardware. And so they save implementation costs of about 50%, and then, of course, whatever the operational cost on that is up front.”
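The aggregation Montinola describes, rolling up per-patient measures for each trial arm and subpopulation, is the kind of group-by work that a map phase and a reduce phase handle well. The sketch below is a minimal, single-process illustration in Python of that pattern. It is not Archimedes’ code and does not use Hadoop itself; the record fields (arm, subpop, measure, value), the sample values, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical simulator output: one row per patient per measure.
# Field names and values are illustrative only, not taken from the Archimedes Model.
records = [
    {"arm": "treatment", "subpop": "age_50_plus", "measure": "ldl", "value": 100.0},
    {"arm": "treatment", "subpop": "age_50_plus", "measure": "ldl", "value": 120.0},
    {"arm": "control", "subpop": "age_50_plus", "measure": "ldl", "value": 140.0},
    {"arm": "control", "subpop": "under_50", "measure": "ldl", "value": 130.0},
]


def map_record(record):
    """Map step: emit a (trial arm, subpopulation, measure) key with (value, count)."""
    key = (record["arm"], record["subpop"], record["measure"])
    yield key, (record["value"], 1)


def reduce_values(values):
    """Reduce step: combine (value, count) pairs for one key into a mean."""
    total = sum(value for value, _ in values)
    count = sum(n for _, n in values)
    return total / count


def aggregate(rows):
    """Group mapped pairs by key (the shuffle), then reduce each group."""
    grouped = defaultdict(list)
    for row in rows:
        for key, pair in map_record(row):
            grouped[key].append(pair)
    return {key: reduce_values(pairs) for key, pairs in grouped.items()}


means = aggregate(records)
```

Because each key is reduced independently, a framework such as Hadoop can spread the groups across many machines, which is where a time saving of the kind Tyreman describes would come from.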

Tyreman says that he expects to see this trend towards Hadoop in a shared environment not only continue, but expand. “The companies that are managing the workflow at the top, like Cisco Tidal Software, BMC Control-M, CA AutoSys – I think those tools are going to be used to drive and kick off the processes, and if you look at the way that those systems are built, they’re also typically siloed,” explains Tyreman. “I think that as people continue to adopt virtualization and move into private cloud infrastructures, the next logical step is to create sharing – so whether you’re coming from Hadoop up, or you’re coming from the workflow management tools down, shared infrastructures are the future.”

That’s exactly what Univa already has, says Tyreman, who is also touting the latest update to Univa Grid Engine (8.1.4). He says the release has almost 50 new feature updates, including memory usage metrics and performance enhancements relevant to those using a shared infrastructure model.

Waxing philosophical, Tyreman gives us a parting shot: “I guess the philosophical view is the products that have been deployed in HPC are perfectly suited for big data workloads. Anyone who says that they’re not, in my opinion, doesn’t understand what they’re saying, to be very blunt. The proof – we have customers in production with Hadoop running inside a shared infrastructure, using a product that was built on the back of some of the largest clusters in the world.”