Thursday, October 17, 2013

Big Data cloud – should I use Bare Metal?


Big Data – Essential

Big data is broadly defined as the capture, management, and analysis of data that goes beyond typical structured data, often to unstructured files, digital video, images, sensor data, log files, and in fact any data not contained in records with distinct searchable fields. In some sense, the unstructured data is the interesting data. Big data, by its nature, needs flexibility and scalability in order to process the tremendous volumes of unstructured and semi-structured data.

No wonder cloud computing services attract Big data customers.


“The Cloud”

Cloud computing, “the cloud,” is the delivery of on-demand computing resources—everything from applications to data centers—over the Internet on a pay-for-use basis.
The name cloud computing was inspired by the cloud symbol that is often used to represent the Internet in flowcharts and diagrams.
A cloud service has three distinct characteristics that differentiate it from traditional hosting.
It is sold on demand, typically by the minute or by the hour; it is elastic - a user can have as much or as little of a service as they want at any given time; and the service is fully managed by the provider, so the consumer needs nothing but a personal computer and Internet access.
Cloud computing services are broadly divided into three categories:
·         Infrastructure-as-a-Service (IaaS) - storage, networking, VMs, load balancers, servers …
·         Platform-as-a-Service (PaaS) - databases, backups, runtimes, web servers, development tools, APIs …
·         Software-as-a-Service (SaaS) - CRM, ERP, virtual desktops, games …
Today's clouds are built on a mix of technologies, including virtualization, automation and orchestration. In some circles, virtualization is nearly synonymous with cloud. But above all a cloud is a pool of resources that is elastic, scalable and accessible on-demand.

The basic characteristics of “the cloud” certainly look promising for Big data customers.


Big Data & “The Cloud”

It is no surprise that the rise of Big Data has coincided with the rapid adoption of Platform-as-a-Service (PaaS) and Infrastructure-as-a-Service (IaaS) technologies. PaaS lets firms scale their capacity on demand and reduce costs, while IaaS allows almost instantaneous deployment of additional computing nodes.
The combination of PaaS and IaaS is what makes big data analytics achievable for firms of all kinds, regardless of their size or sector. These cloud computing services put Big data within the reach of companies that could never afford the high cost of buying enough hardware capacity to store and analyze such large data sets.



“Common Cloud Services” are not always sufficient for Big Data

Cloud computing promises numerous benefits to Big data customers – among these are agility, scalability, and reduced cost – but the virtualization layer inherent in most public clouds has been something of an anathema to “HPC Big data” (High Performance Computing) users. In other words, “common cloud services” do not always offer the performance needed to process real-time data without introducing latency that, in some cases, makes the results useless. This performance degradation stems from the added hypervisor layer and from the multi-tenant nature of virtualized public cloud platforms.

Some of the issues with virtualized cloud computing were detailed recently by Internap Vice President of Hosted Services Gopala Tumuluri. Tumuluri points out what most “HPC Big data” users already know: the virtualized, multi-tenant platform common to most public clouds is subject to performance degradation. "While the hypervisor enables the visibility, flexibility and management capabilities required to run multiple virtual machines on a single box, it also creates additional processing overhead that can significantly affect performance," writes Tumuluri. Data-heavy loads are the most likely to be negatively impacted, especially when the service is oversubscribed. Such a setting is ripe for the so-called noisy-neighbor problem, which occurs when too many virtual machines compete for the same server resources.
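To make the oversubscription point concrete, here is a minimal sketch in Python. It assumes an idealized scheduler that splits physical cores evenly among busy VMs and it ignores hypervisor overhead entirely, so real results would be worse. The function names and the host sizes in the usage note are illustrative assumptions, not figures from Internap.

```python
def oversubscription_ratio(physical_cores: int, vm_count: int, vcpus_per_vm: int) -> float:
    """Ratio of vCPUs promised to tenants versus physical cores on the host."""
    if physical_cores <= 0 or vm_count <= 0 or vcpus_per_vm <= 0:
        raise ValueError("all arguments must be positive")
    return vm_count * vcpus_per_vm / physical_cores


def effective_cores_per_vm(physical_cores: int, vm_count: int, vcpus_per_vm: int) -> float:
    """Idealized fair-share estimate of the cores each VM gets when all VMs are busy.

    Assumption: the host divides its cores evenly and adds no overhead of its own.
    """
    ratio = oversubscription_ratio(physical_cores, vm_count, vcpus_per_vm)
    if ratio <= 1.0:
        # The host is not oversubscribed, so every VM gets what it was sold.
        return float(vcpus_per_vm)
    # The host is oversubscribed, so busy neighbors shrink everyone's share.
    return physical_cores / vm_count
```

For a hypothetical 16-core host, four 4-vCPU VMs each get their full 4 cores, while eight such VMs (a 2:1 ratio) drop to 2 cores each whenever all of them are busy. A bare-metal server sidesteps this because there are no neighbors to share with.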

Although today's server virtualization is a lot sleeker than in years past, nothing can beat the performance of bare metal. This is where the “bare metal cloud” offers a significant advantage, especially for latency-sensitive workloads.


Bare Metal Cloud – Basics

The bare-metal cloud has emerged as a way to complement virtualized services with a dedicated server environment that eliminates the overhead of virtualization without sacrificing the flexibility, scalability and efficiency benefits of the cloud. Bare-metal servers do not run a hypervisor, are not virtualized, and can be delivered via a cloud-like service model. This balances the scalability and automation of the virtualized cloud with the performance capabilities found in monthly dedicated server hosting plans. The hardware is fully dedicated to the customer, including any additional storage that may be required. Bare-metal instances can be provisioned and decommissioned via a web-based portal or API as needed, providing access to high-performance dedicated servers on demand. And, depending on the application and use case, a single bare-metal server can often support larger workloads than multiple, similarly sized VMs. It’s important not to confuse true bare-metal cloud capabilities with other, related terminology, such as “dedicated instances,” which can still be part of a multi-tenant environment; and “bare-metal servers,” or “dedicated servers,” which could refer to a managed hosting service that involves fixed architectures and longer-term contracts. A bare-metal cloud model enables on-demand usage and metered hourly billing with physical hardware that was previously only sold on a fully dedicated basis.

Mixing IaaS Options with Virtualized and Bare Metal Cloud
Bare-metal and traditional, virtualized clouds are not competitors. They are simply different branches of IaaS & PaaS technologies that allow customers to meet a wide range of workload and application requirements. In fact, establishing a mixed cloud environment is often an ideal approach. With this setup, companies can choose, on an individual basis, how to best support each of their core applications and services, thereby reducing capital costs, maximizing operational efficiency and establishing a foundation for innovation through adaptable hosting models.
The definition of IaaS extends beyond the virtualized cloud and is still evolving to meet different business needs. High-performance computing applications have created a demand for dedicated hosting options with cloud-like features; and the bare-metal cloud has emerged as a new way to meet these demands.


Bare Metal Cloud - Use Cases

High-performance, bare-metal cloud functionality is ideal for operations where there is a need to perform short-term, data-intensive functions without any kind of latency or overhead delays. In the past, organizations couldn’t put these workloads into the cloud, or they simply had to accept lower performance levels.
“HPC Big Data” applications are ideal candidates for bare-metal cloud. While running big data in traditional public cloud environments has generated quite a bit of buzz, the reality is that the disk I/O of virtualized cloud servers may not be able to keep up with high-volume, high-velocity data. Based on internal benchmark assessments, Internap has found that bare-metal cloud can provide up to five times the performance of a similarly sized virtualized public cloud (and up to 48% more cost-efficiency).
Although “HPC Big data” is a natural candidate for such services, it is not the only case. Following are other examples of use cases that are ideal for bare-metal cloud.

Media encoding is used for one of today’s most popular types of websites – those with user-generated content, such as social networking and video sharing sites. When a user uploads a video, it must be transcoded into a common format that is viewable by site visitors. The transcoding software for audio and video is processor-intensive and if it’s located on the same machine as the web server or used in a multi-tenant environment, this can impact performance. Bare-metal cloud removes the performance lag and also delivers the flexibility to ramp up on-demand during the transcoding periods and then immediately scale back down during downtime, so there are no wasted resources.
Render farms - many commercial 3D animation and CAD software applications support a “render farm” mode, where a regular desktop workstation can be turned into a node in a rendering cluster. This is often used by animation companies to develop media assets during the day and to process their files after hours. With bare-metal cloud, a designer could maintain a single, always-on “master” node to submit rendering jobs. The master node would then interact with other hardware nodes that process the individual frames to be rendered. These hardware nodes could be provisioned by design staff as needed to process large or small jobs; alternatively, the master node software could be adapted to provision additional instances as needed through a provisioning API.
Compliance - organizations that have rigorous compliance guidelines can benefit from the bare-metal cloud. Companies in industries like finance, government and healthcare must adhere to strict policies regarding how data is stored, managed and shared. In these cases, shared infrastructure creates a level of uncertainty, and a bare-metal cloud can provide the necessary control over data location and keep it segregated within a well-defined, secure physical environment.
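The render-farm case above, where a master node provisions worker nodes on demand, can be sketched roughly as follows. This is an illustration under stated assumptions, not any provider's actual API: `FakeProvisioner` is a hypothetical in-memory stand-in for a provisioning API, and the `frames_per_node` and `max_nodes` limits are made-up tuning knobs.

```python
from dataclasses import dataclass, field
from math import ceil
from typing import Dict, List


@dataclass
class FakeProvisioner:
    """Hypothetical stand-in for a provider's provisioning API (in-memory only)."""
    active: List[str] = field(default_factory=list)
    _counter: int = 0

    def provision(self) -> str:
        # A real implementation would call the provider and wait for the server.
        self._counter += 1
        node_id = f"node-{self._counter}"
        self.active.append(node_id)
        return node_id

    def decommission(self, node_id: str) -> None:
        self.active.remove(node_id)


def nodes_needed(frame_count: int, frames_per_node: int, max_nodes: int) -> int:
    """Number of worker nodes to request for a job, capped at max_nodes."""
    if frame_count <= 0:
        return 0
    return min(max_nodes, ceil(frame_count / frames_per_node))


def assign_frames(frames: List[int], nodes: List[str]) -> Dict[str, List[int]]:
    """Spread frames across nodes round-robin so the batches stay balanced."""
    plan: Dict[str, List[int]] = {node: [] for node in nodes}
    for index, frame in enumerate(frames):
        plan[nodes[index % len(nodes)]].append(frame)
    return plan


def run_render_job(frames: List[int], provisioner: FakeProvisioner,
                   frames_per_node: int = 100, max_nodes: int = 8) -> Dict[str, List[int]]:
    """Provision nodes, plan the work, then release the nodes again."""
    count = nodes_needed(len(frames), frames_per_node, max_nodes)
    nodes = [provisioner.provision() for _ in range(count)]
    try:
        # A real master node would dispatch each batch and wait for results here.
        return assign_frames(frames, nodes) if nodes else {}
    finally:
        # Scale back down so nothing is billed after the job ends.
        for node in nodes:
            provisioner.decommission(node)
```

Calling `run_render_job(list(range(250)), FakeProvisioner())` with the defaults requests three nodes, spreads the 250 frames across them, and leaves no nodes active afterwards. The `try`/`finally` is the design point: nodes are released even if planning or dispatch fails, which matters when hardware is billed by the hour.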


Bare Metal Cloud - Service Providers

It sounds like an oxymoron, but some cloud providers offer bare-metal servers -- and some organizations consider them an appealing alternative to shared infrastructure.
Cloud service providers that offer bare-metal servers include IBM's SoftLayer Technologies Inc., Rackspace Hosting Inc., Internap Network Services Corp., and others.
While bare-metal servers stem from a traditional managed hosting business for these vendors, newer offerings have a single interface for managing cloud and bare-metal assets, and they allow for more flexibility with bare-metal servers than was traditionally available in hosting environments.
Users of bare-metal services say there's a performance advantage to dedicated hardware resources. Because of this, relational databases are good candidates for bare-metal servers.

"Cloud is really best suited for things that you need to flexibly scale horizontally," Dixit said. "It doesn't make sense to make [the database] a cloud instance, because we know it's always going to be a part of our stack, we know we're going to need our masters and our replication slaves no matter what, and that's typically not something you scale by flipping a switch."
Bare-metal servers are also useful in big data and real-time analytics environments. Hosted marketing software provider HubSpot Inc., based in Cambridge, Mass., has more physical servers hosted in Rackspace's data center (about 160) than virtual ones (about 60) to perform big data queries.
"In the public cloud, it gets really expensive because you end up having to throw so much capacity at the problem in order to get predictable performance," said HubSpot CIO Jim O'Neill. "In a dedicated environment, you give these big data jobs full access to a fairly large server or a commodity large server, and it's just made a world of difference."
Bare-metal cloud niche: Performance and compliance
Information streaming company Flow Search Corp. needed bare-metal performance to perform real-time analytics on clickstream information, so it switched from an Amazon Web Services cloud deployment to IBM SoftLayer's service late last year.
"You can't get enterprise performance in a consumer cloud -- we're talking milliseconds and in-memory processes," said Eric Alterman, CEO of the Brooklyn, N.Y.-based company. "Other providers are going to have to duplicate SoftLayer's bare-metal capabilities if they want to compete in the enterprise."
Using bare-metal servers can also bring cost benefits.
Bare-metal servers cost Dixit a flat $900 per month, which is often less than what cloud-based servers accrue in usage-based charges.
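A quick way to check a claim like this for your own workload is a break-even calculation. The sketch below uses the flat $900 per month from the text; the hourly rate in the usage note is purely an assumed figure for illustration, so substitute your provider's real pricing.

```python
FLAT_MONTHLY_USD = 900.0  # flat bare-metal price quoted in the text


def usage_cost(hourly_rate_usd: float, hours_used: float) -> float:
    """Monthly bill for a usage-based server at the given hourly rate."""
    return hourly_rate_usd * hours_used


def break_even_hours(flat_monthly_usd: float, hourly_rate_usd: float) -> float:
    """Hours per month at which usage-based billing matches the flat price."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly_rate_usd must be positive")
    return flat_monthly_usd / hourly_rate_usd


def cheaper_option(flat_monthly_usd: float, hourly_rate_usd: float, hours_used: float) -> str:
    """Return 'flat' or 'usage', whichever costs less (ties go to 'flat')."""
    if usage_cost(hourly_rate_usd, hours_used) >= flat_monthly_usd:
        return "flat"
    return "usage"
```

With an assumed rate of $1.50 per hour, the break-even point is 600 hours per month. An always-on server runs roughly 730 hours a month, so it would cost about $1,095 on usage-based billing and the flat plan wins, while a server needed only 200 hours a month would be cheaper on usage-based billing. This matches the point made in the quotes above: servers that are always part of the stack, such as databases, are the ones where a flat rate tends to pay off.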


Summary

When virtual machines just don’t provide the performance you need, you need to get physical! Bare-metal cloud runs without the overhead of a hypervisor and avoids the resource contention inherent in multi-tenant environments. It can also provide extra processing power and highly consistent disk and network I/O, ultimately giving you the platform you need for your most demanding applications. At the same time, a bare-metal cloud service still gives customers the flexibility, provisioning, and on-demand billing advantages of cloud computing. It is scalable and pay-as-you-go: the advantages of cloud-delivered hardware without the overhead of a hypervisor.


References