Big Data – Essential
Big data is broadly defined as the capture, management, and analysis of data that goes beyond typical structured data, often extending to unstructured files, digital video, images, sensor data, log files, and in fact any data not contained in records with distinct searchable fields. In some sense, the unstructured data is the interesting data. Big data, by its nature, needs flexibility and scalability in order to process tremendous volumes of unstructured and semi-structured data.
No wonder cloud computing services attract Big data customers.
“The Cloud”
Cloud computing, “the cloud,” is the delivery of on-demand computing resources—everything from applications to data centers—over the Internet on a pay-for-use basis.
The name cloud computing was inspired by the cloud symbol that is often used to represent the Internet in flowcharts and diagrams.
A cloud service has three distinct characteristics that differentiate it from traditional hosting.
It is sold on demand, typically by the minute or by the hour; it is elastic, meaning a user can have as much or as little of a service as they want at any given time; and the service is fully managed by the provider, so the consumer needs nothing but a personal computer and Internet access.
Cloud services are commonly grouped into three categories:
· Infrastructure-as-a-Service (IaaS) - Storage, network, VM, load balancers, servers …
· Platform-as-a-Service (PaaS) - Database, backups, runtime, web server, development tools, API …
· Software-as-a-Service (SaaS) - CRM, ERP, virtual desktop, games …
Today's clouds are built on a mix of technologies, including virtualization, automation and orchestration. In some circles, virtualization is nearly synonymous with cloud. But above all a cloud is a pool of resources that is elastic, scalable and accessible on-demand.
These basic characteristics of “the cloud” look promising for Big data customers.
Big Data & “The Cloud”
It is no surprise that the rise of Big Data has coincided with the rapid adoption of Platform-as-a-Service (PaaS) and Infrastructure-as-a-Service (IaaS) technologies. PaaS lets firms scale their capacity on demand and reduce costs, while IaaS allows almost instantaneous deployment of additional computing nodes.
The combination of PaaS and IaaS is what makes big data analytics achievable for firms of all kinds, regardless of their size or sector. These cloud computing services put Big data within the reach of companies that could never afford the high costs associated with buying sufficient hardware capacity to store and analyze such large data sets.
“Common Cloud Services” are not always sufficient for Big Data
Cloud computing promises numerous benefits to Big data customers, among them agility, scalability, and reduced cost, but the virtualization layer inherent in most public clouds has been something of an anathema to “HPC Big data” (High Performance Computing) users. In other words, the “common cloud services” do not always offer the performance necessary to process real-time data without introducing latency that, in some cases, makes the results useless. Performance degradation can stem from the introduction of a hypervisor layer and from the multi-tenant nature of virtualized public cloud platforms.
Some of the issues with virtualized cloud computing were detailed recently by Internap Vice President of Hosted Services Gopala Tumuluri. Tumuluri points out what most “HPC Big data” users already know: the virtualized, multi-tenant platform common to most public clouds is subject to performance degradation. "While the hypervisor enables the visibility, flexibility and management capabilities required to run multiple virtual machines on a single box, it also creates additional processing overhead that can significantly affect performance," writes Tumuluri. Data-heavy loads are the most likely to be negatively impacted, especially when the service is oversubscribed. Such a setting is ripe for the so-called noisy-neighbor problem, which occurs when too many virtual machines compete for server resources.
Although today's server virtualization is a lot sleeker than in years past, nothing can beat the performance of bare metal. This is where the “bare metal cloud” offers a significant advantage, especially for latency-sensitive workloads.
Bare Metal Cloud – Basics
The bare-metal cloud has emerged as a way to complement virtualized services with a dedicated server environment that eliminates the overhead of virtualization without sacrificing the flexibility, scalability and efficiency benefits of the cloud. Bare-metal servers do not run a hypervisor, are not virtualized, and can be delivered via a cloud-like service model. This balances the scalability and automation of the virtualized cloud with the performance capabilities found in monthly dedicated server hosting plans. The hardware is fully dedicated to the customer, including any additional storage that may be required. Bare-metal instances can be provisioned and decommissioned via a web-based portal or API as needed, providing access to high-performance dedicated servers on demand. And, depending on the application and use case, a single bare-metal server can often support larger workloads than multiple, similarly sized VMs.
It’s important not to confuse true bare-metal cloud capabilities with other, related terminology, such as “dedicated instances,” which can still be part of a multi-tenant environment; and “bare-metal servers,” or “dedicated servers,” which could refer to a managed hosting service that involves fixed architectures and longer-term contracts. A bare-metal cloud model enables on-demand usage and metered hourly billing with physical hardware that was previously only sold on a fully dedicated basis.
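To make the idea of on-demand provisioning with metered hourly billing more concrete, here is a minimal sketch in Python. It is not any provider's actual API: the `BareMetalInstance` class, the server name, the $1.50 hourly rate, and the rule that every started hour is billed in full are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class BareMetalInstance:
    """Hypothetical record of one provisioned server (not a real provider API)."""
    name: str
    hourly_rate: float                      # assumed price per started hour
    provisioned_at: datetime
    decommissioned_at: Optional[datetime] = None

    def billed_hours(self, now: datetime) -> int:
        """Hours billed so far; any started hour is rounded up (assumption)."""
        end = self.decommissioned_at or now
        seconds = (end - self.provisioned_at).total_seconds()
        return max(1, math.ceil(seconds / 3600))

    def cost(self, now: datetime) -> float:
        """Metered charge: billed hours times the hourly rate."""
        return self.billed_hours(now) * self.hourly_rate


# Provision a server for a short data-intensive job, then release it.
start = datetime(2014, 1, 6, 9, 0)
node = BareMetalInstance("analytics-01", hourly_rate=1.50, provisioned_at=start)
node.decommissioned_at = start + timedelta(hours=5, minutes=20)

print(node.billed_hours(start), node.cost(start))
```

Under these assumptions, a job that runs for 5 hours and 20 minutes is billed for six hours, and billing stops once the server is decommissioned. Actual providers may meter in different increments.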
Mixing IaaS Options with Virtualized and Bare Metal Cloud
Bare-metal and traditional, virtualized clouds are not competitors. They are simply different branches of IaaS & PaaS technologies that allow customers to meet a wide range of workload and application requirements. In fact, establishing a mixed cloud environment is often an ideal approach. With this setup, companies can choose, on an individual basis, how to best support each of their core applications and services, thereby reducing capital costs, maximizing operational efficiency and establishing a foundation for innovation through adaptable hosting models.
The definition of IaaS extends beyond the virtualized cloud and is still evolving to meet different business needs. High-performance computing applications have created a demand for dedicated hosting options with cloud-like features, and the bare-metal cloud has emerged as a new way to meet these demands.
Bare Metal Cloud - Use Cases
High-performance, bare-metal cloud functionality is ideal for operations that need to perform short-term, data-intensive functions without the latency or overhead delays introduced by virtualization. In the past, organizations couldn’t put these workloads into the cloud, or they simply had to accept lower performance levels.
“HPC Big Data” applications are ideal candidates for bare-metal cloud. While running big data in traditional public cloud environments has generated quite a bit of buzz, the reality is that the disk I/O of virtualized cloud servers may not be able to keep up with high-volume, high-velocity data. Based on internal benchmark assessments, Internap has found that bare-metal cloud can provide up to five times the performance of a similarly-sized virtualized public cloud (and up to 48% more cost-efficiency).
Although “HPC Big data” is a natural candidate for such services, it is not the only one. Following are other examples of use cases that are ideal for bare-metal cloud.
Media encoding is used for one of today’s most popular types of websites: those with user-generated content, such as social networking and video sharing sites. When a user uploads a video, it must be transcoded into a common format that is viewable by site visitors. The transcoding software for audio and video is processor-intensive and if it’s located on the same machine as the web server or used in a multi-tenant environment, this can impact performance. Bare-metal cloud removes the performance lag and also delivers the flexibility to ramp up on-demand during the transcoding periods and then immediately scale back down during downtime, so there are no wasted resources.
Render farms - many commercial 3D animation and CAD software applications support a “render farm” mode, where a regular desktop workstation can be turned into a node in a rendering cluster. This is often used by animation companies to develop media assets during the day and to process their files after hours. With bare-metal cloud, a designer could maintain a single, always-on “master” node to submit rendering jobs. The master node would then interact with other hardware nodes for processing the individual frames that need to be rendered. These hardware nodes could be provisioned by design staff as needed to process large or small jobs; or, the master node software could be adapted to provision additional instances as needed through a provisioning API.
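The sizing decision a master node would have to make before calling a provisioning API can be sketched as follows. This is only an illustration under stated assumptions: the function names `nodes_needed` and `split_frames`, the throughput figure of 60 frames per node per hour, the 8-hour overnight window, and the 20-node cap are all hypothetical, and the actual provisioning call is left out because it depends on the provider.

```python
import math
from typing import List


def nodes_needed(frames: int, frames_per_node_hour: float,
                 deadline_hours: float, max_nodes: int) -> int:
    """How many render nodes to provision so the job finishes by the deadline."""
    if frames <= 0:
        return 0
    capacity_per_node = frames_per_node_hour * deadline_hours
    return min(max_nodes, math.ceil(frames / capacity_per_node))


def split_frames(frames: int, nodes: int) -> List[range]:
    """Assign contiguous frame ranges to nodes as evenly as possible."""
    if nodes <= 0:
        return []
    base, extra = divmod(frames, nodes)
    chunks, start = [], 0
    for i in range(nodes):
        size = base + (1 if i < extra else 0)  # first `extra` nodes take one more
        chunks.append(range(start, start + size))
        start += size
    return chunks


# Hypothetical overnight job: 2400 frames, 8-hour window, at most 20 nodes.
workers = nodes_needed(frames=2400, frames_per_node_hour=60,
                       deadline_hours=8, max_nodes=20)
assignments = split_frames(2400, workers)
for node_id, frame_range in enumerate(assignments):
    print(f"node {node_id}: frames {frame_range.start}-{frame_range.stop - 1}")
```

With these numbers the master node would request five nodes and give each one 480 frames. Once the frames are rendered, the nodes can be decommissioned so that no hardware sits idle during the day.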
Compliance - organizations that have rigorous compliance guidelines can benefit from the bare-metal cloud. Companies in industries like finance, government and healthcare must adhere to strict policies regarding how data is stored, managed and shared. In these cases, shared infrastructure creates a level of uncertainty, and a bare-metal cloud can provide the necessary control over data location and keep it segregated within a well-defined, secure physical environment.
Bare Metal Cloud - Service Providers
It sounds like an oxymoron, but some cloud providers offer bare-metal servers -- and some organizations consider them an appealing alternative to shared infrastructure.
Cloud service providers that offer bare-metal servers include IBM's SoftLayer Technologies Inc., Rackspace Hosting Inc., Internap Network Services Corp., and others.
While bare-metal servers stem from a traditional managed hosting business for these vendors, newer offerings have a single interface for managing cloud and bare-metal assets, and they allow for more flexibility with bare-metal servers than was traditionally available in hosting environments.
Users of bare-metal services say there's a performance advantage to dedicated hardware resources. Because of this, relational databases are good candidates for bare-metal servers.
"Cloud is really best suited for things that you need to flexibly scale horizontally," Dixit said. "It doesn't make sense to make [the database] a cloud instance, because we know it's always going to be a part of our stack, we know we're going to need our masters and our replication slaves no matter what, and that's typically not something you scale by flipping a switch."
Bare-metal servers are also useful in big data and real-time analytics environments. Hosted marketing software provider HubSpot Inc., based in Cambridge, Mass., has more physical servers hosted in Rackspace's data center (about 160) than virtual ones (about 60) to perform big data queries.
"In the public cloud, it gets really expensive because you end up having to throw so much capacity at the problem in order to get predictable performance," said HubSpot CIO Jim O'Neill. "In a dedicated environment, you give these big data jobs full access to a fairly large server or a commodity large server, and it's just made a world of difference."
Bare-metal cloud niche: Performance and compliance
Information streaming company Flow Search Corp. needed bare-metal performance to perform real-time analytics on clickstream information, so it switched from an Amazon Web Services cloud deployment to IBM SoftLayer's service late last year.
"You can't get enterprise performance in a consumer cloud -- we're talking milliseconds and in-memory processes," said Eric Alterman, CEO of the Brooklyn, N.Y.-based company. "Other providers are going to have to duplicate SoftLayer's bare-metal capabilities if they want to compete in the enterprise."
There are also cost benefits to using bare-metal servers. Bare-metal servers cost Dixit a flat $900 per month, which is often less than what cloud-based servers accrue in usage-based charges.
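A quick way to see when a flat monthly fee beats usage-based charges is to compute the break-even point. The sketch below takes the $900 flat fee from the text; the $1.50 hourly cloud rate and the figure of roughly 730 hours in an average month are assumptions for illustration only, not quoted prices.

```python
FLAT_MONTHLY = 900.0    # flat bare-metal fee quoted in the text (USD per month)
HOURS_PER_MONTH = 730   # approximate hours in an average month (assumption)


def usage_cost(hourly_rate: float, hours: float, instances: int = 1) -> float:
    """Usage-based charge for running `instances` cloud servers for `hours`."""
    return hourly_rate * hours * instances


def break_even_hours(flat_monthly: float, hourly_rate: float,
                     instances: int = 1) -> float:
    """Monthly hours of use above which the flat fee is the cheaper option."""
    return flat_monthly / (hourly_rate * instances)


# Hypothetical always-on database server billed at an assumed $1.50 per hour.
always_on = usage_cost(hourly_rate=1.50, hours=HOURS_PER_MONTH)
threshold = break_even_hours(FLAT_MONTHLY, hourly_rate=1.50)

print(f"usage-based, always on: ${always_on:.2f} per month")
print(f"flat fee wins above {threshold:.0f} hours per month")
```

Under these assumed rates, a server that runs around the clock costs more on usage-based billing than on the flat fee, which fits workloads such as the always-on database described above. A workload that runs only a few hours a day would fall below the break-even point and favor hourly billing.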
Summary
When virtual machines just don’t provide the performance you need, you need to get physical! Bare-metal cloud runs without the overhead of a hypervisor and eliminates the resource contention inherent to multi-tenant environments. Moreover, bare metal can provide extra processing power and highly consistent disk and network I/O, ultimately providing the platform you need for your most demanding applications. Bare-metal cloud service gives customers the opportunity to enjoy the flexibility, provisioning, and on-demand billing advantages of cloud computing. It’s scalable and pay-as-you-go: all the power and advantages of the cloud without the overhead of a hypervisor.
References
- http://www.hpcwire.com/hpcwire/2013-08-27/the_benefits_of_bare-metal_clouds.html
- http://searchcloudcomputing.techtarget.com/news/2240203392/Bare-metal-servers-in-the-cloud-aid-performance-compliance
- http://www.cloudtweaks.com/2013/08/bare-metal-cloud-meeting-the-demand-for-high-performance-cloud-solutions/
- http://www.logicworks.net/blog/2013/06/big-data-in-thee-non-virtualized-cloud-myth-or-reality/
- http://www.techrepublic.com/blog/the-enterprise-cloud/how-the-cloud-fits-into-the-big-data-technology-stack/