IIS Blog

Software Defined Storage with HPE Apollo and Ceph or Gluster

Written by Jesse Barker | Mar 14, 2019 12:30:00 PM

Business needs continue to drive unrelenting growth in the data storage that IT organizations need to provision and manage. Rather than evolving existing storage solutions to keep pace with this growth, datacenter managers often find themselves needing to perform a “forklift upgrade”, “lift and shift”, or “rip-and-replace” - that is, completely removing old technology and replacing it with a new generation of data storage. Whatever you call it, the task is a periodic nightmare that has become part of our datacenter reality.

For many organizations, rip-and-replace data storage projects have been needed every three years for some time, as new generations of storage area network (SAN) technologies have been adopted to keep up with growing storage needs. This cycle is disruptive and expensive, leading many organizations to look for an alternative type of storage solution that can evolve incrementally to keep up with growth.

To meet this need, a new approach is gaining traction in organizations that need petabyte-scale solutions. Termed Software Defined Storage (SDS), these data storage solutions are based on commodity hardware and open source software. In this post, we look at how HPE Apollo systems, together with software like Ceph or Gluster, can provide scalable and cost-effective storage that eliminates the rip-and-replace technology cycle.

Traditional Data Storage is Expensive and Inflexible

Traditional data storage technology, like a storage area network (SAN), leverages special-purpose compute, networking, and storage hardware to optimize performance and keep up with ever-growing volumes of data. In addition, these solutions typically require proprietary software for management and administration. As a result, IT organizations face two ongoing challenges: technology obsolescence drives frequent replacements, and increasing complexity causes a crippling lack of flexibility.

Three-Year Replacement Cycles

Because data keeps growing, many organizations have been forced into an expensive cycle of ripping and replacing their storage solution every three years. This happens because traditional storage footprints are monolithic and difficult to expand incrementally. Organizations try to size solutions correctly, but they always face the danger of over- or undersizing. As a result, many companies find that they run out of space within three years and need to adopt new technology to get to the next level. They may need to upgrade switches and arrays to maintain performance at a larger scale. They may even need to switch vendors to take advantage of new technology. Today, there are plenty of companies that installed solutions designed for hundreds of terabytes. But as more and more organizations develop petabyte-scale needs, many are facing another replacement decision.

Not only is replacement expensive, but it is also very time consuming and labor intensive. For example, LUN mappings and spindle layouts need revalidation, and health checks need to be performed to ensure continuity of access to data and that sufficient IOPS are available. At petabyte scale, these health checks become too complicated and time consuming for all but the largest IT organizations.

Lack of Flexibility

Complexity and specialized technology are the reasons that SANs become inflexible at petabyte scale. SANs are complex beasts. To mimic the performance of locally attached storage, SANs are built on Fibre Channel technology (optical fiber) to maximize data transfer rates. The SAN is typically a dedicated network for data storage traffic, not accessible from the LAN. So, the SAN is designed, implemented, and administered independently of the LAN to meet the specific needs of data storage.

SANs have their own gear, including SAN switches (for optical fiber traffic) and SAN servers that provide access to the SAN and usually contain host bus adapters (HBAs) that interface between the server operating system (electrical) and the network (optical).

Within the SAN, logical unit numbers (LUNs) are used to identify and access data on the storage devices. Every device, or partition on a device, is assigned a LUN. LUNs are the primary tool for directing traffic and accessing data on the SAN. LUNs can be used to implement access control policies. In this way, SAN management is all about LUN administration.

One reason this architecture is brittle is that the computing resources are often virtualized, while the storage is not. While it is easy to move a VM from one node to another, the process of remapping the storage (LUNs) to the new location is not as automated. In fact, it can be a significant administrative task. As you scale to thousands of users and petabytes of data, LUN administration becomes impossibly complex and presents a major bottleneck.

Another reason for the lack of flexibility is that, in this architecture, you cannot simply add new technology on top of the existing infrastructure when you want to expand and modernize, since newer gear is often incompatible with existing hardware and software. Newer SAN switches may not work with your existing management software, and new disk arrays may not be fully supported by your existing switches. For example, if you want to introduce new features like NVRAM when you create new pools of data, you are often prevented from doing so by the monolithic nature of SAN implementations.

To address these problems, we would like to implement a virtualized pool of storage that is available to the VMs regardless of their physical host and without the complexities of LUN administration. Ideally, we’d like the storage pool to be self-healing with high availability failover. Perhaps most importantly, for scalability, we’d like to be able to grow by simply adding new nodes to our infrastructure - not having to face the “rip and replace” dilemma every three years.

To achieve these goals, many organizations are moving to a new architecture for storage built on industry-standard hardware and open source software. At IIS, we’ve found that the HPE Apollo platform is particularly well suited for such software-defined data storage solutions.

SDS Solutions Built on HPE Apollo Systems

HPE Apollo systems are designed to service large data workloads in a software-defined storage (SDS) environment. They can support petabyte-scale data requirements for enterprise datacenter deployments.

Together with open source software like Ceph or Gluster, HPE Apollo hardware can provide an SDS solution with significant benefits:

  • Scaling from a few hundred terabytes to petabytes - far beyond traditional storage solutions.
  • Lower up-front investment because you can start small and add capacity as needed.
  • Lower total cost of ownership (TCO) because the solution is built on industry-standard hardware and open source software.
  • Single SDS cluster that can support object, file, and block storage.

In a data storage solution based on HPE Apollo hardware, data is stored on local disks in the servers themselves, not in specialized storage appliances like those used in a SAN. As a result, scaling up your storage is as simple as adding more servers to the cluster.

A software-defined storage solution, powered by Ceph or Gluster, manages the distributed data storage and provides protocol support for connectivity with virtual machines, applications, and personal computers. It is possible to provide clients with block device (e.g., iSCSI), file sharing (e.g., NFS), and object storage (e.g., S3) support.
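
As a rough illustration of the object-storage path, the sketch below uses Python’s boto3 client against an S3-compatible endpoint such as the one a Ceph RADOS Gateway can expose. The endpoint URL, credentials, and bucket name are placeholders, not values from a real deployment.

```python
# Hypothetical sketch: storing and retrieving an object over the S3 API
# exposed by an SDS cluster (e.g., a Ceph RADOS Gateway endpoint).
# The endpoint, credentials, and bucket name below are placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgw.example.internal:8080",  # placeholder gateway endpoint
    aws_access_key_id="ACCESS_KEY",                   # placeholder credentials
    aws_secret_access_key="SECRET_KEY",
)

s3.create_bucket(Bucket="surveillance-archive")       # hypothetical bucket name
s3.put_object(
    Bucket="surveillance-archive",
    Key="cam01/2019-03-14.mp4",
    Body=b"...video bytes...",
)

obj = s3.get_object(Bucket="surveillance-archive", Key="cam01/2019-03-14.mp4")
print(len(obj["Body"].read()), "bytes retrieved")
```

Because the client only sees a standard S3 API, applications written this way do not care whether the bucket lives in a public cloud or on an on-premises SDS cluster.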

Software Defined Storage with HPE Apollo Servers

Once you have decided to move to industry-standard hardware for your storage needs, the next step is to determine whether Ceph or Gluster is the best software to meet your business application needs.

Software Defined Storage with Ceph and Gluster

Ceph and Gluster are the most widely adopted open source storage solutions for building SDS clusters. Both are backed by Red Hat. Because of this widespread adoption and industry support, IIS recommends choosing one of these two products. Which one you select will depend on your business needs and your IT organization’s preferences.

Technologically, there is a fundamental difference between these two solutions. Ceph is built on object storage technology; on top of the object store, Ceph provides block and file access protocols. Gluster is the opposite: it is built on file storage technology (e.g., XFS) and is primarily used to provide file sharing access to clients via NFS and SMB.

Let’s take a closer look at some of the technology differences between Ceph and Gluster.

Ceph

A Ceph Storage Cluster is built on the Reliable Autonomic Distributed Object Store (RADOS) - a software layer that manages object storage across the cluster and provides block, object, and file sharing services to client applications. For example, operating systems - and Kernel-based Virtual Machines (KVMs) - can mount software-defined block devices from the storage cluster through the RADOS Block Device (RBD) interface, or over iSCSI via a gateway. Likewise, applications can interact with Ceph’s object store using either the Amazon S3 or OpenStack Swift RESTful APIs. File sharing is provided by the Ceph file system (CephFS), a POSIX-compatible file system built on top of RADOS.
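
Applications can also talk to RADOS directly through Ceph’s Python bindings (the rados module shipped with the client packages). The minimal sketch below assumes a readable /etc/ceph/ceph.conf, a configured client keyring, and an existing pool named "demo-pool" (a placeholder name); it simply writes and reads back a single object.

```python
# Minimal sketch using Ceph's librados Python bindings.
# Assumes /etc/ceph/ceph.conf and a client keyring are in place, and that
# a pool named "demo-pool" (a placeholder name) has already been created.
import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
print("Connected to cluster:", cluster.get_fsid())

ioctx = cluster.open_ioctx("demo-pool")   # I/O context scoped to one pool
ioctx.write_full("hello-object", b"stored directly in RADOS")
print(ioctx.read("hello-object"))

ioctx.close()
cluster.shutdown()
```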

Ceph Architecture for Distributed Data Storage

The top half of this diagram shows how RADOS supports block, object, and file storage for clients. The bottom half illustrates how Ceph provides distributed storage on a cluster of HPE Apollo servers. Object Storage Device (OSD) daemons run on each HPE Apollo server - usually one OSD instance per physical hard disk drive - and provide data storage, replication, recovery, and rebalancing services. The OSDs also communicate with the monitor (MON) daemons, which run on separate servers away from the physical disk storage. The MONs maintain the cluster maps and state needed to manage the distributed data store.
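
The OSD/MON split is easy to see from a client: the same Python bindings used above can ask the monitors for cluster-wide state. This is a hedged sketch, again assuming the rados bindings and a working ceph.conf; the exact fields returned depend on the Ceph release.

```python
# Sketch: querying the monitors (MONs) for cluster-wide state via librados.
# Assumes the Ceph Python bindings and a readable /etc/ceph/ceph.conf.
import json
import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()

# Overall capacity as tracked by the cluster (sizes reported in KB).
print(cluster.get_cluster_stats())

# Ask the MONs for the OSD tree: which OSD daemons exist and on which hosts.
ret, out, err = cluster.mon_command(
    json.dumps({"prefix": "osd tree", "format": "json"}), b"")
if ret == 0:
    tree = json.loads(out)
    print("OSD/host nodes:", [node["name"] for node in tree["nodes"]])

cluster.shutdown()
```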

Gluster

Red Hat Gluster Storage is built on GlusterFS, an open source network file system for distributed data storage. GlusterFS aggregates various storage servers over a network into one large parallel network file system. GlusterFS is POSIX compatible and most commonly uses the XFS file system to store data on disk. Clients can access data stored in Gluster using industry-standard protocols such as Network File System (NFS) and Server Message Block (SMB/CIFS).
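
Because Gluster volumes are POSIX-compatible, client applications need no special SDK: once a volume is mounted (over NFS or the GlusterFS native client), ordinary file I/O just works. A minimal sketch, assuming the volume is mounted at the placeholder path /mnt/gluster-vol:

```python
# Sketch: ordinary POSIX file I/O against a mounted Gluster volume.
# Assumes the volume is already mounted (NFS or native client) at the
# placeholder path /mnt/gluster-vol; no Gluster-specific library is needed.
from pathlib import Path

volume = Path("/mnt/gluster-vol")
report_dir = volume / "quarterly-reports"
report_dir.mkdir(parents=True, exist_ok=True)

report = report_dir / "2019-Q1.csv"
report.write_text("region,revenue\nEMEA,1250000\n")

print(report.read_text())
print("Files in volume:", [p.name for p in report_dir.iterdir()])
```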

Unlike most distributed storage solutions, Gluster does not need dedicated metadata servers. Its no-metadata-server architecture uses an elastic hashing algorithm to locate files algorithmically.
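
To make the idea concrete, the toy sketch below hashes a file name and maps it onto one of a fixed set of bricks by hash range. This is a simplified illustration of hash-based placement, not Gluster’s actual distributed hash translator; the brick paths are made up.

```python
# Toy illustration of hash-based file placement (NOT Gluster's real algorithm):
# each brick owns a slice of the hash space, so a file's name alone determines
# which brick holds it -- no metadata-server lookup is needed.
import hashlib

BRICKS = ["server1:/bricks/b1", "server2:/bricks/b2", "server3:/bricks/b3"]

def brick_for(filename: str) -> str:
    # Hash the name into a 32-bit value, then map it onto a brick's hash range.
    h = int.from_bytes(hashlib.md5(filename.encode()).digest()[:4], "big")
    slice_size = (2 ** 32) // len(BRICKS)
    return BRICKS[min(h // slice_size, len(BRICKS) - 1)]

for name in ["report.pdf", "video01.mp4", "sensor-2019-03-14.log"]:
    print(f"{name} -> {brick_for(name)}")
```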

Gluster Architecture for Distributed File Storage

As shown in the diagram, Red Hat Gluster Storage is built on the GlusterFS network file system which, in turn, works with the native file system (e.g., XFS) on each HPE Apollo server to present a single, software-defined storage pool. Gluster presents logical volumes for file sharing over industry-standard protocols (e.g., NFS, SMB). Within the cluster, these logical volumes are composed of individual directories stored on various servers; these directories are referred to as “bricks”.

Which is Right for My Organization - Ceph or Gluster?

If your organization is familiar with, and comfortable managing, file sharing technology, then Gluster will be simple to adopt. Ceph administration requires an understanding of object storage technologies and may present a bit of a learning curve for some organizations.

Gluster has geo-replication features that provide excellent support for disaster recovery. Gluster is also favored in situations where the business needs to stream large media files with high throughput from RAID drives.

Ceph, on the other hand, may be preferred by organizations that need to support object and block storage in addition to file storage. Gluster is not recommended in production environments as an object store or block storage solution, although these capabilities are available and continue to improve.

Both Ceph and Gluster can work well for file sharing. The CephFS file system provides proven scalability and speed for file sharing. In addition, because Ceph is object-based and maintains dedicated metadata services, it may provide lower latency as you scale to petabytes. Gluster does not maintain centralized metadata, which simplifies administration but may cause higher latency at scale. Ceph may be a better fit for organizations that need low-latency storage that can quickly scale up or down. On the other hand, companies that need to store large amounts of data without much movement (i.e., deep storage) should probably consider Gluster.

Advantages of Software Defined Storage with HPE Apollo Systems

Whether you choose Ceph or Gluster for software-defined storage, there are many advantages to building your storage solution on an industry-standard hardware platform like HPE Apollo.

Avoiding Rip and Replace

Unlike with traditional, monolithic storage systems such as SANs, with an SDS solution you are no longer forced to rip and replace every three years to keep up with each technology cycle. Instead, you can scale out by simply adding new nodes to the existing system. Your new nodes can incorporate new technology and work within the same cluster as older nodes. For example, you could add a new pool of storage with NVRAM within the same cluster to support specific application needs. Likewise, older nodes can be retired gradually as they are replaced with newer technology.

A software-defined storage solution can also provide reliability benefits. It can be self-healing, so that hardware failures are handled through automatic failover. You also get predictable performance as you scale. These systems also provide a cost-effective alternative to public cloud storage.

Public Cloud Alternative

Public cloud data storage can be expensive. Companies often get lured in by inexpensive rates to store data, only to find that when they need to access the data, bandwidth usage and other charges rapidly escalate. For example, every time your organization needs to run a report or open a spreadsheet, you incur bandwidth and access charges from your cloud provider. In addition to the expense, regulatory or compliance requirements may preclude storing data with a public cloud provider.
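
A back-of-the-envelope sketch shows how access charges can come to dominate the bill for data that is read regularly. All of the rates and volumes below are hypothetical placeholders, not any provider’s actual pricing.

```python
# Back-of-the-envelope comparison of public-cloud charges for frequently
# accessed data. All rates and volumes are hypothetical placeholders.
STORAGE_PER_GB_MONTH = 0.02   # assumed $/GB-month at-rest rate
EGRESS_PER_GB = 0.09          # assumed $/GB egress (download) rate

data_tb = 200                 # assumed archive size, in TB
reads_per_month_tb = 20       # assumed TB pulled back each month for reports

storage_cost = data_tb * 1024 * STORAGE_PER_GB_MONTH
egress_cost = reads_per_month_tb * 1024 * EGRESS_PER_GB

print(f"At-rest storage:   ${storage_cost:,.0f}/month")
print(f"Egress for reports: ${egress_cost:,.0f}/month")
print(f"Total:              ${storage_cost + egress_cost:,.0f}/month")
```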

An SDS solution built on HPE Apollo servers provides an economical alternative to the public cloud. If your data is not simply sitting in “deep storage”, you will probably find that an on-premises SDS solution is less expensive.

What Applications Benefit from SDS Solutions?

A wide variety of business applications can benefit from SDS solutions. Applications that are a good fit usually combine large volumes of data, a need for economic storage, and a requirement for easy and inexpensive access to the data. Three specific applications that meet these criteria are streaming video, archival storage, and Internet of Things (IoT).

Streaming Video

The amount of video content that businesses need to manage has exploded in recent years, and the growth shows no signs of abating. Media distribution companies, for example, have enormous and growing content libraries. Public cloud storage is not necessarily a great option because on-demand consumption requires low-latency access, even for content that is rarely viewed. An SDS solution hosted on HPE Apollo systems is a great fit for this kind of streaming.

But it is not only media companies that need to manage streaming video these days. Any business that needs to manage surveillance video can find itself with a huge inventory of content that needs to be streamable on demand. For example, prisons, critical infrastructure (e.g., power generation), and certain government facilities must manage video from hundreds of cameras that record 24x7. All of that video needs to be accessible to support investigations, law enforcement, compliance, and a number of other use cases.

Archival Storage

Many businesses have a growing need for enormous amounts of archival storage. In many cases, archives are required by regulatory authorities. In other situations, internal compliance teams require archives to support audits. As business data volumes have exploded, the archives have grown exponentially as well. This vast amount of data may never be accessed, but it cannot be deleted because the organization never knows when an auditor or regulator will need to look at some part of it.

In this case, an SDS solution provides economical storage that can be easily scaled as needed. When you need more archival storage, you can simply plug in another node.

Internet of Things (IoT)

IoT applications generate enormous amounts of data. Consider that a factory with dozens of machines may have hundreds of IoT monitors each generating terabytes of data over the course of a year. Storage needs for such applications can grow into the petabytes within a year or two. The data needs to be available for access by analytics and machine learning applications.
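
A quick sanity check on that growth rate is shown below; the monitor count and per-device data rate are illustrative assumptions, not measurements from a specific deployment.

```python
# Illustrative arithmetic for IoT storage growth; the counts and rates
# below are assumptions, not figures from a real factory.
monitors = 300            # assumed IoT monitors across the factory floor
tb_per_monitor_year = 4   # assumed data each monitor generates per year, in TB

yearly_tb = monitors * tb_per_monitor_year
print(f"Yearly ingest: {yearly_tb} TB (~{yearly_tb / 1024:.1f} PB)")
print(f"After two years: ~{2 * yearly_tb / 1024:.1f} PB")
```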

SDS provides a great solution for these applications because of its low total cost of ownership and the flexibility to provide the object storage capabilities that applications need.

If your business is interested in learning more about how SDS could solve some of your data storage problems, IIS can help.

IIS - Your Partner for Software Defined Storage Solutions

International Integrated Solutions (IIS) is a managed service provider and system integrator with deep expertise in data storage solutions and HPE Apollo systems. IIS is a distinguished HPE partner, winning HPE Global Partner of the Year in 2016 and Arrow’s North American Reseller Partner of the Year in 2017.

Managing the transition from traditional storage to SDS can be complex, but the payoff is substantial and worth the time that your organization may need to invest in new skills. IIS can guide you through the process, providing systems integration and managed services. In particular, IIS can help with:

  • Solution Specification - selecting the hardware and software that best supports your applications.
  • Migration - helping you move data and applications from traditional storage to the new SDS solution.
  • Ongoing Support - providing a first line of support for your SDS and interfacing with HPE and Red Hat as needed.
  • Monitoring - tracking performance, running health checks, and providing ongoing maintenance.