Artificial intelligence (AI) technology has gone mainstream in a variety of business applications, from the voice recognition found in automated customer phone support to fraud detection systems that screen credit card transactions in real time. In the datacenter, AI technology can predict system failures before they happen, making preventative maintenance much more effective. For many organizations, a single hour of downtime costs over $100,000, and at larger companies the costs range into the millions of dollars. To help datacenters benefit from AI by preventing downtime, Hewlett Packard Enterprise (HPE) has been rolling out HPE InfoSight across its product line.
What is HPE InfoSight?
HPE InfoSight is a cloud-based intelligent analytics engine that receives data from sensors embedded in HPE hardware and software. It continuously learns from that data to refine a predictive model that can detect the conditions leading up to a system failure. When a system is in danger, InfoSight warns the organization so that it can apply preventative maintenance.
According to HPE, “Every second, HPE InfoSight analyzes and correlates millions of sensors from all of our globally deployed systems. HPE InfoSight continuously learns as it analyzes this data, making every system smarter and more reliable.”
The type of AI embedded in HPE InfoSight is known as machine learning (ML). ML systems can learn from large datasets without needing to be explicitly programmed. These systems detect patterns in the data and correlate them with specific outcomes, enabling predictions. An example is a loan default system. Such a system can look at thousands of variables across borrowers, from spending habits and geographic location to employment and macro-economic conditions, and calculate a probability of default. These predictive capabilities enable lenders to proactively reach out and offer help to borrowers before a default occurs.
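To make the loan default example concrete, here is a minimal sketch of how a trained model might turn borrower data into a probability. The feature names, weights, and bias are invented for illustration; a real system would learn them from historical data across far more variables.

```python
import math

# Hypothetical weights that a training step might have produced.
# Positive weights raise the default risk; negative weights lower it.
WEIGHTS = {
    "debt_to_income": 3.0,
    "missed_payments": 0.8,
    "years_employed": -0.2,
}
BIAS = -3.0


def default_probability(borrower: dict) -> float:
    """Logistic model: squash a weighted sum of features into (0, 1)."""
    score = BIAS + sum(WEIGHTS[name] * borrower[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


# Two made-up borrowers for comparison.
stable = {"debt_to_income": 0.2, "missed_payments": 0, "years_employed": 8}
at_risk = {"debt_to_income": 0.7, "missed_payments": 3, "years_employed": 1}

for label, borrower in (("stable", stable), ("at_risk", at_risk)):
    print(f"{label}: {default_probability(borrower):.2f}")
```

With these illustrative weights, the second borrower scores far higher than the first, which is the kind of signal a lender could use to reach out early.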
HPE InfoSight employs ML technology to find patterns in terabytes of data gathered from HPE systems around the world and correlate those patterns with system failures. Just as ML enables lenders to head off defaults, HPE InfoSight enables IT administrators to avoid downtime by applying preventative maintenance to systems that are at risk of failure.
Initially Developed for Storage Arrays by HPE Nimble
HPE acquired InfoSight along with the purchase of Nimble Storage in 2017. Nimble originally developed InfoSight for their storage arrays. HPE is extending InfoSight to additional products, including servers. To understand how HPE InfoSight uses machine learning to reduce operating costs, it is helpful to understand how Nimble first designed and deployed the technology.
Reliability has always been an important selling point for storage vendors. Nimble realized, however, that statistically, enterprise storage itself is very reliable and that when problems arise delivering data to applications, the cause is likely elsewhere in the infrastructure. They coined the term “app-data gap” to describe the infrastructure between applications and data where things can go wrong. The app-data gap includes the storage hardware, network, configuration, virtual machine, host, etc.
Nimble also understood that the root cause of problems in the app-data gap is very hard to determine because data centers are increasingly complex. They developed InfoSight to help with that diagnosis.
Proper diagnosis and predictive analysis require lots of data. To provide this, Nimble embedded sensors into the software functions of their storage array operating system, NimbleOS. Although referred to as sensors, these are not hardware devices, but rather software counters, timers, or probes. These sensors collect data relating to the performance of the storage array but also relating to all the other infrastructure components that interact with the storage array. Nimble built in sensors to collect data from the network fabric, the hypervisors, hosts, applications, and anything else that interacted with NimbleOS and could provide data. For example, when the Nimble Storage VMware vCenter plugin registered on a vCenter instance, the data collector in NimbleOS would start pulling data from that specific vCenter instance and send it along to InfoSight.
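As a rough illustration of what software counters and timers look like in practice, consider the sketch below. It is not NimbleOS code; the class, metric names, and the instrumented read function are all hypothetical.

```python
import json
import time
from collections import defaultdict
from contextlib import contextmanager


class SoftwareSensors:
    """Hypothetical in-process telemetry: counters and timers, no hardware."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.timings = defaultdict(list)

    def count(self, name: str, amount: int = 1) -> None:
        """Counter: tally how often an event occurs."""
        self.counters[name] += amount

    @contextmanager
    def timer(self, name: str):
        """Timer: record how long the wrapped block takes, in seconds."""
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timings[name].append(time.perf_counter() - start)

    def snapshot(self) -> str:
        """Serialize current readings, e.g. for upload to an analytics service."""
        return json.dumps({
            "counters": dict(self.counters),
            "timings": {k: list(v) for k, v in self.timings.items()},
        })


sensors = SoftwareSensors()


def read_block(block_id: int) -> bytes:
    """Stand-in for a storage read path instrumented with sensors."""
    sensors.count("read_requests")
    with sensors.timer("read_latency_s"):
        return bytes(4096)  # pretend to fetch a 4 KiB block


for block in range(3):
    read_block(block)

print(sensors.snapshot())
```

The point of the sketch is that instrumentation lives inside ordinary code paths, so every operation can contribute a data point without any dedicated hardware.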
As shown in the above diagram, data from VMs, hypervisors, network interface cards, switches, and storage arrays could all be collected by NimbleOS and uploaded to the InfoSight platform. Feeding all this data to the InfoSight machine learning system provided an unparalleled ability to diagnose app-data gap infrastructure problems. In March 2017, Dimitris Krekoukias (a Nimble Global Architect) provided this example in his blog:
“The customer was seeing huge latency spikes but didn’t call Nimble initially since the array GUI showed zero latency. They called every other vendor, nobody could see anything wrong and kept pointing the finger at the storage. After calling Nimble support, we were able to figure out that it was the server NIC that was causing issues (a NIC that had just passed all the server vendor’s diagnostics with flying colors). Replacing it fixed the problem.”
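The reasoning in this anecdote can be sketched as a comparison of the same metric measured at different layers of the stack. The following is a simplified illustration, not InfoSight's actual logic: the layer names, the readings, the threshold, and the assumption that latency accumulates along the path are all hypothetical.

```python
# Hypothetical I/O path, ordered from the storage array outward to the VM.
PATH = ["array", "switch", "server_nic", "hypervisor", "vm"]


def first_slow_layer(latency_ms: dict, jump_ms: float = 5.0):
    """Return the first layer where latency rises sharply over the previous one.

    Assumes latency is cumulative along the path, so a large jump between two
    adjacent layers suggests the problem was introduced at the later layer.
    Returns None if no jump exceeds the threshold.
    """
    previous = 0.0
    for layer in PATH:
        if latency_ms[layer] - previous > jump_ms:
            return layer
        previous = latency_ms[layer]
    return None


# Made-up readings in milliseconds: the array looks healthy, the VM does not.
readings = {"array": 0.4, "switch": 0.6, "server_nic": 48.0,
            "hypervisor": 48.5, "vm": 49.0}
print(first_slow_layer(readings))
```

A view limited to the array would report nothing wrong in this scenario; only data collected from every layer makes the jump visible.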
HPE’s Big Plans for InfoSight
When HPE acquired Nimble, they put together big plans for InfoSight. It was clear in 2017 that machine learning was the future of data center operations. HPE planned to take InfoSight beyond just HPE Nimble storage arrays and extend it to other storage and server products in the HPE family.
In July 2018, HPE announced InfoSight support for 3PAR storage arrays. The primary objectives of bringing 3PAR under the HPE InfoSight umbrella are to improve predictive maintenance capabilities, simplify diagnostics, and minimize the cost and disruption of outages.
HPE has continued to invest in the machine learning platform that is central to HPE InfoSight’s superior diagnostic and predictive maintenance capabilities. By pulling in data from 3PAR environments, the HPE InfoSight platform has more data to learn from. A machine learning platform gets “smarter” as it has more diverse data sets to learn from. So, as HPE expands the reach of InfoSight, the platform gets smarter and smarter.
Furthering this expansion, in November 2018, HPE announced InfoSight support for HPE ProLiant, Apollo, and Synergy servers. Moving beyond a storage-centric view of the data center, HPE InfoSight now gathers operational intelligence from sensors embedded in the server software stack. As a result, predictive analytics capabilities can alert IT managers to server component failures, as well as server security concerns like rogue login attempts.
According to the press release, “HPE InfoSight has extended predictive analytics and recommendation capabilities to HPE servers, enabling smarter, self-monitoring infrastructure”. For example, a recommendation engine is now available that leverages machine learning to detect patterns or signs of abnormality and provide instructions to eliminate performance bottlenecks on servers.
HPE InfoSight for Servers
Bringing InfoSight onto their server platforms represents a huge step forward in HPE’s plans to build an ecosystem around real-world data analytics across the entire data center. These days, IT staff are stretched thin, but their data centers are becoming more and more complex. No matter how experienced and well trained, IT administrators don’t have the bandwidth to process all the information needed to diagnose infrastructure issues and optimize configurations in these complex environments.
HPE aims to use InfoSight to provide IT administrators with insights, optimizations, and curated recommendations that free them from much of the labor required to run an infrastructure.
Today, HPE InfoSight collects data from existing server management software including the HPE Integrated Lights Out (iLO) server management software – particularly the Active Health System (AHS) logs and the Silicon Root of Trust configuration data. Using this server data to train the machine learning systems in InfoSight enables HPE to provide valuable server maintenance information that, according to a recent HPE blog post, includes:
- Predictive data analytics for server parts failure
- Data analytics for server security
- Global operational dashboard with a consolidated view of the status, performance, and health of your server infrastructure, including system information, server warranty, and support status
- Global wellness dashboard with a consolidated view of the health of the server infrastructure, including recommendations
- Recommendations to eliminate performance bottlenecks on servers
A Game Changer – Solving IT Challenges Today
HPE InfoSight is a game-changing differentiator for the company. The short-term goal is to bring all the existing operational data (e.g., iLO) onto the InfoSight platform. But, going forward, HPE will build sensors into the core architecture of every system they sell, bringing unprecedented data collection and machine learning power to bear on the operational challenges of the entire data center.
One goal is to increasingly automate the operation of the data center, freeing up resources to focus on the business applications that define today’s businesses. Modern companies like Uber and Netflix are literally built on sophisticated IT applications. But, increasingly, all businesses are built on applications. Consider the Internet of Things (IoT) technology that can monitor every aspect of a factory. Combined with automation technology, the factory is literally becoming an application that can be programmed and reconfigured to meet the changing demands of the marketplace.
Your company needs its technologists focused on those applications that define the core of your business, not diagnosing storage latency issues or resolving intermittent CPU spikes in virtual machines. HPE’s goal is to handle those issues for your organization, through an intelligent infrastructure instrumented by sensors and monitored by AI.
Part of this revolution is the ability to learn from data centers around the world and apply that intelligence to your problems. For example, suppose that some early adopters of a new hypervisor experience problems running against a software-defined storage solution running Ceph on a cluster of HPE Apollo servers. Because these customers are running HPE InfoSight, the resolution to this problem is learned by the AI platform. A few months down the road, when your organization begins installing the same hypervisor, your HPE InfoSight system will alert you to the configuration changes needed to support the new technology in your HPE Apollo storage cluster.
While a scenario like this may seem far-off from the day-to-day concerns you face in your data center, the successful application of HPE InfoSight to routine break-fix problems is helping a wide variety of organizations save money today. Consider ticketing systems and the typical L1, L2, L3 support engineers required to keep them running. HPE InfoSight environments require very few L1 or L2 engineers. That is because HPE InfoSight can usually diagnose the nature of a problem and auto-create an L3 ticket.
Experience with HPE Nimble storage arrays indicates that 99% of cases are auto-created (i.e., InfoSight detects the problem and reports it before it is noticed by a person) and 86% of those cases are auto-resolved. For unresolved cases, an L3 engineer is assigned. A single L3 engineer can support over 200 arrays. HPE is aiming for similar productivity gains across the data center as additional technology is brought under the InfoSight umbrella.
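To see what these percentages mean for staffing, the short sketch below applies them to a batch of cases. The batch size of 1,000 is arbitrary, and the sketch assumes that any case not auto-created and auto-resolved ends up with an engineer.

```python
def support_load(total_cases: int,
                 auto_created_rate: float = 0.99,
                 auto_resolved_rate: float = 0.86) -> dict:
    """Split a batch of cases using the rates quoted above.

    auto_resolved_rate applies only to the auto-created cases.
    """
    auto_created = total_cases * auto_created_rate
    auto_resolved = auto_created * auto_resolved_rate
    # Assumption: everything not auto-resolved needs a human engineer.
    needs_engineer = total_cases - auto_resolved
    return {
        "auto_created": round(auto_created, 1),
        "auto_resolved": round(auto_resolved, 1),
        "needs_engineer": round(needs_engineer, 1),
    }


print(support_load(1000))
```

For 1,000 cases this works out to 990 auto-created and about 851 auto-resolved, leaving roughly 149 for engineers, or about 15% of the total.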
AI is Reducing Data Center Operating Expenses Today
AI technology will have a revolutionary impact on data center operations. It promises to enable complex environments to be managed with fewer engineering staff and to shift the focus of IT management from infrastructure to business applications. This is not a theoretical prediction. The proof of concept is here today with HPE InfoSight. As HPE rolls this solution out beyond storage arrays, to servers, networking, and more, the productivity gains promise to be significant.
IIS - Your Partner for HPE InfoSight and More
International Integrated Solutions (IIS) is a managed service provider and system integrator with deep expertise in HPE InfoSight and the entire HPE product line. IIS is a distinguished HPE partner, winning HPE Global Partner of the Year in 2016 and Arrow’s North American Reseller Partner of the Year in 2017.
As your service provider, IIS brings deep expertise and experience. Having solved a myriad of problems for hundreds of customers, they bring a holistic view of the datacenter. IIS can help with:
- Sizing - providing an assessment methodology and tools to spec out your workloads.
- Predictive Analytics - leveraging InfoSight’s embedded predictive analytics, monitoring, and management capabilities to deliver rich dashboards, alerts, governance, and custom reports.
- Managed Services - providing dedicated resources to monitor, manage and maintain your infrastructure so that your team can focus on business priorities.