Reducing OPEX Hinges on Highly Autonomous Systems
Grappling with nonstop growth in bandwidth consumption and the proliferation of network-based applications and services, communications service providers, cloud-native webscalers and enterprise IT organizations are all keen to manage increasing capital and operating expenses as they strive to satisfy the needs of their customers.
On the CAPEX side, Moore’s law, merchant silicon, off-the-shelf hardware, software-defined networking and open source software are helping to contain infrastructure acquisition costs. However, the OPEX side remains problematic, driven by increasing costs in network operations, service operations, IT operations, security operations, field operations and customer care.
Operating expenses are dominated by personnel costs, so utilizing people’s time more efficiently translates into reduced OPEX. New automation tools are helping operators streamline tedious and time-consuming configuration workflows. Big Data analytics can break down operational silos and rapidly generate actionable intelligence shared across operations and customer care teams.
However, breakthrough gains in operational efficiency hinge on the promise of machine learning and AI to provide the cognitive core for highly autonomous systems – self-driving networks and infrastructure in which skilled personnel are engaged hands-on as an exception rather than the rule.
Moving Beyond Statistical Analysis
So how do we get there from here? What challenges do operators need to overcome to realize this promise?
Start by recognizing that it’s still early in the game. Existing machine learning deployments are mainly applications of statistical analysis to large time series data sets. Computers are vastly superior to the human mind at numerical analysis, but statistical algorithms are mainly confined to rapidly detecting outliers and anomalies relative to a nominal baseline.
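To make the idea of detecting outliers against a nominal baseline concrete, here is a minimal Python sketch. It assumes a simple trailing-window z-score test, which is only one of many possible statistical approaches. The function name, window size, threshold and sample readings are all hypothetical and chosen for illustration.

```python
from statistics import mean, stdev

def find_outliers(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing baseline.

    Each sample is compared with the mean and standard deviation of the
    preceding `window` samples, which stand in for the nominal baseline.
    """
    outliers = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            # Flat baseline: any change at all counts as a deviation.
            if samples[i] != mu:
                outliers.append(i)
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            outliers.append(i)
    return outliers

# Hypothetical link utilization readings (percent), one per polling interval.
utilization = [41, 43, 42, 44, 42, 43, 41, 95, 42, 44]
print(find_outliers(utilization))  # [7]
```

Only the spike at index 7 is flagged. Note that once the spike enters the trailing window it inflates the baseline for the next few samples, a small reminder that this kind of test can only judge new data in terms of what it has already seen.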
Machine learning and AI are more compelling when applied to multiple, diverse data sets in environments where conditions are constantly changing. How can a machine immediately recognize a behavioral anomaly the very first time it is observed and when historical data may not be relevant? Statistical algorithms rely on past data to detect patterns, so a statistics-based machine learning system will be blind to new events that cannot be understood in terms of past events. Here lies the realm of true AI, involving complex analysis of multi-dimensional data using non-statistical algorithms.
AI Involves Advanced Mathematics
Moving beyond statistical analysis involves employing advanced mathematical models based on algebraic topology and differential geometry. To illustrate the need, let’s consider the widespread operational challenge of managing and prioritizing the incessant flood of events and alarms.
At one time of the day, Alarm A correlates directly with Alarm B but does not appear to relate to Alarms C or D. However, at another time, Alarm A often occurs just after Alarm C and before Alarm D but does not correlate with Alarm B. Humans possess the cognitive abilities to sort out what is happening and determine the root cause of the problem, but a statistics-based machine learning engine would be at a loss.
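The shifting relationships in this scenario can be illustrated with a short Python sketch. It is not the topological or geometric modeling discussed in this post. It only counts which alarms follow which within a short gap, first pooled over the whole day and then split by time of day. The event data, the 60-second gap and the morning/evening split are all hypothetical.

```python
from collections import Counter, defaultdict

def follow_counts(events, max_gap=60):
    """Count how often one alarm is followed by a different alarm
    within max_gap seconds.

    `events` is a list of (timestamp_seconds, alarm_name) sorted by time.
    Returns a Counter keyed by (earlier_alarm, later_alarm).
    """
    counts = Counter()
    for i, (t_first, first) in enumerate(events):
        for t_next, nxt in events[i + 1:]:
            if t_next - t_first > max_gap:
                break
            if nxt != first:
                counts[(first, nxt)] += 1
    return counts

def follow_counts_by_period(events, period_of, max_gap=60):
    """Group events by period_of(timestamp), then count followers per group."""
    grouped = defaultdict(list)
    for t, name in events:
        grouped[period_of(t)].append((t, name))
    return {p: follow_counts(evts, max_gap) for p, evts in grouped.items()}

# Hypothetical alarm log: seconds since midnight, alarm name.
events = [
    (100, "A"), (110, "B"), (500, "A"), (520, "B"),
    (50000, "C"), (50010, "A"), (50020, "D"),
    (60000, "C"), (60015, "A"), (60030, "D"),
]

def period_of(t):
    return "morning" if t < 43200 else "evening"

pooled = follow_counts(events)
by_period = follow_counts_by_period(events, period_of)
print(pooled[("A", "B")], pooled[("C", "A")], pooled[("A", "D")])  # 2 2 2
print(by_period["morning"][("A", "B")], by_period["evening"][("A", "B")])  # 2 0
```

Pooled over the day, Alarm A appears equally related to B, C and D. Split by period, A pairs only with B in the morning and sits between C and D in the evening. The split works here only because the period boundary was supplied by hand, which is exactly the contextual judgment a fixed statistical baseline lacks.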
In environments where the operational state is highly dynamic, AI based on advanced mathematical models can emulate human insight to identify and track these changing relationships between events over time, enabling operators to differentiate anomalies from normal conditions and rapidly take the necessary corrective action.
Cognitive Core for Closed-Loop Automation
The human mind excels at detecting patterns and differentiating between normal and abnormal behaviors. Developing autonomous operational systems involves creating AI that can match the cognitive abilities of skilled operators while processing the vast amounts of data overwhelming human operators. AI based on advanced mathematical techniques provides a contextually aware, cognitive core for closed-loop automation, driven by telemetry data and real-time analytics. Properly engineered systems can automatically detect problems, identify the cause and take corrective action without operator intervention.
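The detect, identify and correct cycle described above can be sketched as a single control-loop step in Python. This is an illustrative skeleton only: the function names, telemetry fields, thresholds and the playbook of corrective actions are hypothetical stand-ins for whatever analytics and automation layers a real system would provide.

```python
def closed_loop_step(telemetry, detect, diagnose, playbook):
    """Run one detect -> diagnose -> act cycle.

    Returns a (status, detail) tuple:
      ("ok", None)            no problem detected
      ("remediated", cause)   a known cause was handled automatically
      ("escalated", cause)    no automated action exists; hand off to a person
    """
    problem = detect(telemetry)
    if problem is None:
        return ("ok", None)
    cause = diagnose(problem, telemetry)
    action = playbook.get(cause)
    if action is None:
        return ("escalated", cause)
    action(telemetry)
    return ("remediated", cause)

# Hypothetical stand-ins for the analytics and automation layers.
def detect(telemetry):
    return "high_latency" if telemetry["latency_ms"] > 100 else None

def diagnose(problem, telemetry):
    return "congested_link" if telemetry["link_util"] > 0.9 else "unknown"

actions_taken = []
playbook = {"congested_link": lambda t: actions_taken.append("reroute_traffic")}

print(closed_loop_step({"latency_ms": 250, "link_util": 0.95}, detect, diagnose, playbook))
print(closed_loop_step({"latency_ms": 250, "link_util": 0.40}, detect, diagnose, playbook))
print(closed_loop_step({"latency_ms": 20, "link_util": 0.40}, detect, diagnose, playbook))
```

The escalation branch reflects the goal stated earlier in this post: operators are engaged hands-on as the exception, when the system cannot map a diagnosed cause to an automated action.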
Partner Wisely to Avoid the Long and Winding Road
Due to the relationships and dependencies between data sets in large-scale operational environments, domain-specific knowledge is required to factor in the context in which machine learning and AI algorithms will be applied. Data scientists and engineers must first curate the data to be analyzed and then choose the appropriate algorithms that will be used to extract insights from the data for each specific use case. In order to streamline and automate workflows, development teams must combine the relevant operational experience with expertise applying advanced machine learning and AI techniques.
In my previous blog post, I noted that Big Data talent is in short supply across all industries. Unfortunately, machine learning and AI talent is even more scarce. The webscaler giants have hired many of the skilled practitioners as well as entire research teams from leading academic institutions. At one point in 2017, Amazon had 1,178 AI jobs posted and Google listed 573. The AI skills crisis has been widely reported in the business press.
As a consequence, organizations developing AI-based operational intelligence solutions in-house face a long and winding road, not only to recruit and retain the necessary talent but also to cultivate the expertise needed to apply the relevant advanced AI techniques to each operational domain.
A more realistic approach is for IT managers and operations teams to partner with a solution provider that has extensive expertise in machine learning and AI combined with a proven track record applying the technology in large-scale, complex operational environments. AI is a rapidly evolving, multi-faceted technology, and off-the-shelf open source and cloud-based solutions can’t be utilized directly in these environments without extensive customization, if at all.
Choose your partner wisely in order to avoid taking the long and winding road to deploying cognitive systems for highly autonomous operations. You’ll save time and money while more rapidly achieving the goal of reducing OPEX.