The most hyped areas of cloud computing include cloud native development, the use of artificial intelligence and machine learning, big data, and other areas of cloud deployments that people enjoy discussing when they talk about the future of cloud. In contrast, CloudOps (cloud operations – the management of a complex cloud deployment) is a topic that rarely gets that kind of attention.
Enterprises that didn’t pay much attention to CloudOps are now running into a wall as they deploy more applications and data stores to the cloud. This includes net-new cloud native applications as well as lift-and-shift applications.
The Two Main Challenges: Where We’re Failing
The difficulty with CloudOps comes from two different directions, which feed into one another:
Growing Complexity
First, complexity slows or stops cloud-based workload deployments and operations. Informally, this is called the “complexity wall.” This is the result of several years of rapid movement to public cloud providers, including multicloud.
The problem is that this movement typically took place without regard for common services such as security and monitoring systems. Everyone provisioned whatever they thought was best-of-breed for their solution. As a result, the number of technology stacks grew to the point that no enterprise can afford the skills and tooling needed to operate them. In other words, the complexity made these systems too costly and too risky to operate in their current state.
Consider the report Bridging the Cloud Transformation Gap, which evaluates the findings of Aptum’s Global Cloud Impact Study. Some 62 percent of respondents cited complexity and abundance of choice as a hindrance when planning a digital transformation that leverages cloud.
Lack of Planning
Second, CloudOps should have specific planning in place before the migrations and development take place. The typical enterprise moves to the cloud with no idea as to how they will operate those applications and handle the data once they are there. The lack of planning also causes too much complexity due to a lack of ongoing coordination.
Part of the solution is simple: Add planning as part of the process to move to or build on the cloud. You also need to create standard procedures about how things should operate, and design net-new and migrated systems in ways that result in better operations. This means building things such as management APIs that include more self-reporting data from the cloud-based systems, bespoke for CloudOps.
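As a rough illustration of what self-reporting data might look like, here is a minimal Python sketch of a status payload that a service could expose through a management API. The schema, the field names, and the `build_report` and `summarize_status` helpers are all assumptions made for this example, not a standard or any vendor's format.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class OpsReport:
    """Self-reported operational state for one service (hypothetical schema)."""
    service: str
    version: str
    status: str  # "ok", "degraded", or "down"
    dependencies: dict = field(default_factory=dict)  # dependency name -> status
    metrics: dict = field(default_factory=dict)       # metric name -> number


def summarize_status(dependencies: dict) -> str:
    """Derive an overall status from the worst dependency status."""
    states = set(dependencies.values())
    if "down" in states:
        return "down"
    if "degraded" in states:
        return "degraded"
    return "ok"


def build_report(service: str, version: str, dependencies: dict, metrics: dict) -> str:
    """Serialize the report as JSON, ready to return from a management endpoint."""
    report = OpsReport(
        service=service,
        version=version,
        status=summarize_status(dependencies),
        dependencies=dependencies,
        metrics=metrics,
    )
    return json.dumps(asdict(report), sort_keys=True)


if __name__ == "__main__":
    # Example values are invented for illustration.
    print(build_report("orders", "1.4.2", {"db": "ok", "cache": "degraded"}, {"queue_depth": 12}))
```

The point of the design is that every net-new or migrated system answers the same operational questions in the same shape, so a CloudOps tool can poll one format instead of learning each stack separately.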
Trends Affecting the Future of CloudOps
Some of the more progressive enterprise pioneers who are moving to or building on clouds are just now hitting the complexity wall. Most enterprises with workload migrations below 20 percent do not yet understand that this limitation exists.
However, the cloud operations tool market has seen a steady rise in interest in CloudOps-related tools. These tools include cloud management and monitoring, AIOps (AI operations), FinOps (financial operations), and API and resource governance.
Moreover, there is more focus on CloudOps-related skills, including the ability to operate the tools just listed. But more important are the expert staffers who understand how to create and implement operations models (ops models), including the right mix of technology and people to achieve the best cost efficiency and the lowest risk. There are still too many open CloudOps positions chasing too few candidates, and this will likely get worse in 2022 and 2023.
A few key trends are emerging right now that will affect the future of CloudOps. They include:
- Lack of available CloudOps skills. As mentioned above, this includes both technology and CloudOps tool SMEs, and even people who can perform more traditional ops-related work such as system backups, runtime monitoring, and simple fixes such as restarting cloud-based systems as needed.
- Static budget. Typically, no more funding is available for CloudOps than was available for traditional operations.
- Rising security threats. CloudOps teams face a growing number of threats, such as ransomware attacks and denial-of-service attacks.
- Rise of the use of atypical platforms. Examples include high-performance computing and even use of quantum systems. Each requires specialized operations and different sets of skills.
- The shift to a utility consumption model. We no longer focus on data centers and our own hardware and software, but on virtual systems that we’ll never see. We pay for this hardware and software as a service, so there are different rules to measure cash burn and different metrics to gauge success.
CloudOps technology and tool vendors are reacting to these trends in very different ways, and their solutions to the challenges listed above vary just as widely. Some will prevail in the future, while others will fall by the wayside. The market will call the balls and strikes over the next 2-3 years.
Possible Solutions: What’s CloudOps’s Future?
It’s not difficult to pick the most likely events that will occur around CloudOps, both as a concept and as a set of technologies. Consider the emerging problems and envision how enterprises and tool providers will approach and solve those problems.
That is easy to say, but maybe not so easy to do. To give you a leg up, here are a few events you’ll probably see in the future, including some you probably didn’t see coming.
We Face the Operational Complexity Around Cloud Deployments
While many understand that operational complexity exists, most don’t have the approach or the technology needed to solve the problems it causes. Count on this becoming a discipline in its own right, with tools aimed squarely at solving operational complexity.
CloudOps Tools Evolve Around the Use of AI and Automation
These tools exist today, but few provide the depth of automation that CloudOps will require in the future. While automation is loosely coupled with these tools today, as is AI, they will have to morph to solve an increased array of challenging problems.
Self-healing through AI and automation will be more in demand, and not just for simple problems such as restarting a database to work around a poorly performing cache. It will also need to deal with complex failures that have difficult-to-diagnose root causes and as many as a hundred different components that must be fixed in specific ways and in specific sequences.
This will stretch the limits of automation and AI, and both must be employed if CloudOps staffers are to get ahead of the operational issues that are clearly in their future.
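To make the idea of fixing components "in specific sequences" concrete, here is a minimal Python sketch that orders failed components so that each one is repaired only after the components it depends on. It assumes the dependency map is already known and uses the standard library's `graphlib` module (Python 3.9 or later). The `plan_remediation` and `run_remediation` functions and the component names are hypothetical, and a real self-healing tool would also have to handle diagnosis, retries, and rollback.

```python
from graphlib import TopologicalSorter


def plan_remediation(depends_on: dict[str, set[str]], failed: set[str]) -> list[str]:
    """Order failed components so each one comes after the failed components it depends on."""
    # Keep only the failed components and the dependency edges between them.
    graph = {c: depends_on.get(c, set()) & failed for c in sorted(failed)}
    # static_order() yields dependencies before dependents; it raises CycleError on a cycle.
    return list(TopologicalSorter(graph).static_order())


def run_remediation(plan: list[str], fix) -> list[str]:
    """Apply fix(component) in order and stop at the first failure.

    Returns the components that were repaired successfully.
    """
    repaired = []
    for component in plan:
        if not fix(component):
            break  # do not repair dependents on top of a component that is still broken
        repaired.append(component)
    return repaired


if __name__ == "__main__":
    # Invented example: api depends on db and cache, cache depends on db.
    deps = {"api": {"db", "cache"}, "cache": {"db"}, "db": set(), "web": {"api"}}
    plan = plan_remediation(deps, {"api", "db", "cache"})
    print(plan)
    print(run_remediation(plan, lambda component: True))
```

Stopping at the first failed fix is one possible policy; it trades speed for safety by never restarting a component whose dependencies are still unhealthy.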
CloudOps Skills Become More Widely Defined
If you think of CloudOps today, you’ll likely think of an employee who stares at a screen watching for graphics to turn from green to red. However, the future of CloudOps will be much more complex and include new specialized roles. These roles will include CloudOps staff who specialize in DevOps-related operations, those who focus on database operations, others who focus on security operations, and so forth.
The issue is that these new roles will come with their own sets of challenges that are unique to each area. The way you deal with operations needs to be unique as well. Specialized tools will also follow, with more CloudOps tools focused on specific domains rather than on holistic CloudOps.
If you’re looking for gainful employment for the next 10 to 15 years, this is a good area to focus on. As we move past the 50 percent mark of enterprise systems running in the cloud, operational challenges will follow. These challenges need to be solved by specialized tools and staff, or cloud will be stopped dead in its tracks.