It’s been estimated that in the next three to five years the number of connected devices will reach a staggering 50 billion globally.
Even if that number sounds extreme, it’s undeniable that advancements in silicon technology (e.g. shrinking of computing and sensor components) and the evolution of 5G networks will drive the rise of capable edge devices and create much more relevant use cases.
Before technology can support that scenario, several associated challenges need to be identified and addressed.
The recent OpenDev conference aimed to raise awareness and foster collaboration in this domain. One of its working sessions covered autonomous workload management at the edge, focusing on the technical constraints, requirements and operational considerations that apply when an application or workload has to be orchestrated at the edge. The main assumption is that several edge computing use cases (e.g. micro-edge/mobile edge such as set-top boxes, terminals etc.) will demand highly autonomous behavior due to connectivity constraints, latency, cost etc. The scale of edge infrastructure might also drive this autonomous behavior, since centrally managing all these edge devices will be an enormous task. To this end, several operational considerations were discussed, as summarized below.
Workload orchestration
When it comes to autonomous workload management at the edge, effective orchestration is the most important issue. Whether the workload runs on bare metal, virtual machines or application containers, the need for automation and software-defined methodologies is apparent. In the NFVi world there’s currently a clear tendency towards model-driven declarative approaches such as TOSCA for addressing orchestration. The expectation is that the edge platform should include a functional component responsible not only for the orchestration and management of resources (e.g. VMs or containers) but also for the running applications or VNFs. Such an entity takes care of runtime healing and scaling as well as provisioning (edge-side or centrally triggered). Even if the goal is autonomous workload orchestration, it’s expected that some kind of central orchestration entity (or regional manager) will still keep a reference of the edge state or drive provisioning of the edge. Fully autonomous behavior of a mesh-like edge network seems a bit futuristic and difficult to achieve in the short term, at least at large scale.
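To make the declarative idea concrete, here is a minimal Python sketch of an edge-side reconciliation step: it compares a declared model with what is actually running and derives healing and scaling actions. This is not TOSCA and not any particular orchestrator's API; `WorkloadSpec`, `reconcile` and the workload names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadSpec:
    """Hypothetical declarative description of one workload."""
    name: str
    replicas: int


def reconcile(desired, running):
    """Return the actions that move `running` toward `desired`.

    desired: list of WorkloadSpec (the declared model)
    running: dict mapping workload name -> number of healthy instances
    """
    actions = []
    for spec in desired:
        healthy = running.get(spec.name, 0)
        if healthy < spec.replicas:
            # Healing or scale-out: start the missing instances.
            actions.append(("start", spec.name, spec.replicas - healthy))
        elif healthy > spec.replicas:
            # Scale-in: stop the surplus instances.
            actions.append(("stop", spec.name, healthy - spec.replicas))
    wanted = {spec.name for spec in desired}
    for name in sorted(running):
        # Anything running that the model does not declare is removed.
        if name not in wanted and running[name] > 0:
            actions.append(("stop", name, running[name]))
    return actions


desired = [WorkloadSpec("firewall-vnf", 2), WorkloadSpec("video-cache", 1)]
running = {"firewall-vnf": 1, "legacy-probe": 1}
for action in reconcile(desired, running):
    print(action)
```

Because the function only needs the declared model and local observations, such a loop could keep running while the edge is disconnected; a central or regional manager would only have to push a new model when provisioning changes.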
State and policy management
Autonomous workload orchestration also implies autonomous state management. To orchestrate effectively, it is not enough to capture the state of the hosting platform (e.g. the virtual machine or container); the state of the services or applications has to be monitored too. Today, most state management operations are handled by orchestration components (at the resource level) and by the application/VNF vendors themselves. However, there is no combined view of the state, which results in pretty primitive fault handling: when the state of a workload is faulty, the whole resource is restarted. In addition, Service Function Chaining (SFC) and the micro-services paradigm introduce a composable state concept that potentially has to be considered. State abstraction is also important: a regional orchestrator might not need to keep the state of all components of an SFC but just an abstracted state of the whole SFC. The edge orchestrator, on the other hand, must know the state of each service in the chain. Policy enforcement should follow the same pattern as state propagation. All in all, these points suggest the need for a more capable state management structure that can tackle the new requirements. Whether the existing orchestrators should be responsible for these features or new modules and/or protocols have to be invented is up for discussion.
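The difference between the detailed edge view and the abstracted regional view can be sketched in a few lines of Python. The three-level `State` enum, the "worst state wins" aggregation rule and the service names are assumptions made for this example, not something prescribed by the session.

```python
from enum import IntEnum


class State(IntEnum):
    """Hypothetical health levels, ordered from best to worst."""
    HEALTHY = 0
    DEGRADED = 1
    FAILED = 2


def chain_state(service_states):
    """Abstracted state of a whole chain: the worst state of any member."""
    return max(service_states.values(), default=State.HEALTHY)


def services_to_heal(service_states):
    """Edge-side view: only the failed services, not the whole resource."""
    return sorted(
        name for name, state in service_states.items() if state == State.FAILED
    )


# The edge orchestrator tracks every service in the chain.
edge_view = {
    "firewall": State.HEALTHY,
    "nat": State.FAILED,
    "dpi": State.DEGRADED,
}

# The regional orchestrator only receives one abstracted value per chain.
regional_view = {"sfc-1": chain_state(edge_view)}

print(regional_view["sfc-1"].name)
print(services_to_heal(edge_view))
```

With a combined view like this, fault handling can target the single failed service in the chain instead of restarting the whole resource.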
Managing packages of local repositories
Autonomous operation of workload orchestration at the edge requires local repositories of images and software packages. Assuming that the connection between the edge device and the core network is either unreliable or very thin, only control plane operations should be transferred over the air. The suggestion is that orchestration systems should explore cached or local repositories that are synchronized on demand by some central component. Multicast features for pushing updates to the edge repositories should be considered too, especially if the number of edge devices increases exponentially.
Even if most of the issues and ideas discussed here are neither new nor unique, we can’t assume that the technologies and systems developed for data center operations can directly solve edge computing use cases. Perhaps it’s best to look at the problem with a fresh perspective and try to architect an edge computing platform that could serve all use cases (from micro to large edge), leveraging existing technology where and when possible but also investing in new technology.
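One way to keep a thin link from being saturated is to exchange only a small manifest of content digests and transfer just the packages that differ. The Python sketch below assumes such a manifest exists (package name mapped to a SHA-256 digest); the manifest format, `plan_sync` and the package names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content digest used to detect changed packages."""
    return hashlib.sha256(data).hexdigest()


def plan_sync(central_manifest, local_manifest):
    """Compare two {package name: digest} manifests.

    Returns (to_fetch, to_evict): packages that are missing or stale in
    the local repository, and packages the central side no longer lists.
    """
    to_fetch = sorted(
        name
        for name, wanted in central_manifest.items()
        if local_manifest.get(name) != wanted
    )
    to_evict = sorted(
        name for name in local_manifest if name not in central_manifest
    )
    return to_fetch, to_evict


central = {
    "vnf-firewall.img": digest(b"firewall v2"),
    "video-cache.img": digest(b"cache v1"),
}
local = {
    "vnf-firewall.img": digest(b"firewall v1"),  # stale copy
    "video-cache.img": digest(b"cache v1"),      # already up to date
    "old-probe.img": digest(b"probe v1"),        # no longer published
}

to_fetch, to_evict = plan_sync(central, local)
print("fetch:", to_fetch)
print("evict:", to_evict)
```

Only the manifest crosses the link on every sync; the bulky image data moves when a digest actually changes, and the same plan could be served from a multicast push instead of a pull.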
About the author
Gregory Katsaros is a services, platforms and cloud computing expert who has been working for several years in research and development activities and projects related to the adoption of such technologies and transformation of the industry. Katsaros has special interest in services and resources orchestration as well as network function virtualization and software defined networking technologies.
He holds a Ph.D. from the National Technical University of Athens on “Resource monitoring and management in Service Oriented Infrastructures and Cloud Computing” and has been contributing to research communities with research and code.
In the last few years, he’s been leading projects related to services and cloud computing, distributed systems, interoperability, orchestration, SDN, NFV transformation and more. He’s interested in transforming the telco and enterprise sector by embracing automation and orchestration technologies, as well as investigating edge computing.
- Paths to autonomous workload management at the edge - September 21, 2017