Like much of the cloud computing world, open infrastructure has seen an explosion of use cases around AI and ML. In partnership with our clients at Graphcore, StackHPC’s John Garbutt was thrilled to present our recent work on implementing open infrastructure support for Graphcore’s groundbreaking Intelligence Processing Unit (IPU).
Graphcore’s radical approach to AI has resulted in a processor capable of unprecedented performance in AI model training and inference. With StackHPC’s assistance, that powerful AI resource can now be harnessed and provided as a service by OpenStack clouds around the world.
The full keynote session is online here:
Presenting our Recent Work on Magnum
For modern compute platforms, OpenStack and Kubernetes make a natural combination. OpenStack’s Magnum service provides easy deployment and management of Kubernetes clusters on OpenStack infrastructure. However, the project has struggled lately to keep up with the pace of development in the Kubernetes ecosystem. A significant reason for that has been the overhead of developing and maintaining the implementation of the current Kubernetes driver.
Step forward not one project but two, to implement alternative Magnum Kubernetes drivers based on ClusterAPI. StackHPC’s Matt Pryor and Mohammed Naser from VEXXHOST presented recent development work on two ClusterAPI drivers that significantly improve the agility of the Magnum project and promise to dramatically improve the experience for LOKI users.
StackHPC’s work is being developed as a community effort through the four opens, and will be described in detail in a forthcoming blog post here.
Azimuth: Accessible Compute Platforms on LOKI
StackHPC’s experience in compute platforms for research has culminated in the creation and development of the Azimuth cloud portal, which was first presented at OpenInfra Berlin 2022 and covered in a previous StackHPC blog post.
In Vancouver, John Garbutt and Matt Pryor presented an overview of the Azimuth cloud portal, including new features, and details of the roadmap for ongoing development.
Kolla User Forum
StackHPC’s Michal Nasiadka is the Project Technical Lead (PTL) for the Kolla project, which includes Kolla-Ansible and Kayobe. Michal led a well-attended forum session for user feedback on the Kolla project.
The Kolla development roadmap was presented and discussed, including support for new host OSes (Rocky Linux 9 and Ubuntu 22.04), plans for adding Podman support, improvements to RabbitMQ configuration, and the OpenStack services proposed for deprecation in Kolla.
Users also brought interesting feedback and feature requests in several areas:
- Performance enhancements, including using Mitogen.
- A request for better Kolla support for the new RBAC implementation in OpenStack projects.
- Pain points relating to unreliable delivery of log messages.
- The possibility of deploying multiple RabbitMQ instances.
Kubernetes and OpenStack Forum
Kubernetes plays an increasing role, both as a platform workload and as a method for orchestrating the OpenStack services themselves.
StackHPC participated in a well-attended discussion about Kubernetes, ClusterAPI and the OpenStack Cloud Controller Manager (OCCM), as used for the Azimuth cloud portal.
Hypervisor Upgrades Forum
In collaboration with our colleagues at G-Research, John Garbutt ran a well-attended and lively forum on hypervisor upgrades for production clouds. The event, which had something of the dynamic of a self-help group, was a constructive discussion about best practices for maintaining OpenStack compute hypervisors without causing disruption to production services.
Announcing OpenInfra Europe
StackHPC is proud to be an inaugural supporter of OpenInfra Europe. This regional grouping has been created to advocate for European issues, including governance, sovereignty, data protection and common interests.
OpenInfra Europe is intended to complement, rather than replace, existing interest groups such as the Scientific SIG.
The Scientific SIG at Vancouver
The Scientific SIG met as part of the PTG, which ran as a companion event to the summit. About 25 attendees, including both familiar faces and new joiners, participated in a session led by Stig Telfer (StackHPC’s CTO) and Martial Michel.
As usual, the social aspect of the gathering was the most fruitful, and global connections were created and cemented.
What a great summit!
Reposted with permission from StackHPC; the original article can be found here.