This new episode of the “Large Scale OpenStack” show will discuss operators’ tricks and tools.


OpenInfra Live is a weekly hour-long interactive show streaming on the OpenInfra YouTube channel on Thursdays at 15:00 UTC (9:00 AM CT). Episodes feature OpenInfra release updates, user stories, community meetings, and other open infrastructure stories.

In a live and direct discussion, our guests will reveal tricks and homegrown tools that they use in the trenches to keep their OpenStack clusters ticking like clockwork.

Enjoyed this week’s episode and want to hear more about OpenInfra Live? Let us know what other topics or conversations you want to hear from the OpenInfra community this year, and help us to program OpenInfra Live! If you are running OpenStack at scale or helping your customers overcome the challenges discussed in this episode, join the OpenInfra Foundation to help guide OpenStack software development and to support the global community.

Belmiro Moreira, cloud architect at CERN, led the show & tell where each organization had the opportunity to present one of the tools or tricks they use to make operating large scale infrastructure easier.

Infomaniak, a Silver Member of the OpenInfra Foundation

Axel Jacquet and Thomas Goirand, cloud administrators at Infomaniak, kicked off the series of tools and tricks with some background on their OpenStack usage and live demos.

Infomaniak, a fully Swiss company, has been using OpenStack since Grizzly. It has two data centers in Switzerland, and a third will be online soon. Its public cloud opened a few months ago with basic OpenStack services including Glance, Keystone, Nova, Cinder, Neutron and more. Magnum, Designate and Manila are on the roadmap.

In this episode, they presented two tools that address two common infrastructure management problems: HA virtual router failover and instance connectivity checks.

OVH, a Silver Member of the OpenInfra Foundation

Adrien Pensart, site reliability engineer at OVH Cloud, explained that OVH’s OpenStack public cloud is composed of 20 regions, so one of the challenges is managing all of the different parts. Pensart introduced several tools that help unify the different workflows and infrastructure components, and gave a demo of the host maintenance workflow used when upgrading a host.

LINE

Gene Kuo, Infrastructure Software Engineer at LINE, introduced several tricks they use to manage their OpenStack clusters:

  • Serial Hypervisor Upgrade: The team splits hypervisors into different groups and upgrades the groups one after another. This prevents many Neutron agents or Nova compute services from restarting at the same time, which in turn reduces the load on the RabbitMQ clusters during upgrades.
  • Retired Flag in Keystone: This eases the handover of credentials when employees leave the organization, preventing outages in production services.
  • Dynamic Hypervisor Disabling: A periodic script scans the hypervisor nodes to determine the load on each one. If the load is too high, the hypervisor is temporarily disabled so that no new virtual machines are scheduled on it, avoiding noisy neighbor issues.
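
The grouping and load-check ideas above can be illustrated with a minimal Python sketch. This is not LINE's tooling: the function names, the round-robin grouping, the host names, and the 0.9 load threshold are all assumptions made for illustration, and the load figures are plain numbers rather than values read from a real monitoring system.

```python
from typing import Dict, List


def split_into_groups(hypervisors: List[str], group_count: int) -> List[List[str]]:
    """Distribute hypervisors round-robin into at most group_count groups.

    Upgrading one group at a time limits how many agents restart together.
    (Hypothetical helper; the grouping policy is an assumption.)
    """
    if group_count < 1:
        raise ValueError("group_count must be at least 1")
    groups = [hypervisors[i::group_count] for i in range(group_count)]
    # Drop empty groups when there are fewer hosts than groups.
    return [group for group in groups if group]


def select_overloaded(load_by_host: Dict[str, float], threshold: float) -> List[str]:
    """Return the hosts whose load is strictly above the threshold, sorted by name.

    A periodic job could disable scheduling on these hosts and re-enable
    them once the load drops. (Hypothetical helper; threshold is an assumption.)
    """
    return sorted(host for host, load in load_by_host.items() if load > threshold)


if __name__ == "__main__":
    hosts = [f"hv{i:02d}" for i in range(1, 8)]
    for number, group in enumerate(split_into_groups(hosts, 3), start=1):
        print(f"upgrade group {number}: {group}")

    sample_load = {"hv01": 0.42, "hv02": 0.97, "hv03": 0.91}
    print("disable candidates:", select_overloaded(sample_load, threshold=0.9))
```

The sketch only decides which hosts to act on. One common way to stop new instances from landing on a host is to disable its nova-compute service, for example with `openstack compute service set --disable`, though the episode summary does not say which mechanism LINE uses.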

Workday

Shatadru Bandyopadhyay, DevOps Engineer at Workday, introduced Cloudmap, a tool that provides centralized reporting and search across all VMs in all of Workday’s data centers. The tool is designed to provide a simple visualization of the cloud resources that is still detailed enough to answer most common user queries and requirements. This addresses the difficulty of relying on Horizon to visualize every cluster at Workday’s scale.

CERN, an Associate Member of the OpenInfra Foundation

Moreira concluded the episode by sharing CERN’s tools. He introduced Cloud by Numbers, a tool that has appeared on a previous episode of OpenInfra Live and that gives a glimpse into the scale of the CERN cloud environment.

He also introduced Migration Cycle, a tool that provides infrastructure integration and orchestration for live migrations. He said that use cases for this tool include hardware repairs, hardware retirement, and Linux kernel upgrades on compute nodes.

Next Episode on #OpenInfraLive

Launched by the OpenInfra Foundation in April of this year, OpenInfra Live has assembled the global community weekly to hear the latest trends and updates straight from the community itself. Join us as we go through our OpenInfra Live highlights reel.

Tune in on Thursday, December 16 at 15:00 UTC (9:00 AM CT) to watch this #OpenInfraLive episode: #BestOf OpenInfra Live.

You can watch this episode live on YouTube, LinkedIn and Facebook. The recording of OpenInfra Live will be posted on OpenStack WeChat after each live stream!

Like the show? Join the community!

Catch up on previous OpenInfra Live episodes on the OpenInfra Foundation YouTube channel, and subscribe to the Foundation’s email communications to hear more OpenInfra updates!

Allison Price