Identify potential bottlenecks in the Neutron API, plan performance testing, and more


Last year, an informal collaboration began between OpenStack Neutron project members working on performance (the neutron-perf team, whose work has been merged into the core team's activities since the last gathering in China) and members of the Discovery Open Science Initiative, a project aiming at a fully decentralized IaaS.

The main goal of this collaboration is to identify potential bottlenecks in the Neutron API and to plan performance testing. Since 2016, the Discovery initiative has built up testing expertise using OpenStack as its reference middleware.

To this end, they developed EnOS, a tool to deploy, customise, and benchmark OpenStack with a focus on reproducible experiments. Leveraging the OpenStack Kolla Ansible project, EnOS enables running performance stress workloads on OpenStack for postmortem analysis. It supports large-scale platforms such as Grid’5000, the French testbed dedicated to research, as well as Vagrant configurations for local testing during development. Results of some experiment campaigns have been shared with the OpenStack community at several summits. During the Berlin summit, Miguel Lavalle (former Neutron PTL) and Discovery members exchanged ideas on ongoing work related to the performance and scalability challenges of the Neutron project.

Since that initial conversation, Neutron members have defined example Rally scenarios to isolate their potential concerns. They also implemented a way to access detailed internal information, focusing on message exchanges and database access. At the same time, Discovery members have released new versions of EnOS that take the Neutron feedback into account.
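To give a feel for what a Rally scenario definition looks like, here is a minimal sketch that builds a task file for Rally's `NeutronNetworks.create_and_list_networks` scenario. It is only an illustration, not one of the scenarios the Neutron team actually wrote: the helper name `build_rally_task` is hypothetical, and the iteration count, concurrency, and tenant settings are assumed values chosen for the example.

```python
import json


def build_rally_task(times=100, concurrency=10):
    """Return a Rally task (v1 format) as a plain dict.

    `times` and `concurrency` are illustrative defaults, not values
    taken from the Neutron or Discovery teams.
    """
    return {
        # Scenario shipped with the Rally OpenStack plugins: create
        # networks, then list them.
        "NeutronNetworks.create_and_list_networks": [
            {
                "args": {"network_create_args": {}},
                # Constant runner: `times` iterations in total,
                # `concurrency` of them running in parallel.
                "runner": {
                    "type": "constant",
                    "times": times,
                    "concurrency": concurrency,
                },
                "context": {
                    # Temporary tenant and user created for the run.
                    "users": {"tenants": 1, "users_per_tenant": 1},
                    # -1 lifts the network quota so it does not
                    # become the limiting factor.
                    "quotas": {"neutron": {"network": -1}},
                },
                # Mark the run as failed if any iteration fails.
                "sla": {"failure_rate": {"max": 0}},
            }
        ]
    }


if __name__ == "__main__":
    # Print the task as JSON so it can be saved to a file.
    print(json.dumps(build_rally_task(), indent=2))
```

Saving the printed JSON to a file and passing it to `rally task start` would run the scenario against a deployment that Rally already knows about.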

After some preliminary tests, it appears that Neutron and Discovery members are ready to take the next step and combine their knowledge and resources. This may enable a new way to benchmark Neutron. So far, testing in the OpenStack projects has mainly served to validate specifications and avoid regressions. Together, the two groups may now define a large-scale test plan involving hundreds of nodes and covering different scenarios such as network partition, fault aggregation, and stress load. The main idea is to identify the major limits of the whole infrastructure and to evaluate the modifications required to ensure the evolution of OpenStack.

These activities illustrate a bit of history, the intention to reuse features already available in different OpenStack components, and the relevance of the OpenStack gatherings. Collaborations like this one let diverse actors meet and work together across the traditional boundaries between academic and industrial communities.