What is NVMe over fabrics, anyway?
The evolution of the NVMe interface protocol has been a boon to SSD-based storage arrays, allowing SSDs to deliver higher performance and lower latency for data access. These benefits are extended further by the NVMe over Fabrics (NVMe-oF) network protocol, which retains NVMe's characteristics over a network fabric while the storage array is accessed remotely. Let's take a look at how.
While leveraging the NVMe protocol with storage arrays built from high-speed NAND and SSDs delivers low latency locally, latency creeps in when NVMe-based storage arrays are accessed through shared storage or a storage area network (SAN). In a SAN, data must be transferred between the host (initiator) and the NVMe-enabled storage array (target) over Ethernet, RDMA technologies (iWARP/RoCE) or Fibre Channel, and latency is introduced by the translation of SCSI commands into NVMe commands while the data is transported. To address this bottleneck, NVM Express introduced the NVMe over Fabrics protocol to replace iSCSI as the storage networking protocol. With it, the benefits of NVMe are carried onto the network fabric in a SAN-style architecture, producing a complete end-to-end NVMe-based storage model that is highly efficient for new-age workloads. NVMe-oF supports all available network fabric technologies, such as RDMA (RoCE, iWARP), Fibre Channel (FC-NVMe), InfiniBand, future fabrics and the Intel Omni-Path architecture.
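To make that concrete, here is a minimal sketch of how a host (initiator) might discover and connect to an NVMe-oF subsystem over an RDMA fabric using the standard nvme-cli tool driven from Python; the target address, port and subsystem NQN are hypothetical placeholders, not values from this article.

```python
import subprocess

# Hypothetical NVMe-oF target reachable over an RDMA-capable NIC.
TARGET_ADDR = "192.0.2.10"   # placeholder IP of the storage array (target)
TARGET_PORT = "4420"         # conventional NVMe-oF service port
SUBSYS_NQN = "nqn.2018-09.org.example:nvme-subsystem-1"  # hypothetical NQN

def run(cmd):
    """Run a command and return its stdout, raising if it fails."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

# Ask the target's discovery service which NVMe subsystems it exports.
print(run(["nvme", "discover", "-t", "rdma", "-a", TARGET_ADDR, "-s", TARGET_PORT]))

# Connect to one subsystem; its namespaces then appear on the initiator
# as regular local /dev/nvmeXnY block devices.
run(["nvme", "connect", "-t", "rdma",
     "-n", SUBSYS_NQN, "-a", TARGET_ADDR, "-s", TARGET_PORT])
```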
NVMe over fabrics and OpenStack
OpenStack consists of a library of open-source projects for the centralized management of data center operations, and it provides an ideal environment in which to implement an efficient, high-throughput NVMe-based storage model. OpenStack Nova and Cinder are the components used in the proposed NVMe-oF with OpenStack solution, which consists of creating a Cinder NVMe-oF target driver and integrating it with OpenStack Nova.
OpenStack Cinder is the block storage service project for OpenStack deployments, mainly used to create services that provide persistent storage to cloud-based applications. It exposes APIs that let users consume storage resources without disclosing where that storage is located.
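As a rough illustration of that API surface, the sketch below uses the openstacksdk cloud layer to ask Cinder for a volume; the cloud name and the "nvme-of" volume type are assumptions made for the example, not part of the proposed solution.

```python
import openstack

# Credentials are read from clouds.yaml or the environment; "mycloud" is a placeholder.
conn = openstack.connect(cloud="mycloud")

# Ask Cinder for a 10 GB volume. The "nvme-of" volume type is hypothetical and
# would map to a backend configured with an NVMe-oF target driver.
volume = conn.create_volume(size=10, name="nvme-demo-vol", volume_type="nvme-of")
print(volume.id, volume.status)
```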
OpenStack Nova is the OpenStack component that provides on-demand access to compute resources such as virtual machines, containers and bare metal servers. In the NVMe-oF with OpenStack solution, Nova is responsible for attaching NVMe volumes to VMs.
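Continuing the sketch above, attaching that volume to a running instance goes through Nova, which coordinates with the os-brick connector on the compute node to establish the NVMe-oF session so the volume shows up inside the guest; the instance name below is again a placeholder.

```python
# Continuation of the previous sketch: conn and volume already exist.
server = conn.get_server("nvme-demo-vm")   # hypothetical Nova instance name

# Nova asks Cinder to export the volume and has the compute node's os-brick
# connector perform the NVMe-oF connect, then attaches the device to the VM.
conn.attach_volume(server, volume)
```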
Support for NVMe-oF in OpenStack has been available since the Rocky release. The proposed solution requires RDMA NICs and supports the kernel initiator and kernel target.
NVMe-oF targets supported
Based on the proposed solution above, there are two choices for implementing NVMe-oF with OpenStack. The first is the kernel NVMe-oF target driver, supported from the OpenStack Rocky release onward. The second is Intel's SPDK-based implementation, consisting of the SPDK NVMe-oF target driver and the SPDK LVOL (logical volume manager) back end, which is anticipated in the upcoming OpenStack Stein release.
Kernel NVMe-oF Target (supported from the OpenStack Rocky release)
This implementation consists of support for the kernel target and kernel initiator. However, the kernel-based NVMe-oF target has limitations in the number of IOPS it can deliver per CPU core, and it suffers from added latency due to CPU interrupts, the many system calls needed to read data and the time taken to transfer data between threads.
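For reference, enabling the kernel target in Cinder is largely a configuration exercise. The snippet below is a rough sketch of what an LVM backend section using the kernel NVMe-oF target (nvmet) over RDMA might look like; the IP address is a placeholder, and option names should be verified against the documentation for the release you deploy.

```
[lvm-nvmet]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_group = cinder-volumes
# Use the kernel NVMe-oF target (nvmet) instead of an iSCSI target helper,
# exporting volumes over NVMe/RDMA.
target_helper = nvmet
target_protocol = nvmet_rdma
# Placeholder address of the RDMA-capable NIC and the NVMe-oF service port.
target_ip_address = 192.0.2.10
target_port = 4420
```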
SPDK NVMe-oF Target (expected in the upcoming OpenStack Stein release)
Why SPDK?
The SPDK architecture achieves high NVMe-oF performance with OpenStack by moving all of the necessary application drivers to user space (out of the kernel), operating in polled mode rather than interrupt mode and processing I/O locklessly (avoiding the CPU cycles spent synchronizing data between threads).
Let's take a look at what that means.
In the SPDK implementation, the storage drivers that carry out storage operations such as storing, updating and deleting data are isolated from the kernel space where general-purpose computing processes run. This isolation saves the time that would otherwise be spent on kernel processing and lets CPU cycles go toward executing the storage drivers in user space. It also avoids interrupts and lock contention between the storage drivers and other general-purpose drivers in kernel space.
In the typical I/O model, an application requests read/write access to data and then waits for the I/O cycle to complete. In polled mode, once the application places a request for data access, it moves on to other work and comes back after a defined interval to check whether the earlier request has completed. This reduces latency and processing overhead and further improves the efficiency of I/O operations.
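The difference is easy to see in a toy example. The sketch below is a deliberately simplified, self-contained Python illustration of blocking versus polled completion handling; it uses a thread and a queue purely as stand-ins for a device and its completion queue (SPDK itself is written in C and polls real hardware queues).

```python
import queue
import threading
import time

completions = queue.Queue()

def fake_device_write(request_id):
    """Stand-in for a storage device: 'completes' an I/O after a short delay."""
    time.sleep(0.001)
    completions.put(request_id)

def submit(request_id):
    """Submit an I/O without waiting for it (asynchronous submission)."""
    threading.Thread(target=fake_device_write, args=(request_id,)).start()

# Blocking (interrupt-style) model: submit one I/O and sit idle until it finishes.
submit("req-blocking")
completions.get()

# Polled model: submit the I/O, keep doing useful work and periodically
# check the completion queue instead of blocking on it.
submit("req-polled")
done = False
while not done:
    # ... do other useful work here ...
    try:
        completions.get_nowait()
        done = True
    except queue.Empty:
        pass  # not complete yet; poll again on the next pass
```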
To summarize, SPDK is designed specifically to extract performance from non-volatile media. It contains tools and libraries for scalable, efficient storage applications built on user-space, polled-mode components that enable millions of I/Os per second per core. The SPDK architecture is a set of open-source, BSD-licensed building blocks optimized for getting high throughput out of the latest generation of CPUs and SSDs.
Why an SPDK NVMe-oF Target?
According to a performance benchmarking report of NVMe-oF using SPDK:
- Throughput scales up and latency decreases almost linearly with the scaling of SPDK NVMe-oF target and initiator I/O cores.
- The SPDK NVMe-oF target delivered up to 7.3x more IOPS/core than the Linux kernel NVMe-oF target when running a 4K 100 percent random write workload with an increasing number of connections (16) per NVMe-oF subsystem.
- The SPDK NVMe-oF initiator is 3x faster than the kernel NVMe-oF initiator on 50GbE with a null bdev-based back end.
- SPDK reduces NVMe-oF software overhead by up to 10x.
- SPDK saturates 8 NVMe SSDs with a single CPU core.
SPDK NVMe-oF implementation
This is the first implementation of NVMe-oF integrated with OpenStack (Cinder and Nova); it leverages the SPDK NVMe-oF target driver and an SPDK LVOL (logical volume manager)-based software-defined storage back end, providing a high-performance alternative to kernel LVM and the kernel NVMe-oF target.
Compared with the kernel-based implementation, SPDK reduces NVMe-oF software overhead and yields higher throughput and performance. Let's see how this will be added in the upcoming OpenStack Stein release.
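To give a feel for what sits beneath that back end, the sketch below drives SPDK's JSON-RPC interface through the scripts/rpc.py helper shipped with SPDK to carve a logical volume out of an LVOL store and export it over NVMe-oF; in the integrated solution, the Cinder SPDK driver issues equivalent RPC calls on your behalf. RPC method names and flags have changed across SPDK releases, and the bdev name, NQN and addresses here are placeholders, so treat this purely as an illustration.

```python
import subprocess

RPC = ["python3", "scripts/rpc.py"]  # JSON-RPC helper from the SPDK source tree
NQN = "nqn.2019-04.org.example:cinder-lvol-1"   # hypothetical subsystem NQN

def rpc(*args):
    """Forward one RPC call to the local SPDK target process."""
    subprocess.run([*RPC, *args], check=True)

# Create an LVOL store on an existing NVMe bdev ("Nvme0n1" is a placeholder),
# then carve a 1 GiB logical volume out of it.
rpc("bdev_lvol_create_lvstore", "Nvme0n1", "lvs0")
rpc("bdev_lvol_create", "-l", "lvs0", "lvol0", "1024")

# Export the logical volume over NVMe-oF on an RDMA transport.
rpc("nvmf_create_transport", "-t", "RDMA")
rpc("nvmf_create_subsystem", NQN, "-a")          # -a: allow any host to connect
rpc("nvmf_subsystem_add_ns", NQN, "lvs0/lvol0")
rpc("nvmf_subsystem_add_listener", NQN, "-t", "rdma",
    "-a", "192.0.2.10", "-s", "4420")
```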
This article is based on a session at OpenStack Summit 2018 Vancouver, "OpenStack and NVMe-over-Fabrics: Network-connected SSDs with local performance," presented by Tushar Gohad (Intel), Moshe Levi (Mellanox) and Ivan Kolodyazhny (Mirantis). You can catch the demo on video here.
About the author
Sagar Nangare, a digital strategist at Calsoft Inc., is a marketing professional with over seven years of experience in strategic consulting, content marketing and digital marketing. He's an expert in technology domains like security, networking, cloud, virtualization, storage and IoT.
This post first appeared on the Calsoft blog. Superuser is always interested in community content; get in touch: editorATopenstack.org