Configuring SR-IOV

Overview

The Single Root I/O Virtualization (SR-IOV) specification allows splitting a single physical port (called physical function or PF) into multiple virtual ports known as virtual functions or VFs.

VFs as well as PFs can be assigned to Openstack instances, bypassing the host network stack and significantly improving performance while reducing host resource usage.

Hardware offloading, also known as switchdev mode, is a mechanism that enables Open vSwitch (OVS) flows to be offloaded directly to the Virtual Function (VF). This approach combines the performance advantages of SR-IOV with the flexibility of OVS, allowing VFs to be added to the OVS bridge. As a result, it supports the full OVN stack and facilitates overlay tenant networks that rely on tunneling protocols such as VXLAN or Geneve.

Prerequisites

1. Make sure that the network adapters support SR-IOV and optionally the hardware offloading feature.

  1. Enable the SR-IOV and VT-d settings in BIOS.

3. Add the following kernel parameters, allowing VFs to be exposed to virtual machines.

intel_iommu=on iommu=pt pci=realloc pci=assign-busses
  1. Optional: enable switchdev mode.

In this example, we are configuring a Mellanox ConnectX-6 device. Please check your vendor documentation for other network adapter models.

ifname=enp3s0f0np0
pciaddr=0000:02:00.0
sudo devlink dev eswitch set pci/${pciaddr} mode switchdev
sudo ethtool -K $ifname hw-tc-offload on

Verify that hardware offloading is enabled:

sudo devlink dev eswitch show pci/${pciaddr}
# output #
pci/0000:03:00.0: mode switchdev inline-mode none encap-mode basic
sudo ethtool -k $ifname | grep hw-tc-offload
# output #
hw-tc-offload: on
  1. Initialize the desired number of VFs

echo '4' | sudo tee /sys/class/net/${ifname}/device/sriov_numvfs

This setting can be added to the Netplan configuration in order to survive reboots:

enp3s0f0np0:
  virtual-function-count: 4
  embedded-switch-mode: "switchdev"

Verify that the VFs were created successfully:

$ sudo lshw -class net -businfo
# output #
Bus info          Device          Class          Description
============================================================
pci@0000:04:00.0  enp3s0f0np0     network        MT2892 Family [ConnectX-6 Dx]
pci@0000:04:00.1  enp3s0f1np1     network        MT2892 Family [ConnectX-6 Dx]
pci@0000:04:00.2  enp4s0f0v0      network        ConnectX Family mlx5Gen Virtual Function
pci@0000:04:00.3  enp4s0f0v1      network        ConnectX Family mlx5Gen Virtual Function
pci@0000:04:00.4  enp4s0f0v2      network        ConnectX Family mlx5Gen Virtual Function
pci@0000:04:00.5  enp4s0f0v3      network        ConnectX Family mlx5Gen Virtual Function
pci@0000:01:00.0  eno1            network        82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:01:00.1  eno2            network        82599ES 10-Gigabit SFI/SFP+ Network Connection
pci@0000:08:00.0  eno3            network        I350 Gigabit Network Connection
pci@0000:08:00.1  eno4            network        I350 Gigabit Network Connection
pci@0000:82:00.0  enp130s0f0      network        Ethernet 10G 2P X520 Adapter
pci@0000:82:00.1  enp130s0f1      network        Ethernet 10G 2P X520 Adapter
pci@0000:04:00.0  enp4s0f0r0      network        Ethernet interface
pci@0000:04:00.0  enp4s0f0r1      network        Ethernet interface
pci@0000:04:00.0  enp4s0f0r2      network        Ethernet interface
pci@0000:04:00.0  enp4s0f0r3      network        Ethernet interface

If hardware offloading is enabled, additional representor functions may be automatically created for each VF.

  1. Ensure that you’re using snapd 2.71 or later.

Manual mode

During bootstrap, Canonical Openstack will determine if there are any SR-IOV capable network devices.

If so, the user will be asked to specify which SR-IOV devices should be exposed to Openstack tenants and the name of the corresponding Neutron physical network, also known as physnet.

No physnet should be specified when using hardware offloading and overlay networks such as VXLAN or Geneve.

Example:

sunbeam cluster bootstrap --role control,compute
# output #
# ...
Configure SR-IOV? [y/n] (n): y
Found the following SR-IOV capable devices:
  [ ] Mellanox Technologies MT2892 Family [ConnectX-6 Dx] (enp3s0f0np0) [physnet: None]
  [ ] Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (eno1) [physnet: None]
  [ ] Mellanox Technologies MT2892 Family [ConnectX-6 Dx] (enp3s0f1np1) [physnet: None]
  [ ] Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (eno2) [physnet: None]
  [ ] Intel Corporation Ethernet 10G 2P X520 Adapter (enp130s0f0) [physnet: None]
  [ ] Intel Corporation Ethernet 10G 2P X520 Adapter (enp130s0f1) [physnet: None]
Add network adapter to PCI whitelist? Mellanox Technologies MT2892 Family [ConnectX-6 Dx] (enp3s0f0np0) [y/n] (n): y
Specify the physical network for Mellanox Technologies MT2892 Family [ConnectX-6 Dx] (enp3s0f0np0) or pass 'no-physnet' if using hardware offloading with overlay networks: no-physnet
Add network adapter to PCI whitelist? Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (eno1) [y/n] (n):
Add network adapter to PCI whitelist? Mellanox Technologies MT2892 Family [ConnectX-6 Dx] (enp3s0f1np1) [y/n] (n):
Add network adapter to PCI whitelist? Intel Corporation 82599ES 10-Gigabit SFI/SFP+ Network Connection (eno2) [y/n] (n):
Add network adapter to PCI whitelist? Intel Corporation Ethernet 10G 2P X520 Adapter (enp130s0f0) [y/n] (n):
Add network adapter to PCI whitelist? Intel Corporation Ethernet 10G 2P X520 Adapter (enp130s0f1) [y/n] (n):

All the VFs that belong to the specified SR-IOV PFs will be added to the Nova PCI whitelist, in addition to the devices that may have been specified in the manifest file.

The openstack-hypervisor snap determines if the specified adapters support hardware offloading. If not, it will configure the Neutron SR-IOV agent to handle these ports.

The SR-IOV configuration may be subsequently modified using the following command:

sunbeam configure sriov

MAAS mode

When deploying Canonical Openstack in MAAS mode, set one of the following network interface tags to expose SR-IOV adapters:

sriov:<physnet>
sriov:no-physnet

Use Curtin scripts to prepare the prerequisite SR-IOV configuration as described in the previous section. Also ensure that MAAS is configured to apply the necessary kernel parameters.

Similarly to manual mode, the SR-IOV configuration can be modified using the following command:

sunbeam configure sriov

Manifest configuration

Arbitrary PCI devices may be whitelisted through the Canonical Openstack manifest. Apart from SR-IOV network adapters, this can also include vGPUs or FPGAs.

Example:

pci:
  device_specs:
    - address: "0000:1b:00.0"
      vendor_id: "8086"
      product_id: "1563"
      physical_network: "physnet1"
  excluded_devices:
    r740-dc1-ceph.maas:
      - "0000:19:00.0"
      - "0000:19:00.1"
      - "0000:1b:00.1"
      - "0000:5e:00.0"
  aliases:
    - vendor_id: "8086"
      product_id: "1563"
      device_type: type-PF
      name: "intel-pf"

The device spec filters are highly flexible and can contain PCI address wildcards or PCI vendor/product IDs. See the Nova device spec reference for more details.

The device whitelist will be applied to all the compute nodes. If needed, use the exclusion list to define per-node lists of devices that should not be exposed to Openstack instances.

Configured PCI device aliases may be requested through Nova flavor extra specs.

Attaching SR-IOV VFs to Openstack instances

Launch a demo instance:

sunbeam launch --name test

Create a port with --vnic-type=direct:

openstack port create --network demo-network --vnic-type=direct direct-port

Attach the port:

openstack server add port test direct-port

Check the port status:

openstack port show direct-port
# output #
+-------------------------+----------------------------------------------------------------------------------+
| Field                   | Value                                                                            |
+-------------------------+----------------------------------------------------------------------------------+
| admin_state_up          | UP                                                                               |
| allowed_address_pairs   |                                                                                  |
| binding_host_id         | None                                                                             |
| binding_profile         | None                                                                             |
| binding_vif_details     | None                                                                             |
| binding_vif_type        | None                                                                             |
| binding_vnic_type       | direct                                                                           |
| created_at              | 2025-07-29T09:37:26Z                                                             |
| data_plane_status       | None                                                                             |
| description             |                                                                                  |
| device_id               | 1dd2e5a2-011c-4ab2-abb0-b21ee6b355a8                                             |
| device_owner            | compute:nova                                                                     |
| device_profile          | None                                                                             |
| dns_assignment          | fqdn='test.cloud.sunbeam.internal.', hostname='test', ip_address='192.168.0.227' |
| dns_domain              |                                                                                  |
| dns_name                | test                                                                             |
| extra_dhcp_opts         |                                                                                  |
| fixed_ips               | ip_address='192.168.0.227', subnet_id='782b4f8b-0f05-4725-98e6-1519d44f3458'     |
| hardware_offload_type   | None                                                                             |
| hints                   |                                                                                  |
| id                      | c240b03c-014d-4901-89a0-876f72c94aaf                                             |
| ip_allocation           | immediate                                                                        |
| mac_address             | fa:16:3e:66:b9:b2                                                                |
| name                    | direct-port                                                                      |
| network_id              | 578cb555-0972-4177-9739-85d29bd67ff1                                             |
| numa_affinity_policy    | None                                                                             |
| port_security_enabled   | True                                                                             |
| project_id              | d081abb7eebc4279a8e8ca7ddcf7ecae                                                 |
| propagate_uplink_status | True                                                                             |
| resource_request        | None                                                                             |
| revision_number         | 41                                                                               |
| qos_network_policy_id   | None                                                                             |
| qos_policy_id           | None                                                                             |
| security_group_ids      | 5362283f-56e2-443a-a952-bfbdf18cfb06                                             |
| status                  | ACTIVE                                                                           |
| tags                    |                                                                                  |
| trunk_details           | None                                                                             |
| updated_at              | 2025-07-29T10:12:33Z                                                             |
+-------------------------+----------------------------------------------------------------------------------+

Verify the Libvirt domain:

$ sudo openstack-hypervisor.virsh dumpxml instance-00000001 | grep "type='hostdev" -A 8
<interface type='hostdev' managed='yes'>
  <mac address='fa:16:3e:66:b9:b2'/>
  <driver name='vfio'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x5'/>
  </source>
  <alias name='hostdev0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
</interface>

If hardware offloading is available, the device will be added to the br-int bridge:

sudo openstack-hypervisor.ovs-vsctl show
# output #
f9b527db-207c-453d-bcda-482610541462
    Bridge br-ex
        datapath_type: system
        Port br-ex
            Interface br-ex
                type: internal
        Port patch-provnet-4cb61b1f-86a8-4c50-956a-04d8358ce055-to-br-int
            Interface patch-provnet-4cb61b1f-86a8-4c50-956a-04d8358ce055-to-br-int
                type: patch
                options: {peer=patch-br-int-to-provnet-4cb61b1f-86a8-4c50-956a-04d8358ce055}
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port enp2s0f0r3
            Interface enp2s0f0r3
        Port tap578cb555-00
            Interface tap578cb555-00
        Port br-int
            Interface br-int
                type: internal
        Port patch-br-int-to-provnet-4cb61b1f-86a8-4c50-956a-04d8358ce055
            Interface patch-br-int-to-provnet-4cb61b1f-86a8-4c50-956a-04d8358ce055
                type: patch
                options: {peer=patch-provnet-4cb61b1f-86a8-4c50-956a-04d8358ce055-to-br-int}
        Port tapdcf0ee2d-f8
            Interface tapdcf0ee2d-f8
    ovs_version: "3.5.0"