(DRAFT, work in progress last update 2019.08.05)
Monitor Priority and Pause Frames for RDMA (RoCE) with Mellanox QoS
Examples of Physical Switch configuration for Mellanox, Cisco, HPE, Fujitsu and Dell
For all the physical switch examples, always get your hardware vendor to verify your settings before use.
Storage Spaces Direct (S2D) clusters with 2, 4 or 6 nodes, Windows Server 2016 and Windows Server 2019
The Host and Switch need to be configured for DCB/PFC/ETS
Note: Microsoft does not currently support DCBx/LLDP. Make sure the Switch is not set to "PFC Auto" on the ports in use; set them to "PFC Enable" or "PFC On" instead.
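Because the host should not negotiate DCB settings with the switch, the host side can be set to ignore DCBX offers. This is a minimal sketch, assuming the DCB feature and the NetQos cmdlets are already installed on the node; verify it against the Microsoft Deployment Guide before use.

```powershell
# Tell the host not to accept DCB settings advertised by the switch via DCBX.
Set-NetQosDcbxSetting -Willing $false -Confirm:$false

# Read the setting back; Willing should now report False.
Get-NetQosDcbxSetting
```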
Use the Microsoft Deployment Guide and Validate DCB Scripts for Host validation
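Before running the full validation scripts, a quick manual look at the host DCB state can catch obvious mistakes. The read-only commands below are a sketch of such a check; they do not replace the Validate DCB scripts.

```powershell
# Which 802.1p priorities have PFC enabled on the host (expect only the SMB priority).
Get-NetQosFlowControl

# ETS traffic classes and their bandwidth reservations.
Get-NetQosTrafficClass

# Per-adapter view of the operational DCB settings.
Get-NetAdapterQos

# Confirm that RDMA is enabled on the adapters.
Get-NetAdapterRdma
```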
In the Examples we use Priority 3 for Storage and Priority 5 or 7 for Cluster Traffic.
- Storage Traffic = SMB 445 Traffic (Storage, CSV and Live Migration)
- Cluster Traffic = UDP 3343 Traffic
(RDMA/RoCE Max Frame Size is in Auto Mode).
The pictures below show the Priority 3 (SMB) and Pause Frame (Rcv/Sent) counters on the pNIC, captured 5-10 minutes apart. The frame counts increased, so the system sent and received them correctly. (The Physical Switch Port and/or Network Adapter must be under high load before you will see the values change.)
Sent Pause Frames
The total number of pause frames sent from this priority to the far-end port. The untagged instance indicates the number of global pause frames that were sent.
Sent Pause Duration
The total duration, in microseconds, for which packet transmission was paused on this priority.
Received Pause Frames
The number of pause frames that were received on this priority from the far-end port. The untagged instance indicates the number of global pause frames that were received.
Received Pause Duration
The total duration, in microseconds, for which the far-end port was requested to pause the transmission of packets.
Sent Discard Frames
The number of packets discarded by the transmitter. Note: this counter is per Traffic Class (TC) and not per priority.
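The counters described above can also be sampled from PowerShell instead of Performance Monitor. The following is a rough sketch only: the counter set name differs between driver versions, so it is discovered with a wildcard rather than hard-coded, and the `*Mellanox*` and `*Pause*` filters as well as the 5-minute wait are assumptions. It also assumes the counters report cumulative values.

```powershell
# Discover QoS-related counter sets; the exact name depends on the installed driver.
$set = Get-Counter -ListSet '*QoS*' |
    Where-Object CounterSetName -like '*Mellanox*' |
    Select-Object -First 1

if (-not $set) { throw 'No Mellanox QoS counter set found on this host.' }

# Keep only the pause-related counter paths (all priorities/instances).
$paths = $set.PathsWithInstances | Where-Object { $_ -like '*Pause*' }

# Take two samples a few minutes apart, under load.
$first = Get-Counter -Counter $paths
Start-Sleep -Seconds 300
$second = Get-Counter -Counter $paths

# Index the first sample by path, then report how much each counter changed.
$before = @{}
foreach ($s in $first.CounterSamples) { $before[$s.Path] = $s.CookedValue }
foreach ($s in $second.CounterSamples) {
    [pscustomobject]@{
        Path  = $s.Path
        Delta = $s.CookedValue - $before[$s.Path]
    }
}
```

A non-zero delta on the Priority 3 instance under load suggests that pause frames are being exchanged on that priority.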
The pictures below show Priority 3 (SMB) and also Priority 5 or 7 (Cluster).
(The Physical Switch Port and/or Network Adapter must be under high load before you will see the Pause Frame values change.)
The Priority 5 counters are all 0 (Before configuration of the Cluster QoS Settings).
Now we add the QoS for Cluster traffic
# Create QoS policies and tag each type of traffic with the relevant priority
New-NetQosPolicy "Cluster" -Cluster -PriorityValue8021Action 5
New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3
New-NetQosPolicy "DEFAULT" -Default -PriorityValue8021Action 0
The Microsoft SET Switch will only tag SMB Traffic. To get the Cluster traffic tagged as well, you need to enable -IeeePriorityTag on the host vNICs:
Set-VMNetworkAdapter -ManagementOS -Name "MGMT" -IeeePriorityTag on
Set-VMNetworkAdapter -ManagementOS -Name "SMB1" -IeeePriorityTag on
Set-VMNetworkAdapter -ManagementOS -Name "SMB2" -IeeePriorityTag on
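To confirm that the tagging settings took effect, the host vNICs and QoS policies can be read back. This is a small sketch using read-only cmdlets; the vNIC names it lists will be whatever exists on your host.

```powershell
# Show whether 802.1p tagging is allowed on each host vNIC (expect On).
Get-VMNetworkAdapter -ManagementOS | Select-Object Name, IeeePriorityTag

# List the QoS policies and the priority each one assigns.
Get-NetQosPolicy
```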
For the complete QoS configuration, please review the Microsoft Network Guide or my Blog, which shows the lab configuration.
- Windows Server 2016 and 2019 RDMA Deployment Guide
- SMB Traffic uses PFC/ETS
- Cluster Traffic uses ETS
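The PFC/ETS split above could look roughly like this on the host. It is a sketch, not the complete configuration: the bandwidth percentages and the adapter names "pNIC1" and "pNIC2" are illustrative assumptions, and Priority 5 matches the Cluster policy used in this example.

```powershell
# PFC (lossless) only for the SMB priority; all other priorities stay lossy.
Enable-NetQosFlowControl -Priority 3
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7

# ETS bandwidth reservations (percentages are assumptions, adjust to your design).
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
New-NetQosTrafficClass "Cluster" -Priority 5 -BandwidthPercentage 1 -Algorithm ETS

# Apply the DCB settings on the physical adapters (hypothetical adapter names).
Enable-NetAdapterQos -Name "pNIC1","pNIC2"
```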
All Hosts have now been changed to tag Cluster Traffic. We now see both Sent and Received Traffic with Priority 5.