*** Disclaimer ***
s2d.dk is not responsible for any errors, or for the results obtained from the use of this information on s2d.dk. All information on this site is provided as "draft notes" and "as is", with no guarantee of completeness, accuracy, timeliness or of the results obtained from the use of this information. Always test in a lab setup before using any of this information in a production environment.
For any reference links to other websites, we encourage you to read the privacy statements of the third-party websites.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
***
Performance Monitor and configuration examples for Mellanox Switch SX1012
Monitor Priority and Pause Frames for RDMA (RoCE) on Mellanox Switch SX1012
For the screen captures in the blog I used the Mellanox Switch SX1012
- Mellanox SX1012 (SwitchX - Onyx 3.6.8010)
******************************************************************************
Mellanox Show Commands for DCB, PFC and ETS
******************************************************************************
show dcb priority-flow-control
show dcb priority-flow-control detail
show dcb ets
Review Interface
show dcb priority-flow-control interface ethernet 1/3
show dcb ets interface ethernet 1/3
show interfaces ethernet 1/3 counters priority 3
References
Microsoft:
Mellanox L2 PCP TC:
- How to Install Windows Server 2016 with RoCEv2 and Switch Embedded Teaming over HA Mellanox Network Solution
- Understanding Traffic Class (TC) Scheduling on Mellanox Spectrum Switches (WRR,SP)
- Understanding RoCEv2 Congestion Management
- HowTo Configure PFC on ConnectX-4
pNIC Physical NIC, the physical hardware that exchanges packets with the TOR
vNIC Host vNIC – Virtual NIC from vSwitch exposed in the host partition
tNIC Host tNIC - Team Interface NIC from LBFO Team
vmNIC Virtual Machine NIC – Virtual NIC from vSwitch exposed in a guest partition
vSwitch Hyper-V virtual switch
SET Switch Embedded Teaming, Hyper-V virtual switch supported in Windows Server 2016 and 2019
ToR Top of Rack switch
RDMA Remote Direct Memory Access
RoCE RDMA over Converged Ethernet
RoCEv2 2nd generation RoCE using UDP/IP for routability (a.k.a. Routable RoCE)
DCB Data Center Bridging
LLDP Link Layer Discovery Protocol
DCBx Data Center Bridging Capabilities Exchange protocol (DCBX), an extension of LLDP
PFC Priority Flow Control
ETS Enhanced Transmission Selection
TC Traffic Class
ECN Explicit Congestion Notification
RED Random Early Detection
CNP Congestion Notification Packet. CNP control frames (congestion ACK)
SP Strict Priority
WRR Weighted Round Robin
******************************************************************************
Mellanox Switch configuration example (L2):
******************************************************************************
Mellanox SX1012
enable
configure terminal
interface ethernet 1/1-1/12 flowcontrol send off force
interface ethernet 1/1-1/12 flowcontrol receive off force
dcb priority-flow-control enable force
dcb priority-flow-control priority 3 enable
(Priority 3 is used for Storage (SMB) traffic)
interface ethernet 1/1-1/12 dcb priority-flow-control mode on force
interface ethernet 1/1 switchport mode hybrid
(Needs to be repeated for each port)
interface ethernet 1/x switchport hybrid allowed-vlan all
(Needs to be repeated for each port. Only allow the VLANs that are needed; "allowed-vlan all" is not recommended and is used here only because this is a lab/test system.)
dcb ets tc bandwidth 49 50 0 1
(More information below)
exit
write memory
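The two switchport commands in this example have to be repeated for every port. A minimal Python sketch that prints them for ports 1/1 to 1/12 could look like this. The function name hybrid_port_commands and its parameters are hypothetical; the command strings are only the ones from this example with the port number substituted, and "allowed-vlan all" is kept because it matches the lab setup.

```python
def hybrid_port_commands(first_port=1, last_port=12, module=1):
    """Return the per-port hybrid-mode commands from the example config."""
    commands = []
    for port in range(first_port, last_port + 1):
        name = f"{module}/{port}"
        commands.append(f"interface ethernet {name} switchport mode hybrid")
        # "allowed-vlan all" mirrors the lab setup; restrict VLANs in production.
        commands.append(
            f"interface ethernet {name} switchport hybrid allowed-vlan all"
        )
    return commands


if __name__ == "__main__":
    # Print the commands so they can be reviewed before pasting into the CLI.
    print("\n".join(hybrid_port_commands()))
```

Review the generated lines before pasting them into a configure terminal session.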
******************************************************************************
Mellanox Switch Disable PFC
******************************************************************************
no dcb priority-flow-control priority 3 enable
******************************************************************************
Mellanox SX1012 (SwitchX)
******************************************************************************
show dcb priority-flow-control detail
******************************************************************************
show dcb priority-flow-control
show dcb priority-flow-control interface ethernet 1/1
******************************************************************************
show dcb ets
******************************************************************************
For the PFC-enabled priorities we need to use the lossless TCs.
So for the Microsoft use cases we change the default bandwidth allocation to:
Priority 0 Default traffic 49%
Priority 3 SMB traffic 50%
Priority 7 Cluster traffic 1%
dcb ets tc bandwidth 49 50 0 1
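As a sanity check before typing the command, the percentages can be validated with a small Python sketch. It assumes the values are expected to total 100, as they do in this example. The function name ets_bandwidth_command is hypothetical and is not part of any Mellanox tooling.

```python
def ets_bandwidth_command(weights):
    """Build the 'dcb ets tc bandwidth' line after checking the percentages."""
    if any(w < 0 or w > 100 for w in weights):
        raise ValueError("each weight must be between 0 and 100")
    if sum(weights) != 100:
        # Assumption: the shares should add up to 100, as in the example.
        raise ValueError(f"weights sum to {sum(weights)}, expected 100")
    return "dcb ets tc bandwidth " + " ".join(str(w) for w in weights)


if __name__ == "__main__":
    # Values from the example: 49 default, 50 SMB, 0 for the remaining class, 1 cluster.
    print(ets_bandwidth_command([49, 50, 0, 1]))
```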
******************************************************************************
show dcb ets interface ethernet 1/1
******************************************************************************
show interfaces ethernet 1/3 counters priority 3
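To see whether pause frames are still increasing, it helps to run the counters command twice and compare the two readings. The Python sketch below shows only the comparison step. The counter names rx_pause and tx_pause and the numbers are hypothetical placeholders, not real switch output, and the sketch assumes the counters were not cleared between the two readings.

```python
def counter_deltas(before, after):
    """Return after - before for every counter present in both samples."""
    return {name: after[name] - before[name] for name in before if name in after}


if __name__ == "__main__":
    # Hypothetical values copied by hand from two runs of the show command.
    first = {"rx_pause": 100, "tx_pause": 40}
    second = {"rx_pause": 160, "tx_pause": 40}
    print(counter_deltas(first, second))
```

A delta that keeps growing on priority 3 suggests the port is still sending or receiving pause frames for the SMB traffic class.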
******************************************************************************