2021/04/30

Azure Stack HCI Troubleshooting "UseRdmaForStorage"

Azure Stack HCI Troubleshooting the Cluster Object "UseRdmaForStorage"


*** Disclaimer ***
s2d.dk is not responsible for any errors, or for the results obtained from the use of this information on s2d.dk. All information on this site is provided as "draft notes" and "as is", with no guarantee of completeness, accuracy, timeliness or of the results obtained from the use of this information. Always test in a lab setup before using any of the information in a production environment.
For any reference links to other websites, we encourage you to read the privacy statements of the third-party websites.
The names of actual companies and products mentioned herein may be the trademarks of their respective owners.
***

Last update: 2021.04.30

Get-HealthFault lists an event with the Reason:
The cluster detected network connectivity issues that prevent Storage Spaces Direct from working properly.
To ensure consistent performance and data safety, Storage Spaces Direct has stopped using remote direct memory access (RDMA) even if RDMA-capable hardware is present and enabled.
Storage Spaces Direct will continue to flow but diminished performance using TCP/IP.

Azure Stack HCI (20H2)

Azure Stack HCI (20H2) has a new toggle switch called "UseRdmaForStorage", which is flipped to 0 (Off) when network issues are detected. The detection looks for spontaneous SMB disconnects; if they occur often without an obvious explanation (e.g., the node restarting), the cluster stops relying on RDMA/RoCE as a precaution. If you are confident that the network issue is fixed, you can flip the setting back to 1 (On).

***

When the cluster disables RDMA and changes to TCP...

Microsoft-Windows-FailoverClustering/Operational Event 5163:

  • Cluster service disabled RDMA on the SMB instance for SBL IO on this node. All IO for this instance will now go over TCP connections only.
  • Cluster service disabled RDMA on the SMB instance for CSV IO on this node. All IO for this instance will now go over TCP connections only.

When you enable RDMA again...

Microsoft-Windows-FailoverClustering/Operational Event 5164:

  • Cluster service enabled RDMA on the SMB instance for SBL IO on this node.
  • Cluster service enabled RDMA on the SMB instance for CSV IO on this node

***

Events that you will see in the minutes before the disabling of RDMA...

Microsoft-Windows-SMBClient/Connectivity Event 30804:

A network connection was disconnected.
Instance name: \Device\SmbVsa
Server name: x.x.x.x
Server address: x.x.x.x:445
Connection type: Rdma
InterfaceId: 
Guidance:
This indicates that the client's connection to the server was disconnected.
Frequent, unexpected disconnects when using an RDMA over Converged Ethernet (RoCE) adapter may indicate a network misconfiguration. RoCE requires Priority Flow Control (PFC) to be configured for every host, switch and router on the RoCE network. Failure to properly configure PFC will cause packet loss, frequent disconnects and poor performance.
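
To line up the events above on a timeline, something like the following sketch may help. The log names and event IDs are the ones listed in these notes; the function names Get-RdmaEventFilter and Get-RdmaEvent and the Hours parameter are hypothetical helpers, not part of any Microsoft module. It reads the local event logs only, so it would need to be run on each node.

```powershell
# Sketch only: Get-RdmaEventFilter and Get-RdmaEvent are hypothetical helper names.
# The log names and event IDs are the ones listed in the notes above.

function Get-RdmaEventFilter {
    param(
        # How many hours back to search (assumption: 24 is a reasonable default)
        [int]$Hours = 24
    )
    $start = (Get-Date).AddHours(-$Hours)
    @(
        # 5163 = cluster disabled RDMA, 5164 = cluster enabled RDMA again
        @{ LogName = 'Microsoft-Windows-FailoverClustering/Operational'; Id = 5163, 5164; StartTime = $start }
        # 30804 = SMB client connection was disconnected
        @{ LogName = 'Microsoft-Windows-SMBClient/Connectivity'; Id = 30804; StartTime = $start }
    )
}

function Get-RdmaEvent {
    param(
        [int]$Hours = 24
    )
    foreach ($filter in Get-RdmaEventFilter -Hours $Hours) {
        # Get-WinEvent writes an error when nothing matches; suppress it here
        Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue
    }
}
```

Example use on one node: Get-RdmaEvent -Hours 6 | Sort-Object TimeCreated | Format-Table TimeCreated, Id, LogName -AutoSize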

***

Health Service:

If Get-HealthFault reports that RDMA is off and that Azure Stack HCI (20H2) is using TCP/IP:

Review the Cluster Objects:
  • Get-Cluster | fl *
  • Cluster Object: UseRdmaForStorage (1=On) or (0=Off)
If the value is 0, review and validate the RDMA/RoCE and DCB settings with tools like:
  • "netstat -xan" shows the RDMA SMB connections
  • "Validate-DCB"
  • "Perfmon /sys" add counters for Network and RDMA
Change the Cluster Object:
  • (Get-Cluster).UseRdmaForStorage=1
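
The review and change steps above could be wrapped roughly like this. It is a minimal sketch, assuming the FailoverClusters module is loaded and the cluster exposes the UseRdmaForStorage property described in these notes; the function names ConvertTo-RdmaStateName, Get-RdmaForStorageState and Enable-RdmaForStorage are hypothetical.

```powershell
# Sketch only: all three function names are hypothetical helpers.
# UseRdmaForStorage is the cluster property described above (1 = On, 0 = Off).

function ConvertTo-RdmaStateName {
    param($Value)
    if ($null -eq $Value) { 'Unknown' }      # property missing, e.g. an older OS version
    elseif ([int]$Value -eq 1) { 'On' }
    else { 'Off' }
}

function Get-RdmaForStorageState {
    $cluster = Get-Cluster
    [pscustomobject]@{
        Cluster           = $cluster.Name
        UseRdmaForStorage = $cluster.UseRdmaForStorage
        State             = ConvertTo-RdmaStateName -Value $cluster.UseRdmaForStorage
    }
}

function Enable-RdmaForStorage {
    [CmdletBinding(SupportsShouldProcess)]
    param()
    $cluster = Get-Cluster
    if ($cluster.UseRdmaForStorage -eq 1) {
        Write-Verbose 'UseRdmaForStorage is already 1 (On).'
        return
    }
    # Only flip the switch back once the network issue is confirmed fixed
    if ($PSCmdlet.ShouldProcess($cluster.Name, 'Set UseRdmaForStorage to 1')) {
        $cluster.UseRdmaForStorage = 1
    }
}
```

Example use: run Get-RdmaForStorageState first, then Enable-RdmaForStorage -WhatIf to preview the change before running it for real.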



2021/03/23

BifrostConnect - Remote access


Demo of BifrostConnect remote console access to Cisco Switch with the use of RS232 Serial port over 4G

Remote access to Mac mini M1 with BifrostConnect over 4G
See the boot process remotely for the Mac mini M1.
Only one cable connected... for remote Keyboard, Video and Mouse (KVM access).
Remote 4G, WiFi and LAN access to your Mac mini M1...


Remote access to Mac mini M1 with BifrostConnect over 4G and Ethernet

Change your Mac startup disk or access macOS Recovery remotely:
  1. With your Mac mini M1 (Apple silicon) turned off
  2. Press and hold the power button until you see “Loading startup options”.
  3. Select from the options...



Remote Access to Intel NUC BIOS with BifrostConnect...
Change PC BIOS settings or access Windows 10 Recovery remotely...
"Only one cable is needed if USB-C is available on your PC, Keyboard, Video and Mouse (KVM access)"
Today I will demo with the HDMI port for video. Mouse and keyboard are provided over the Micro-USB, and we power the BifrostConnect with Power over Ethernet (PoE).


90 seconds review of BifrostConnect with 32 ports KVM Switch
Remote Access to Raritan 32 Port KVM Switch with BifrostConnect.
Use of VGA to HDMI adapter



BifrostConnect review with Desktop KVM from UGREEN
Change a Desktop KVM to an advanced KVM over IP



BifrostConnect Demo of the Relay and Serial Port with Hikvision NVR

Remote access to a Hikvision NVR and setup of an alarm input that can be triggered by the Relay port from BifrostConnect. Demo of the "Computer" and "Terminal" access with an HDMI pass-through monitor from Feelworld.



BifrostConnect Unattended new feature demo (Beta Software)
Demo with the new BifrostConnect Unattended version. Connect remotely to a Fujitsu TX1320 server over the serial port (RS232) to power on/off, and get remote KVM access with the BifrostConnect. Demo of BIOS and POST access without a network connection, but with the use of BifrostConnect with 4G. Demo of some of the BMC features you can access over RS232 with the Fujitsu TX1320.



For more information:
https://www.bifrostconnect.com
https://www.bifrostconnect.com/bifrostconnect-unattended
https://www.bifrostconnect.com/bifrostconnect-attended

Security by design: reach your equipment via Bifrost's secure cloud, with no file transfer and no software installation needed.
Connect to all types of devices: Bifrost enables remote support to PCs, servers, mobile devices, network equipment, you name it.
Built-in 4G, WiFi and LAN with PoE.
Portable and battery powered: work with Bifrost for hours on the go without needing an external power supply. Charging with Micro-USB, USB-C or PoE.
Supports Micro-USB, USB-C, HDMI and RS232 serial port. It even has a Relay port...

*** Note: This video contains hardware and software provided by the manufacturers for the recording. The recording doesn't include paid product placements ***

2021/03/11

Windows Server 2019 and SR-IOV

Windows Server 2019 and Single Root Input/Output Virtualization (SR-IOV)

Last update: 2021.03.11

How to use single root input/output virtualization (SR-IOV) in Windows Server 2019 with Guest VMs running Windows Server 2012R2, Windows Server 2016 and Windows Server 2019


SR-IOV #3 - Upgrade Drivers for SR-IOV in Windows Server 2012R2 Guest VMs


SR-IOV #4 - Windows Server 2016 Guest VMs with SR-IOV (How fast is a VM using SR-IOV)


Links:
SR-IOV #5 - SR-IOV vs VMQ Performance

Demo with Mellanox ConnectX-5 (100 Gbps Network Adapter)


Demo with Chelsio (40Gbps Network Adapter)





The Windows Server 2019 Host Network configuration
The demo/lab Host configuration uses three physical network interface controllers (pNICs).
The three pNICs are used only to show the different numbers of Virtual Functions (VF).
  1. Get-VMSwitch
  2. Get-VMSwitch | ft Name,NetAdapterInterfaceDescriptions
  3. Get-NetAdapter -Name *NIC* | sort name | ft name, InterfaceDescription, LinkSpeed -AutoSize
  4. Get-NetAdapterSriov | sort name | ft Name, InterfaceDescription, SriovSupport, NumVFs -AutoSize
  5. Note
  6. Get-NetAdapterSriovVf | sort name | ft -AutoSize
  7. Note





2020/12/10

Network Offload


Azure Stack HCI


Windows Server 2019

Windows Server 2019 RSS Performance Demo with Default pNIC settings
Demo with Mellanox ConnectX-5 100Gbps and the use of Microsoft tool NTttcp.exe 


Windows Server 2019 RSS/MTU Performance Demo
MTU size changed from 1514 to 9014
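
One way to make such an MTU change from PowerShell is through the standardized "*JumboPacket" advanced property. This is only a sketch, not necessarily how it was done in the video: the function name Set-JumboPacket and its parameters are hypothetical, and the values a driver accepts for "*JumboPacket" vary by vendor, so check what the adapter reports before changing it.

```powershell
# Sketch only: Set-JumboPacket is a hypothetical helper name.
# Assumes the NIC driver exposes the standardized '*JumboPacket' keyword.

function Set-JumboPacket {
    [CmdletBinding(SupportsShouldProcess)]
    param(
        # Name of the physical adapter as shown by Get-NetAdapter
        [Parameter(Mandatory)]
        [string]$AdapterName,

        # 1514 = default frame size, 9014 = jumbo frames (values from the notes above)
        [int]$Size = 9014
    )
    if ($PSCmdlet.ShouldProcess($AdapterName, "Set *JumboPacket to $Size")) {
        Set-NetAdapterAdvancedProperty -Name $AdapterName -RegistryKeyword '*JumboPacket' -RegistryValue $Size
    }
    # Show the resulting value so the change can be verified
    Get-NetAdapterAdvancedProperty -Name $AdapterName -RegistryKeyword '*JumboPacket'
}
```

Example use (the adapter name "NIC1" is a placeholder): Set-JumboPacket -AdapterName "NIC1" -Size 9014 -WhatIf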



Windows Server 2016

Windows Server 2016 RSS Performance Demo with Default pNIC settings
Demo with Mellanox ConnectX-5 100Gbps and the use of Microsoft tool NTttcp.exe 




Windows Server 2012R2

Windows Server 2012R2 Network Offload History lesson
Demo from year 2014 with Windows Server 2012R2 and the use of Network Offload features (Synthetic Accelerations).



2020/11/01

Fault domain awareness

 Fault domain awareness in Microsoft Failover Cluster




PowerShell used in the Video

# Cluster and Node information
Get-ClusterNode | sort name | ft Name, State, StatusInformation, FaultDomain
Get-ClusterNode | sort name | ft Name, State, StatusInformation, FaultDomain | Out-File -FilePath C:\Temp\Review\Get-ClusterNode-ft.txt
Get-ClusterNode | sort name | fl *
Get-ClusterNode | sort name | fl * | Out-File -FilePath C:\Temp\Review\Get-ClusterNode-fl.txt

Start-Cluster
Stop-Cluster
Start-ClusterNode -ClearQuarantine
cls

# *************************************************************************************************************
# Fault domain awareness
# https://docs.microsoft.com/en-us/windows-server/failover-clustering/fault-domains
#
# Defining fault domains with PowerShell
# *************************************************************************************************************

Get-cluster | fl AutoAssignNodeSite

(Get-Cluster).AutoAssignNodeSite=1

Get-ClusterFaultDomain | sort name | ft -AutoSize
Get-ClusterFaultDomain | sort name | ft -AutoSize | Out-File -FilePath C:\Temp\Review\Get-ClusterFaultDomain-ft.txt
Get-ClusterFaultDomain | sort name | fl * 
Get-ClusterFaultDomain | sort name | fl * | Out-File -FilePath C:\Temp\Review\Get-ClusterFaultDomain-fl.txt

New-ClusterFaultDomain -Type Site -Name "Copenhagen" -Description "Microsoft Denmark"
New-ClusterFaultDomain -Type Rack -Name "RACK45"
New-ClusterFaultDomain -Type Rack -Name "RACK46"
New-ClusterFaultDomain -Type Rack -Name "RACK47"
New-ClusterFaultDomain -Type Rack -Name "RACK48"

Set-ClusterFaultDomain -Name "Copenhagen","RACK45", "RACK46", "RACK47", "RACK48" -Location "Building 92 Room 1"

Set-ClusterFaultDomain -Name "RACK45", "RACK46", "RACK47", "RACK48" -Parent "Copenhagen"

Set-ClusterFaultDomain -Name "S047011","S047012","S047013" -Parent "RACK47"
Set-ClusterFaultDomain -Name "S047014","S047015","S047016" -Parent "RACK48"

# *************************************************************************************************************
# Remove fault domains
# *************************************************************************************************************

Set-ClusterFaultDomain -Name "S047011","S047012","S047013","S047014","S047015","S047016" -Parent ""
Set-ClusterFaultDomain -Name "RACK45", "RACK46", "RACK47", "RACK48" -Parent ""

Remove-ClusterFaultDomain -Name "Copenhagen","RACK45", "RACK46", "RACK47", "RACK48"

(Get-Cluster).AutoAssignNodeSite=0

# *************************************************************************************************************

2020/08/25

DataON Azure Stack HCI - Public Preview

 Azure Stack HCI - Public Preview - Installation and Troubleshooting series with DataON


DataON Azure Stack HCI
  • DataON Hosts
  • Intel NVMe
  • Mellanox Network

Part 1

Setup test Domain and Windows Admin Center
Installation of the physical DataON with the Azure Stack HCI - Public Preview
Configuration of network with one dual port Mellanox ConnectX-4 adapter for both Storage and Guest Traffic

Setup of VMs for Performance test with DiskSpd.exe
See the impact on the Host CPUs for Storage, Network and Guest workload

(Video will come one day... when I have time to finish the editing)

Part 2

Configuration of network with two dual port Mellanox ConnectX-4 adapters. The Storage adapters are directly connected, and the Guest Traffic uses a SET switch connected to a Mellanox SN2100 physical switch.


Notes and time agenda for the Video:
00:20 Agenda, Create one virtual switch for compute only and use direct connection for storage
00:45 Start the cleaning of the previous setup
00:50 Delete all the VMs (They were exported before the recording was started)
01:20 Delete all the vDisks
01:50 Disable-ClusterS2D, the "destroy cluster" part is NOT included in this video...
02:10 Clear-ClusterNode was performed on all Nodes
02:44 Start the "Create new" "Server cluster" from WAC
02:56 WAC step 1.1 Check the prerequisites
03:05 WAC step 1.2 Add servers
04:35 WAC step 2.1 Verify network adapters (Remove Existing Switches, from last demo)
04:56 WAC step 2.2 Select management adapter (Use a 10Gbps pNIC, only 1Gbps connection in the Demo)
05:24 WAC step 2.3 Define networks (Add the Name, IP, Subnet and vlan for the Storage pNICs)
06:20 WAC step 2.4 Virtual Switch
07:00 WAC step 3.1 Validate cluster
07:30 WAC step 3.2 Create cluster
08:30 WAC step 4.1 Clean drives
09:18 WAC step 4.4 Enable Storage Spaces Direct
More to come in the next videos about setup of vDisk and performance

Part 3

More to come

2020/08/02

Rebuild Storage Spaces Direct (S2D) SOFS Cluster

Rebuild Storage Spaces Direct S2D and Scale-Out File Server (SOFS) - Troubleshooting series

Rebuild a 4 Node Storage Spaces Direct (S2D) SOFS Cluster:
  • Reconfigure the network and use Validate-DCB to ensure the RoCE setup of DCB, PFC and ETS
  • Enable S2D again on the "new" cluster
  • Reuse/add the "old" Storage Pool back to the "new" cluster
  • Add the "old" vDisks and share them again with the SOFS Role
If it works, I get my VMs back... no need to restore from Backup...
The "old" Cluster was deleted and one Host was completely reinstalled. "Clear-ClusterNode" was used to clear the cluster configuration from a node. This ensures that the failover cluster configuration has been completely removed from the node.
Sit back and enjoy 1 hour and 12 minutes of planned and unplanned challenges: troubleshooting with Validate-DCB and the use of SCVMM/PowerShell to configure the Cluster Network.



If you don't have time to see it all (recommended to see it all, it is fun to see me get into trouble), I have created a time agenda that can help you jump around in the Video:
0:03:08 See the old disk Pool info. on one Host from "Disk Management"
0:04:05 SCVMM Add the Host to SCVMM
0:05:05 SCVMM Add the Logical Switch to the Host
0:06:43 See the vNIC be created on the Host from SCVMM
0:08:14 See the vNIC for Storage
0:09:02 See the configuration of DCB, PFC and ETS (PowerShell)
0:13:56 The new cluster doesn't use the same vlan as the old one, which will give problems later
0:14:10 Use the wrong vNIC name when configuring the MTU (See if Validate-DCB will detect the mistake 0:24:54)
0:16:44 See how I make a mistake. Add the wrong Subnet Mask for my Management vNIC...
0:21:59 Validate-DCB Installation of the Module
0:22:55 Validate-DCB Run the first time...
0:24:54 Validate-DCB shows the MTU Error
0:26:54 Validate-DCB on all Hosts (Changed the script in Notepad)
0:28:18 Validate-DCB Use the wrong Policy Name for the SMB
0:29:38 Validate-DCB NetQosPolicy Name changed
0:31:54 Validate-DCB ETS Traffic Class missing
0:34:00 Validate-DCB Okay
0:37:05 Found the Subnet error on the Management vNIC
0:40:01 Cluster wizard failed first time
0:41:32 Cluster wizard failed second time
0:42:02 Cluster Wizard failed again
0:42:56 Cluster Wizard started again, this time it works...
0:45:07 Cluster created (Back in the game)
0:46:10 Enable Storage Spaces Direct S2D again (fingers crossed that it accepts the old pool)
0:48:08 Add the "old" Storage Pool back to the "new" Cluster
0:48:26 Add the "old" Disk back to the "new" Cluster (vDisk 1 to 3)
0:49:21 Rename vDisks after adding them to the cluster
0:51:14 Add SOFS03 Role
0:55:42 SOFS Role created for "SOFS03"
0:56:09 Delegation of Control to the CNO
0:56:53 Add File Share for vDisk2
0:58:00 Add File Share for vDisk2
0:59:10 Add File Share for vDisk3
1:01:35 SCVMM Remove the old SOFS Provider from SCVMM
1:01:43 SCVMM Add the rebuilt SOFS
1:05:04 SCVMM Add the vDisk Access Control
1:06:31 SCVMM Missing the "File Share managed by Virtual Machine Manager"
1:07:02 SCVMM Add the File Share Storage vDisk1
1:08:58 SCVMM Repair the vDisk2 File Share Storage connection
1:09:06 SCVMM Add the File Share Storage vDisk3
1:09:28 SCVMM File Share Storage all vDisks are now green
1:09:50 Explain the last problems (vlan and SOFS Client access)
1:11:00 Start the first VM

***

Links:
DCB, PFC and ETS Configuration for RDMA/RoCE: https://www.s2d.dk/2019/12/dcb-configuration.html
Validator for RDMA Configuration and Best Practices
Microsoft Docs version of the Blog is now released: Validate an Azure Stack HCI cluster https://docs.microsoft.com/en-us/azure-stack/hci/deploy/validate
Validate-DCB Disconnected installation