How to change the OS on an existing Service Fabric cluster?

扶醉桌前 提交于 2020-07-22 10:14:40

问题


I'm trying to change my VMSS from:

    "imageReference": {
      "publisher": "MicrosoftWindowsServer",
      "offer": "WindowsServer",
      "sku": "2016-Datacenter-with-Containers",
      "version": "latest"
    }

To:

    "imageReference": {
      "publisher": "MicrosoftWindowsServer",
      "offer": "WindowsServerSemiAnnual",
      "sku": "Datacenter-Core-1803-with-Containers-smalldisk",
      "version": "latest"
    }

The first thing I tried was:

Update-AzureRmVmss -ResourceGroupName "DevServiceFabric" -VMScaleSetName "HTTP" -ImageReferenceSku Datacenter-Core-1803-with-Containers-smalldisk -ImageReferenceOffer WindowsServerSemiAnnual

Which gives me the error:

Update-AzureRmVmss : Changing property 'imageReference.offer' is not allowed. ErrorCode: PropertyChangeNotAllowed

This is confirmed in the docs; you can only set the offer when the scaleset is created.

Next I tried Add-AzureRmServiceFabricNodeType to add a new node type, thinking I could just delete the old one after. However, this command doesn't seem to allow you to set the OS image. You can only set the VM SKU (In other words, all VMs on your cluster have to have the same OS).

Is there a way to change this without deleting the entire cluster and starting from scratch?


回答1:


Edit If you can stay within the current publisher+offer, you can switch the OS very easily by simply changing the SKU. See the answer by Mike.


If you really need to change the offer, you can do this:

Upgrade the size and operating system of the primary node type VMs.

Be aware that you need to consider a lot of things like your availablility level. The cluster will also be unavailable from the outside for a time.

Shortened drastically:

  • add a second scale set with your desired OS to the primary node type
  • disable the old scale set, then remove it
  • switch over the load balancer
# Variables.
$groupname = "sfupgradetestgroup"
$clusterloc="southcentralus"  
$subscriptionID="<your subscription ID>"

# sign in to your Azure account and select your subscription
Login-AzAccount -SubscriptionId $subscriptionID 

# Create a new resource group for your deployment and give it a name and a location.
New-AzResourceGroup -Name $groupname -Location $clusterloc

# Deploy the two node type cluster.
New-AzResourceGroupDeployment -ResourceGroupName $groupname -TemplateParameterFile "C:\temp\cluster\Deploy-2NodeTypes-2ScaleSets.parameters.json" `
    -TemplateFile "C:\temp\cluster\Deploy-2NodeTypes-2ScaleSets.json" -Verbose

# Connect to the cluster and check the cluster health.
$ClusterName= "sfupgradetest.southcentralus.cloudapp.azure.com:19000"
$thumb="F361720F4BD5449F6F083DDE99DC51A86985B25B"

Connect-ServiceFabricCluster -ConnectionEndpoint $ClusterName -KeepAliveIntervalInSec 10 `
    -X509Credential `
    -ServerCertThumbprint $thumb  `
    -FindType FindByThumbprint `
    -FindValue $thumb `
    -StoreLocation CurrentUser `
    -StoreName My 

Get-ServiceFabricClusterHealth

# Deploy a new scale set into the primary node type.  Create a new load balancer and public IP address for the new scale set.
New-AzResourceGroupDeployment -ResourceGroupName $groupname -TemplateParameterFile "C:\temp\cluster\Deploy-2NodeTypes-3ScaleSets.parameters.json" `
    -TemplateFile "C:\temp\cluster\Deploy-2NodeTypes-3ScaleSets.json" -Verbose

# Check the cluster health again. All 15 nodes should be healthy.
Get-ServiceFabricClusterHealth

# Disable the nodes in the original scale set.
$nodeNames = @("_NTvm1_0","_NTvm1_1","_NTvm1_2","_NTvm1_3","_NTvm1_4")

Write-Host "Disabling nodes..."
foreach($name in $nodeNames){
    Disable-ServiceFabricNode -NodeName $name -Intent RemoveNode -Force
}

Write-Host "Checking node status..."
foreach($name in $nodeNames){

    $state = Get-ServiceFabricNode -NodeName $name 

    $loopTimeout = 50

    do{
        Start-Sleep 5
        $loopTimeout -= 1
        $state = Get-ServiceFabricNode -NodeName $name
        Write-Host "$name state: " $state.NodeDeactivationInfo.Status
    }

    while (($state.NodeDeactivationInfo.Status -ne "Completed") -and ($loopTimeout -ne 0))


    if ($state.NodeStatus -ne [System.Fabric.Query.NodeStatus]::Disabled)
    {
        Write-Error "$name node deactivation failed with state" $state.NodeStatus
        exit
    }
}

# Remove the scale set
$scaleSetName="NTvm1"
Remove-AzVmss -ResourceGroupName $groupname -VMScaleSetName $scaleSetName -Force
Write-Host "Removed scale set $scaleSetName"

$lbname="LB-sfupgradetest-NTvm1"
$oldPublicIpName="PublicIP-LB-FE-0"
$newPublicIpName="PublicIP-LB-FE-2"

# Store DNS settings of public IP address related to old Primary NodeType into variable 
$oldprimaryPublicIP = Get-AzPublicIpAddress -Name $oldPublicIpName  -ResourceGroupName $groupname

$primaryDNSName = $oldprimaryPublicIP.DnsSettings.DomainNameLabel

$primaryDNSFqdn = $oldprimaryPublicIP.DnsSettings.Fqdn

# Remove Load Balancer related to old Primary NodeType. This will cause a brief period of downtime for the cluster
Remove-AzLoadBalancer -Name $lbname -ResourceGroupName $groupname -Force

# Remove the old public IP
Remove-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname -Force

# Replace DNS settings of Public IP address related to new Primary Node Type with DNS settings of Public IP address related to old Primary Node Type
$PublicIP = Get-AzPublicIpAddress -Name $newPublicIpName  -ResourceGroupName $groupname
$PublicIP.DnsSettings.DomainNameLabel = $primaryDNSName
$PublicIP.DnsSettings.Fqdn = $primaryDNSFqdn
Set-AzPublicIpAddress -PublicIpAddress $PublicIP

# Check the cluster health
Get-ServiceFabricClusterHealth

# Remove node state for the deleted nodes.
foreach($name in $nodeNames){
    # Remove the node from the cluster
    Remove-ServiceFabricNodeState -NodeName $name -TimeoutSec 300 -Force
    Write-Host "Removed node state for node $name"
}




回答2:


Here's another (simpler) answer for those who want to switch to another OS, but can switch to an OS image in the same publisher/Offer. You can get the list of available OS SKUs with the following command:

Get-AzureRmVMImageSku -Location 'westus2' -PublisherName MicrosoftWindowsServer -Offer WindowsServer

Then, you can upgrade your cluster to use that image with:

Update-AzureRmVmss -ResourceGroupName "DevServiceFabric" -VMScaleSetName "HTTP" -ImageReferenceSku 2019-Datacenter-Core-with-Containers-smalldisk

The command will take an hour or more to run.

I've also run into a few SKUs that error out with an "Image Not Found" error, even though they appear on the list. Not sure what the cause of that is. However, in this case I was able to find out that works for me.



来源:https://stackoverflow.com/questions/56315903/how-to-change-the-os-on-an-existing-service-fabric-cluster

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!