Using cloud-init on an Azure VM to mount a data disk fails

随声附和 submitted on 2020-07-22 05:27:05

Question


This is similar to a previous SO question, from which I adapted my code: "How can i use cloud-init to load a datadisk on an ubuntu VM in azure".

Using a cloud-config file passed through Terraform:

#cloud-config
disk_setup:
  /dev/disk/azure/scsi1/lun0:
    table_type: gpt
    layout: true
    overwrite: false

fs_setup:
  - device: /dev/disk/azure/scsi1/lun0
    partition: 1
    filesystem: ext4

mounts:
  - [
      "/dev/disk/azure/scsi1/lun0-part1",
      "/opt/data",
      auto,
      "defaults,noexec,nofail",
    ]

data "template_file" "cloudconfig" {
  template = file("${path.module}/cloud-init.tpl")
}

data "template_cloudinit_config" "config" {
  gzip          = true
  base64_encode = true

  part {
    content_type = "text/cloud-config"
    content      = data.template_file.cloudconfig.rendered
  }
}

module "nexus_test_vm" {
  #unnecessary details omitted - 1 VM with 1 external disk, fixed lun of 0, ubuntu 18.04
  vm_size            = "Standard_B2S"

  cloud_init_template = data.template_cloudinit_config.config.rendered
}

Relevant bit of the module (VM creation)

resource "azurerm_virtual_machine" "generic-vm" {
  count               = var.number
  name                = "${local.my_name}-${count.index}-vm"
  location            = var.location
  resource_group_name = var.resource_group_name

  network_interface_ids         = [azurerm_network_interface.generic-nic[count.index].id]
  vm_size                       = var.vm_size
  delete_os_disk_on_termination = true

  storage_image_reference {
    id = var.image_id
  }

  storage_os_disk {
    name              = "${local.my_name}-${count.index}-os"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
    disk_size_gb      = var.os_disk_size
  }

  os_profile {
    computer_name  = "${local.my_name}-${count.index}"
    admin_username = local.my_admin_user_name
    custom_data    = var.cloud_init_template
  }

  os_profile_linux_config {
    disable_password_authentication = true

    ssh_keys {
      path = "/home/${local.my_admin_user_name}/.ssh/authorized_keys"
      //key_data = tls_private_key.vm_ssh_key.public_key_openssh
      key_data = var.public_key_openssh
    }
  }

  tags = {
    Name        = "${local.my_name}-${count.index}"
    Deployment  = local.my_deployment
    Prefix      = var.prefix
    Environment = var.env
    Location    = var.location
    Volatile    = var.volatile
    Terraform   = "true"
  }
}

resource "azurerm_managed_disk" "generic-disk" {
  name                 = "${azurerm_virtual_machine.generic-vm.*.name[0]}-1-generic-disk"
  location             = var.rg_location
  resource_group_name  = var.rg_name
  storage_account_type = "Standard_LRS"
  create_option        = "Empty"
  disk_size_gb         = var.external_disk_size
}

resource "azurerm_virtual_machine_data_disk_attachment" "generic-disk" {
  managed_disk_id    = azurerm_managed_disk.generic-disk.id
  virtual_machine_id = azurerm_virtual_machine.generic-vm.*.id[0]
  lun                = 0
  caching            = "ReadWrite"
}

I am getting a lot of errors indicating that the disk does not exist while cloud-init is running. However, when I SSH into the VM, the disk is right there! Is this a race condition? Is there a wait I can configure in cloud-init, or something else that would give me a better picture of what might be happening?

Relevant logs from the VM:

head -n 5000 /var/log/cloud-init.log | grep lun

2020-04-07 16:30:51,296 - cc_disk_setup.py[DEBUG]: Partitioning disks: {'/dev/disk/azure/scsi1/lun0': {'layout': True, 'overwrite': False, 'table_type': 'gpt'}, '/dev/disk/cloud/azure_resource': {'table_type': 'gpt', 'layout': [100], 'overwrite': True, '_origname': 'ephemeral0'}}
2020-04-07 16:30:51,318 - util.py[DEBUG]: Creating partition on /dev/disk/azure/scsi1/lun0 took 0.021 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,601 - cc_disk_setup.py[DEBUG]: setting up filesystems: [{'device': '/dev/disk/azure/scsi1/lun0', 'filesystem': 'ext4', 'partition': 1}]
2020-04-07 16:30:51,725 - util.py[DEBUG]: Creating fs for /dev/disk/azure/scsi1/lun0 took 0.124 seconds
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
RuntimeError: Device /dev/disk/azure/scsi1/lun0 did not exist and was not created with a udevadm settle.
2020-04-07 16:30:51,733 - cc_mounts.py[DEBUG]: mounts configuration is [['/dev/disk/azure/scsi1/lun0-part1', '/opt/data', 'auto', 'defaults,noexec,nofail']]
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Attempting to determine the real name of /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: changed /dev/disk/azure/scsi1/lun0-part1 => None
2020-04-07 16:30:51,734 - cc_mounts.py[DEBUG]: Ignoring nonexistent named mount /dev/disk/azure/scsi1/lun0-part1
2020-04-07 16:30:51,736 - cc_mounts.py[DEBUG]: Changes to fstab: ['+ /dev/disk/azure/scsi1/lun0-part1 /opt/data auto defaults,noexec,nofail,comment=cloudconfig 0 2']

ls -l /dev/disk/azure/scsi1/lun0

lrwxrwxrwx 1 root root 12 Apr  7 16:32 /dev/disk/azure/scsi1/lun0 -> ../../../sdc

Answer 1:


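I think the problem is the order in which the VM, the data disk, and cloud-init come up. As far as I know, cloud-init runs when the VM first boots. In your Terraform configuration the data disk is attached through a separate azurerm_virtual_machine_data_disk_attachment resource that depends on the VM, so the disk is most likely attached after the VM has been created, and therefore after cloud-init has already run. That would explain the error. Your own output is consistent with this: cloud-init ran the disk setup at 16:30:51, while the /dev/disk/azure/scsi1/lun0 symlink is dated 16:32.

So the solution is to define the data disk inside the VM resource with a storage_data_disk block, so that the VM is created with the data disk already attached before cloud-init executes.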
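As a rough illustration of the storage_data_disk approach, the VM resource from the question could carry the disk inline, along the lines of the sketch below. This is only a sketch under assumptions: the disk name is hypothetical, the size and LUN are taken from the question's separate disk resources, and every other argument of the VM resource is assumed to stay exactly as posted.

```hcl
resource "azurerm_virtual_machine" "generic-vm" {
  # All other arguments and blocks (count, name, location, storage_os_disk,
  # os_profile, ...) are assumed to remain as shown in the question.

  # Inline data disk: created and attached as part of VM creation,
  # so it should already be present when cloud-init runs on first boot.
  storage_data_disk {
    name              = "${local.my_name}-${count.index}-data" # hypothetical name
    create_option     = "Empty"
    managed_disk_type = "Standard_LRS"
    disk_size_gb      = var.external_disk_size # assumes this variable is available in the module
    lun               = 0                      # matches /dev/disk/azure/scsi1/lun0 in the cloud-config
    caching           = "ReadWrite"
  }
}
```

With this in place, the separate azurerm_managed_disk and azurerm_virtual_machine_data_disk_attachment resources would be removed, since the provider documentation says inline data disks and the attachment resource cannot be used together on the same VM.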



Source: https://stackoverflow.com/questions/61085490/using-cloud-init-on-an-azure-vm-to-mount-a-data-disk-fails
