Proxmox VE 7 replace zfs boot disk

Mon, May 2, 2022 4-minute read

Introduction

Everything dies, even enterprise hardware.

This is why having a failover is a good thing.

I am running my Proxmox VE 7 servers with a mirrored ZFS root pool, so I can protect myself against a single drive dying and taking down a Proxmox server.

Today I received 8 SATADOMs that I wanted to use as boot drives instead of my tiny and slow SATA disks.

So I had to dig up the proper way to replace the drives.

The Proxmox partition scheme for a boot disk is:

Partition 1 = BIOS Boot
Partition 2 = EFI Boot
Partition 3 = ZFS
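
If you want to confirm this layout on an existing boot disk, you can print its partition table first (the device name here is just an example):

# show the GPT partition table of a current boot disk
sgdisk -p /dev/sda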

Steps

Install the new boot drive(s) into the server. If you are lucky you have hot-plug drives and don’t need to power down the server.

For simplicity's sake I will use the following example hardware and ZFS pool.

Old disks are:

/dev/sda & /dev/sdb

New disks are:

/dev/sdc & /dev/sdd

The root pool looks like this:

  pool: rpool
 state: ONLINE
  scan: resilvered 1.72G in 00:00:28 with 0 errors on Mon May  2 17:28:09 2022
config:

        NAME           STATE     READ WRITE CKSUM
        rpool          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            sda-part3  ONLINE       0     0     0
            sdb-part3  ONLINE       0     0     0

The names are simplified - in reality they would look something like: ata-TOSHIBA_THNSN8960PCSE_26MS10GLTB1V-part3
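
If you are unsure which by-id name belongs to which disk, listing the symlinks should show the mapping (grepping for sdc is just an example):

# map the stable by-id names to the kernel device names
ls -l /dev/disk/by-id/ | grep sdc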

So what needs to be done is simple - replace the disks one by one, waiting for the resilver process to complete, and then initialize each new disk so it can be booted from.

With that in mind, this is the process - we want to replace sda with sdc and sdb with sdd.

Partitions & ZFS

# copy the partition table from sda to sdc
sgdisk /dev/sda -R /dev/sdc
# randomize the GUIDs on sdc so they do not clash with sda
sgdisk -G /dev/sdc
# replace the old ZFS partition with the new one in the pool
zpool replace -f rpool sda-part3 /dev/disk/by-id/sdc-part3

The above steps copy the partition table from sda to sdc and generate new GUIDs for the partitions, then replace sda-part3 with sdc-part3 in the ZFS pool rpool.

When the last command has been entered, ZFS will start to resilver, which basically means copying the data from the old disk to the new disk.

You can check the status of the resilver process by entering

zpool status -v rpool

This command will output some stats about the resilver speed and how far along it is.
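
If you want it to update on its own, you could wrap it in watch (the 10-second interval is just an example):

# re-run zpool status every 10 seconds until the resilver completes
watch -n 10 zpool status -v rpool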

Proxmox boot refreshing

When the resilver process is done, the Proxmox boot environment needs to be installed on the new disk's EFI partition.

This is done via:

# format the EFI partition (partition 2) of the new disk
proxmox-boot-tool format /dev/sdc2
# install the Proxmox boot environment on it and register it
proxmox-boot-tool init /dev/sdc2

If you want to be 100% sure that everything is okay with the new disk, you can run:

proxmox-boot-tool refresh

This refreshes the boot environments on all EFI/BIOS boot partitions in the system. Right now it will refresh sda, sdb & sdc, since all 3 disks are still bootable at this point.
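
If you want to see which partitions proxmox-boot-tool currently knows about, you can also check its status:

# list the boot partitions registered with proxmox-boot-tool
proxmox-boot-tool status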

If you only had to replace one disk, you can stop here and congratulate yourself on having paid for the insurance of being able to replace a failed boot drive.

If you want to replace the next drive, you simply repeat the process, substituting sdb for sda and sdd for sdc in the commands above, as shown below.
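
For completeness, with the example device names used here, the second pass would look roughly like this:

sgdisk /dev/sdb -R /dev/sdd
sgdisk -G /dev/sdd
zpool replace -f rpool sdb-part3 /dev/disk/by-id/sdd-part3
# wait for the resilver to finish, then:
proxmox-boot-tool format /dev/sdd2
proxmox-boot-tool init /dev/sdd2
proxmox-boot-tool refresh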

When both drives have been replaced, a zpool status will show this:

config:

        NAME           STATE     READ WRITE CKSUM
        rpool          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            sdc-part3  ONLINE       0     0     0
            sdd-part3  ONLINE       0     0     0

Expanding the size of the root partitions

If you, like me, replaced the boot drives with higher-capacity disks, you could consider expanding the ZFS partition, so Proxmox gets a little more disk space on the root pool.

This is done in multiple steps.

First ensure that you have parted installed - if not, install it by running apt-get install parted

# resize partition 3 of sdc to use 50% of the available space  (partition 3 is the ZFS partition)
parted /dev/sdc resizepart 3 50%

# expand zfs on sdc to use the entire expanded partition
zpool online -e rpool /dev/disk/by-id/sdc-part3

# resize partition 3 of sdd to use 50% of the available space (partition 3 is the ZFS partition)
parted /dev/sdd resizepart 3 50%

# expand zfs on sdd to use the entire expanded partition
zpool online -e rpool /dev/disk/by-id/sdd-part3
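
To confirm that the pool actually picked up the extra space, you can check the pool size afterwards:

# SIZE should now reflect the expanded partitions
zpool list rpool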

In the above example I have expanded the partition to only 50% of the available size.

This is called over-provisioning - which basically means that the SSD controller has more room to reallocate failed sectors to cells that have not failed.

Which in turn means your disks will last longer.

I also did this on my own servers, since the boot drive is not used for much - certainly not for storing virtual machines, if you know what you are doing - so it does not require a lot of space. What is important is that the drives last long, and this is where over-provisioning can help.