A project I work on, OpenShift Partner Labs, offers OpenShift-as-a-Service and has been getting a lot of requests for clusters with OpenShift Virtualization. Most of these clusters are single-node bare-metal instances running in AWS, and until recently they worked pretty much without any challenges; lately, not so much.
One issue we keep running into is that EBS storage in AWS is reaaaaaalllyyyy slow. I've seen this issue before, just not with partners who wanted OpenShift Virtualization. To get around the EBS speed issues, for a few more coins we use the c5d.metal instance type, which comes with physical NVMe storage attached directly to the instance.
I'm going to document how we set up the c5d.metal instance with NFS and the LVM Operator so that OpenShift Virtualization brings up VMs much more quickly and can provide ReadWriteMany volumes via NFS.
So, when you deploy the cluster the NFS server is already installed; yay :)
But it’s not active; boo :(
[root@ip-10-0-23-93 ~]# systemctl status nfs-server
○ nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; disabled; preset: disabled)
Active: inactive (dead)
Docs: man:rpc.nfsd(8)
man:exportfs(8)
[root@ip-10-0-23-93 ~]#
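All of the node-level commands that follow are run from a root shell on the node itself. Since this is a single-node cluster there is only one node to pick; one way to get that shell (the node name below is a placeholder, grab the real one from `oc get nodes`) is a debug pod:
# Open a debug pod on the node and switch into the host filesystem
❯ oc debug node/<node-name>
sh-5.1# chroot /host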
We need to get NFS up and running and here are the commands to do that:
[root@ip-10-0-23-93 ~]# lsblk # check for the available disks
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme4n1 259:0 0 300G 0 disk
├─nvme4n1p1 259:2 0 1M 0 part
├─nvme4n1p2 259:3 0 127M 0 part
├─nvme4n1p3 259:4 0 384M 0 part /boot
└─nvme4n1p4 259:5 0 299.5G 0 part /var/lib/kubelet/pods/29f507c2-4b91-4db2-a68b-0a2f4694c6ed/volume-subpaths/nginx-conf/networking-console-plugin/1
/var
/sysroot/ostree/deploy/rhcos/var
/sysroot
/usr
/etc
/
nvme0n1 259:1 0 838.2G 0 disk
nvme1n1 259:6 0 838.2G 0 disk
nvme2n1 259:7 0 838.2G 0 disk
nvme3n1 259:8 0 838.2G 0 disk
[root@ip-10-0-23-93 ~]# fdisk /dev/nvme0n1 # use first available disk and prep for NFS
Welcome to fdisk (util-linux 2.37.4).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Device does not contain a recognized partition table.
Created a new DOS disklabel with disk identifier 0x8b9a535c.
Command (m for help): n
Partition type
p primary (0 primary, 0 extended, 4 free)
e extended (container for logical partitions)
Select (default p):
Using default response p.
Partition number (1-4, default 1):
First sector (2048-1757812499, default 2048):
Last sector, +/-sectors or +/-size{K,M,G,T,P} (2048-1757812499, default 1757812499):
Created a new partition 1 of type 'Linux' and of size 838.2 GiB.
Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
[root@ip-10-0-23-93 ~]# lsblk # we now have a partition for our NFS
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme4n1 259:0 0 300G 0 disk
├─nvme4n1p1 259:2 0 1M 0 part
├─nvme4n1p2 259:3 0 127M 0 part
├─nvme4n1p3 259:4 0 384M 0 part /boot
└─nvme4n1p4 259:5 0 299.5G 0 part /var/lib/kubelet/pods/29f507c2-4b91-4db2-a68b-0a2f4694c6ed/volume-subpaths/nginx-conf/networking-console-plugin/1
/var
/sysroot/ostree/deploy/rhcos/var
/sysroot
/usr
/etc
/
nvme0n1 259:1 0 838.2G 0 disk
└─nvme0n1p1 259:9 0 838.2G 0 part
nvme1n1 259:6 0 838.2G 0 disk
nvme2n1 259:7 0 838.2G 0 disk
nvme3n1 259:8 0 838.2G 0 disk
[root@ip-10-0-23-93 ~]# mkdir -p /var/exports/openshift # create our mount point under /var since it is one of the few writable paths on the OS (RHCOS keeps most of the filesystem read-only)
[root@ip-10-0-23-93 ~]# mkfs.xfs /dev/nvme0n1p1
meta-data=/dev/nvme0n1p1 isize=512 agcount=4, agsize=54931577 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1 bigtime=1 inobtcount=1 nrext64=0
data = bsize=4096 blocks=219726306, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=107288, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Discarding blocks...Done.
[root@ip-10-0-23-93 ~]# mount /dev/nvme0n1p1 /var/exports/openshift
[root@ip-10-0-23-93 ~]# chmod 755 /var/exports/openshift
[root@ip-10-0-23-93 ~]# chown -R nfsnobody: /var/exports/openshift
[root@ip-10-0-23-93 ~]# semanage fcontext --add --type nfs_t "/var/exports/openshift(/.*)?"
[root@ip-10-0-23-93 ~]# restorecon -R -v /var/exports/openshift
Relabeled /var/exports/openshift from system_u:object_r:unlabeled_t:s0 to system_u:object_r:nfs_t:s0
[root@ip-10-0-23-93 ~]# echo "/var/exports/openshift *(insecure,no_root_squash,async,rw)" >> /etc/exports
[root@ip-10-0-23-93 ~]# cat /etc/exports
/var/exports/openshift *(insecure,no_root_squash,async,rw)
[root@ip-10-0-23-93 ~]# systemctl enable --now nfs-server
Created symlink /etc/systemd/system/multi-user.target.wants/nfs-server.service → /usr/lib/systemd/system/nfs-server.service.
[root@ip-10-0-23-93 ~]# systemctl status nfs-server
● nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; preset: disabled)
Drop-In: /run/systemd/generator/nfs-server.service.d
└─order-with-mounts.conf
Active: active (exited) since Wed 2025-07-09 21:43:12 UTC; 6s ago
Docs: man:rpc.nfsd(8)
man:exportfs(8)
Process: 113652 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
Process: 113653 ExecStart=/usr/sbin/rpc.nfsd (code=exited, status=0/SUCCESS)
Process: 113668 ExecStart=/bin/sh -c if systemctl -q is-active gssproxy; then systemctl reload gssproxy ; fi (code=exited, status=0/SUCCESS)
Main PID: 113668 (code=exited, status=0/SUCCESS)
CPU: 20ms
Jul 09 21:43:12 ip-10-0-23-93 systemd[1]: Starting NFS server and services...
Jul 09 21:43:12 ip-10-0-23-93 systemd[1]: Finished NFS server and services.
[root@ip-10-0-23-93 ~]# exportfs -av
exporting *:/var/exports/openshift
[root@ip-10-0-23-93 ~]# showmount -e localhost
Export list for localhost:
/var/exports/openshift *
[root@ip-10-0-23-93 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
nvme4n1 259:0 0 300G 0 disk
├─nvme4n1p1 259:2 0 1M 0 part
├─nvme4n1p2 259:3 0 127M 0 part
├─nvme4n1p3 259:4 0 384M 0 part /boot
└─nvme4n1p4 259:5 0 299.5G 0 part /var/lib/kubelet/pods/29f507c2-4b91-4db2-a68b-0a2f4694c6ed/volume-subpaths/nginx-conf/networking-console-plugin/1
/var
/sysroot/ostree/deploy/rhcos/var
/sysroot
/usr
/etc
/
nvme0n1 259:1 0 838.2G 0 disk
└─nvme0n1p1 259:9 0 838.2G 0 part /var/exports/openshift
nvme1n1 259:6 0 838.2G 0 disk
nvme2n1 259:7 0 838.2G 0 disk
nvme3n1 259:8 0 838.2G 0 disk
sh-5.1# exit
exit
Removing debug pod ...
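One caveat worth noting: the export directory, its data, and /etc/exports all survive a reboot, but the manual mount of /dev/nvme0n1p1 does not, so the export would come up empty after a restart. A minimal sketch of one way to handle that, assuming you look up the filesystem UUID with blkid (the UUID below is a placeholder), is an /etc/fstab entry that mounts by UUID rather than by a device name that may change:
# Find the UUID of the new XFS filesystem and mount it by UUID on boot (UUID value is a placeholder)
[root@ip-10-0-23-93 ~]# blkid /dev/nvme0n1p1
[root@ip-10-0-23-93 ~]# echo "UUID=<uuid-from-blkid> /var/exports/openshift xfs defaults 0 0" >> /etc/fstab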
Alright, at this point our node is set up for NFS and we are ready to add the NFS CSI driver to the cluster so we can create PVCs that get bound to PVs backed by the NFS export.
Make sure you are targeting the appropriate cluster; the easiest way to check is to run `oc cluster-info` and verify the output shows the expected cluster. You also need Helm installed, which you can get from https://helm.sh.
# Add the csi-driver-nfs repo to your local helm
❯ helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
# Verify that the repo's contents are visible
❯ helm search repo -l csi-driver-nfs
NAME CHART VERSION APP VERSION DESCRIPTION
csi-driver-nfs/csi-driver-nfs 4.11.0 4.11.0 CSI NFS Driver for Kubernetes
csi-driver-nfs/csi-driver-nfs v4.10.0 v4.10.0 CSI NFS Driver for Kubernetes
csi-driver-nfs/csi-driver-nfs v4.9.0 v4.9.0 CSI NFS Driver for Kubernetes
csi-driver-nfs/csi-driver-nfs v4.8.0 v4.8.0 CSI NFS Driver for Kubernetes
...
# Perform the install - I disable all snapshot support because we don't need it
❯ helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --version 4.11.0 --create-namespace --namespace csi-driver-nfs --set controller.runOnControlPlane=true --set externalSnapshotter.enabled=false --set externalSnapshotter.customResourceDefinitions.enabled=false
W0709 16:52:04.134356 119839 warnings.go:70] would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true), privileged (container "nfs" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (containers "liveness-probe", "node-driver-registrar", "nfs" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nfs" must not include "SYS_ADMIN" in securityContext.capabilities.add), restricted volume types (volumes "socket-dir", "pods-mount-dir", "registration-dir" use restricted volume type "hostPath"), runAsNonRoot != true (pod or containers "liveness-probe", "node-driver-registrar", "nfs" must set securityContext.runAsNonRoot=true)
W0709 16:52:04.195590 119839 warnings.go:70] would violate PodSecurity "restricted:latest": host namespaces (hostNetwork=true), privileged (container "nfs" must not set securityContext.privileged=true), allowPrivilegeEscalation != false (containers "csi-provisioner", "csi-resizer", "csi-snapshotter", "liveness-probe", "nfs" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nfs" must not include "SYS_ADMIN" in securityContext.capabilities.add), restricted volume types (volume "pods-mount-dir" uses restricted volume type "hostPath"), runAsNonRoot != true (pod or containers "csi-provisioner", "csi-resizer", "csi-snapshotter", "liveness-probe", "nfs" must set securityContext.runAsNonRoot=true)
NAME: csi-driver-nfs
LAST DEPLOYED: Wed Jul 9 16:52:02 2025
NAMESPACE: csi-driver-nfs
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The CSI NFS Driver is getting deployed to your cluster.
To check CSI NFS Driver pods status, please run:
kubectl --namespace=csi-driver-nfs get pods --selector="app.kubernetes.io/instance=csi-driver-nfs" --watch
# Update the serviceaccounts permissions
❯ oc adm policy add-scc-to-user privileged -z csi-nfs-node-sa -n csi-driver-nfs
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "csi-nfs-node-sa"
❯ oc adm policy add-scc-to-user privileged -z csi-nfs-controller-sa -n csi-driver-nfs
clusterrole.rbac.authorization.k8s.io/system:openshift:scc:privileged added: "csi-nfs-controller-sa"
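Before moving on it is worth a quick check that the driver pods actually come up:
# Verify the NFS CSI controller and node pods reach the Running state
❯ oc get pods -n csi-driver-nfs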
Alright, now we need a storage class and a PVC to verify a PV is created using our NFS setup:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nfs-csi
  annotations:
    description: Provides RWX Filesystem & Block volumes
provisioner: nfs.csi.k8s.io
parameters:
  server: 127.0.0.1
  share: /var/exports/openshift
  subDir: '${pvc.metadata.namespace}-${pvc.metadata.name}-${pv.metadata.name}'
reclaimPolicy: Delete
allowVolumeExpansion: true
volumeBindingMode: Immediate
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: nfs-ocpv-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: nfs-csi
  volumeMode: Filesystem
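Save those two manifests and apply them (the file names here are just what I would call them), then confirm the claim binds, which proves dynamic provisioning against the NFS export is working:
# Create the StorageClass and the test PVC, then confirm the PVC reaches the Bound state
❯ oc apply -f nfs-csi-storageclass.yaml
❯ oc apply -f nfs-ocpv-pvc.yaml
❯ oc get pvc nfs-ocpv-pvc -n default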
Setting up the LVM Operator is really straightforward. We first install the operator from OperatorHub (if you prefer the CLI, a sketch of that install follows the disk listing below) and then create an LVMCluster resource. We add all of the storage disks; any that already carry a filesystem will be skipped by the operator. To avoid issues with disk names or other attributes we might typically use to reference the disks, we will use /dev/disk/by-path:
[root@ip-10-0-23-93 /]# ls -lha /dev/disk/by-path/
total 0
drwxr-xr-x. 2 root root 220 Jul 29 01:12 .
drwxr-xr-x. 9 root root 180 Jul 29 01:12 ..
lrwxrwxrwx. 1 root root 13 Jul 29 01:12 pci-0000:18:00.0-nvme-1 -> ../../nvme0n1
lrwxrwxrwx. 1 root root 13 Jul 29 01:12 pci-0000:18:00.1-nvme-1 -> ../../nvme1n1
lrwxrwxrwx. 1 root root 13 Jul 29 01:12 pci-0000:90:00.0-nvme-1 -> ../../nvme3n1
lrwxrwxrwx. 1 root root 15 Jul 29 01:12 pci-0000:90:00.0-nvme-1-part1 -> ../../nvme3n1p1
lrwxrwxrwx. 1 root root 15 Jul 29 01:12 pci-0000:90:00.0-nvme-1-part2 -> ../../nvme3n1p2
lrwxrwxrwx. 1 root root 15 Jul 29 01:12 pci-0000:90:00.0-nvme-1-part3 -> ../../nvme3n1p3
lrwxrwxrwx. 1 root root 15 Jul 29 01:12 pci-0000:90:00.0-nvme-1-part4 -> ../../nvme3n1p4
lrwxrwxrwx. 1 root root 13 Jul 29 01:12 pci-0000:e7:00.0-nvme-1 -> ../../nvme4n1
lrwxrwxrwx. 1 root root 13 Jul 29 01:12 pci-0000:e7:00.1-nvme-1 -> ../../nvme2n1
[root@ip-10-0-23-93 /]#
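As mentioned above, the operator itself is installed from OperatorHub in the console. If you would rather do that from the CLI, a rough sketch is below; the package name, channel, and catalog source are assumptions on my part, so confirm them against what OperatorHub lists for your cluster version:
# Namespace, OperatorGroup, and Subscription for the LVM Operator (channel/package values are assumptions)
apiVersion: v1
kind: Namespace
metadata:
  name: openshift-storage
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: openshift-storage-operatorgroup
  namespace: openshift-storage
spec:
  targetNamespaces:
    - openshift-storage
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: lvms-operator
  namespace: openshift-storage
spec:
  channel: stable-4.19        # assumption: pick the channel matching your OpenShift version
  name: lvms-operator         # assumption: package name as shown in OperatorHub
  source: redhat-operators
  sourceNamespace: openshift-marketplace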
Once we have the paths of the disks we then want to create our LVMCluster resource and add them:
apiVersion: lvm.topolvm.io/v1alpha1
kind: LVMCluster
metadata:
  name: lvmstor
  namespace: openshift-storage
spec:
  storage:
    deviceClasses:
      - default: true
        deviceSelector:
          optionalPaths:
            - /dev/disk/by-path/pci-0000:18:00.0-nvme-1
            - /dev/disk/by-path/pci-0000:18:00.1-nvme-1
            - /dev/disk/by-path/pci-0000:e7:00.1-nvme-1
            - /dev/disk/by-path/pci-0000:90:00.0-nvme-1
            - /dev/disk/by-path/pci-0000:e7:00.0-nvme-1
        fstype: xfs
        name: vg0
We add all the disks using the optionalPaths key because if the cluster node is restarted for any reason, the disk that was previously /dev/nvme0n1 may come back under a different name. Listing every disk means that, on the initial setup, any disk that already has a filesystem is ignored while the remaining disks are set up for LVM. If the node is later restarted and the names change, the disks still carry the right signatures, so the operator keeps using exactly the disks it is supposed to.
At this point we have two storage classes, nfs-csi and lvms-vg0: nfs-csi gives us ReadWriteMany volumes over NFS, and lvms-vg0 gives us fast local storage for VM disks.
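A quick way to confirm both classes are present:
# List storage classes; lvms-vg0 is created automatically once the LVMCluster is ready
❯ oc get storageclass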