We would like to be able to create a LocalDisk object using a symlink. The main benefits are:
1. It is much easier for a sysadmin to know which disk to use by referencing something like /dev/disk/by-id/wwn-0x5000c5009bae0638 as opposed to /dev/sdf; the latter name can change between nodes and reboots, which makes things less predictable (illustrated just below).
2. It is much easier to automate out of the box.
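As a small illustration (generic Linux, not tied to any particular cluster; the wwn path is just the example from above), the by-id names are simply symlinks that resolve to the unstable kernel names:

# stable, hardware-derived names are symlinks to kernel device names
ls -l /dev/disk/by-id/
# the same disk is always reachable via its by-id name, while the target
# (sdf, nvme1n1, ...) may differ between nodes or reboots
readlink -f /dev/disk/by-id/wwn-0x5000c5009bae0638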
In my case we have an AWS cluster with three masters and three workers. The same EBS volume is attached to each of the three workers (a shared LUN). I followed https://www.ibm.com/docs/en/scalecontainernative/5.2.2?topic=systems-local-file-system#specify-device-names, so my Cluster object looks like the following:
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: Cluster
metadata:
  name: ibm-spectrum-scale
  namespace: ibm-spectrum-scale
spec:
  pmcollector:
    nodeSelector:
      scale.spectrum.ibm.com/daemon-selector: ""
  daemon:
    nsdDevicesConfig:
      localDevicePaths:
      - devicePath: /dev/disk/by-id/nvme*
        deviceType: generic
    clusterProfile:
      controlSetxattrImmutableSELinux: "yes"
      enforceFilesetQuotaOnRoot: "yes"
      ignorePrefetchLUNCount: "yes"
      initPrefetchBuffers: "128"
      maxblocksize: 16M
      prefetchPct: "25"
      prefetchTimeout: "30"
    nodeSelector:
      scale.spectrum.ibm.com/daemon-selector: ""
    roles:
    - name: client
      resources:
        cpu: "2"
        memory: 4Gi
    - name: storage
      resources:
        cpu: "2"
        memory: 8Gi
  license:
    accept: true
    license: data-management
  networkPolicy: {}
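As a quick sanity check (my own addition, not from the IBM docs), one can confirm that the /dev/disk/by-id/nvme* wildcard in localDevicePaths actually matches something on every worker:

# loop over the workers and list whatever the wildcard expands to on the host
for NODE in $(oc get nodes -l node-role.kubernetes.io/worker -o name); do
  echo "== ${NODE} =="
  oc debug "${NODE}" -- chroot /host sh -c 'ls -l /dev/disk/by-id/nvme*'
done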
In a default installation (three worker nodes with a single shared LUN referenced by a symlink), I create the following LocalDisk:
apiVersion: scale.spectrum.ibm.com/v1beta1
kind: LocalDisk
metadata:
  name: shareddisk1
  namespace: ibm-spectrum-scale
spec:
  device: "/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615"
  node: ip-10-0-54-148.eu-central-1.compute.internal
  nodeConnectionSelector:
    matchExpressions:
    - key: node-role.kubernetes.io/worker
      operator: Exists
  existingDataSkipVerify: true
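For reference, the manifest can be applied and its conditions watched with something like this (the file name is arbitrary):

# apply the LocalDisk above and watch the operator reconcile it
oc apply -f shareddisk1-localdisk.yaml
oc get localdisk -n ibm-spectrum-scale shareddisk1 -w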
The device symlink exists on the node:
# device path configured in the LocalDisk
DEVICE=$(oc get localdisk -n ibm-spectrum-scale shareddisk1 -o jsonpath="{.spec.device}")
# first worker node
NODE=$(oc get nodes -l node-role.kubernetes.io/worker -o jsonpath="{range .items[0]}{.metadata.name}{'\n'}")
# check the path on the node itself
oc debug node/${NODE} -- chroot /host sh -c "ls -l ${DEVICE}"
This outputs:
++ oc get localdisk -n ibm-spectrum-scale shareddisk1 -o 'jsonpath={.spec.device}'
+ DEVICE=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615
++ oc get nodes -l node-role.kubernetes.io/worker -o 'jsonpath={range .items[0]}{.metadata.name}{'\''\n'\''}'
+ NODE=ip-10-0-54-148.eu-central-1.compute.internal
+ oc debug node/ip-10-0-54-148.eu-central-1.compute.internal -- chroot /host sh -c 'ls -l /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615'
Starting pod/ip-10-0-54-148eu-central-1computeinternal-debug-2t2xs ...
To use host binaries, run `chroot /host`
lrwxrwxrwx. 1 root root 13 Feb 28 20:29 /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615 -> ../../nvme1n1
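For completeness, the same path can also be spot-checked from inside the Scale pod running on that node; pod names and labels vary between releases, so the pod has to be picked by hand here (the placeholder below is not a real pod name):

# list the Scale pods scheduled on that node and pick the core/daemon pod manually
oc get pods -n ibm-spectrum-scale -o wide | grep ${NODE}
# then, substituting the chosen pod name:
oc exec -n ibm-spectrum-scale <core-pod-on-that-node> -- ls -l ${DEVICE}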
So we know that the device symlink is present on the node. Yet the LocalDisk never gets created. Its status shows:
❯ oc get localdisk -n ibm-spectrum-scale shareddisk1 -o jsonpath="{.status}" | jq .
{
  "conditions": [
    {
      "lastTransitionTime": "2025-02-28T21:09:50Z",
      "message": "The local disk can be used by a filesystem.",
      "reason": "DiskNotUsed",
      "status": "False",
      "type": "Used"
    },
    {
      "lastTransitionTime": "2025-02-28T21:09:53Z",
      "message": "Device /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615 does not exist on node ip-10-0-54-148. Local disk cannot be created. Will try again.",
      "reason": "DeviceDoesNotExist",
      "status": "False",
      "type": "Ready"
    },
    {
      "lastTransitionTime": "2025-02-28T21:09:53Z",
      "message": "",
      "reason": "Unknown",
      "status": "Unknown",
      "type": "Available"
    }
  ],
  "failuregroup": "",
  "failuregroupMapping": "node=ip-10-0-54-148.eu-central-1.compute.internal",
  "filesystem": "",
  "nodes": {
    "NSDServers": "",
    "connections": ""
  },
  "pool": "",
  "size": "",
  "type": ""
}
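The failing condition can also be pulled out directly with a jsonpath filter, for example:

# print only the message of the Ready condition
oc get localdisk -n ibm-spectrum-scale shareddisk1 \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'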
It seems that the operator (via mmcrnsd) checks whether the device passed to the LocalDisk shows up in /proc/partitions. But /proc/partitions only lists kernel device names such as nvme1n1, not the /dev/disk/by-id symlinks, hence the failure? The manager.log shows many lines like:
/tmp/logs/gpfs-test.aws.validatedpatterns.io/ibm-spectrum-scale-controller-manager-d65bc9b87-cvs2f-manager.log:2025-02-28T21:08:44.049Z ERROR /usr/lpp/mmfs/bin/mmcrnsd returned non-zero returncode {"controller": "LocalDisk", "namespace": "ibm-spectrum-scale", "name": "shareddisk1", "reconcileID": "cffd257a-f4f5-4804-a787-ccaf97416492", "command": "/usr/lpp/mmfs/bin/mmcrnsd -F nsdStanza_vTjrMX -v no", "returncode": 1, "stdout": "mmcrnsd: Processing disk disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615\n", "stderr": "mmcrnsd: disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615 was not found in /proc/partitions.\nmmcrnsd: Failed while processing disk stanza on node ip-10-0-54-148.admin.ibm-spectrum-scale.stg.gpfs-test.aws.validatedpatterns.io..\n %nsd: device=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615\n nsd=shareddisk1\n servers=ip-10-0-54-148.admin.ibm-spectrum-scale.stg.gpfs-test.aws.validatedpatterns.io.\n usage=dataAndMetadata\n failureGroup=0\n pool=system\n thinDiskType=no\nmmcrnsd: Command failed. Examine previous error messages to determine cause.\n", "error": "command terminated with exit code 1"}
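To confirm the mismatch on the node (my own check, not something the operator runs), resolving the symlink and grepping /proc/partitions for the kernel name shows that only the nvmeXnY name is listed there:

NODE=ip-10-0-54-148.eu-central-1.compute.internal
DEVICE=/dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol019ce0561832fd615
# readlink -f resolves the by-id symlink to the kernel name (here /dev/nvme1n1);
# /proc/partitions only ever lists kernel names, never by-id symlinks
oc debug node/${NODE} -- chroot /host sh -c \
  "readlink -f ${DEVICE}; grep \$(basename \$(readlink -f ${DEVICE})) /proc/partitions"

Presumably putting the resolved kernel name (/dev/nvme1n1) in spec.device would work around the error, but that defeats the whole point of using the stable by-id name, hence this request.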
Log files from all containers in all of the IBM Spectrum Scale namespaces are here.