The Mayastor process has been sent the SIGILL signal as the result of attempting to execute an illegal instruction. This indicates that the host node's CPU does not satisfy the prerequisite instruction set level for Mayastor (SSE4.2 on x86-64).
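One way to check a node in advance is to look for the SSE4.2 feature flag that the Linux kernel reports in /proc/cpuinfo. The following is a minimal sketch, not part of Mayastor; the helper names `cpu_flags` and `has_sse42` are hypothetical, and it assumes a Linux x86-64 host, where the flag is spelled `sse4_2`.

```python
def cpu_flags(cpuinfo_text):
    """Collect CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only "flags" lines list the instruction-set extensions.
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_sse42(cpuinfo_text):
    """Return True if the text advertises SSE4.2 (kernel flag name: sse4_2)."""
    return "sse4_2" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as f:
            text = f.read()
    except OSError:
        print("cannot read /proc/cpuinfo (not a Linux host?)")
    else:
        print("SSE4.2 supported" if has_sse42(text) else "SSE4.2 missing")
```

Running this on each candidate node before installation gives an early indication of whether the SIGILL failure described above is likely.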
In addition to ensuring that the general prerequisites for installation are met, it is necessary to add the following directory mapping to the services_kublet->extra_binds section of the cluster's
If this is not done, CSI socket paths won't match expected values and the Mayastor CSI driver registration process will fail, resulting in the inability to provision Mayastor volumes on the cluster.
If the disk device used by a Mayastor pool becomes inaccessible or enters the offline state, the hosting Mayastor pod may panic. A fix for this behaviour is under investigation.
Rebooting a node that runs applications mounting Mayastor volumes can take tens of minutes. The cause is the long default NVMe controller timeout (ctrl_loss_tmo). The solution is to follow Kubernetes best practice and cordon the node, ensuring that no application pods are running on it before the reboot. The ioTimeout storage class parameter can be used to fine-tune the timeout.
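The cordon-then-reboot procedure can be scripted. The sketch below only assembles the standard `kubectl cordon` and `kubectl drain` commands; the function names and the node name `worker-1` are hypothetical, and `--ignore-daemonsets` is an assumption that suits clusters where DaemonSet-managed pods should stay in place. Depending on the workloads, drain may need further flags.

```python
import subprocess


def reboot_prep_commands(node):
    """Build the kubectl commands that take a node out of service."""
    return [
        # Stop new pods from being scheduled onto the node.
        ["kubectl", "cordon", node],
        # Evict the remaining application pods; DaemonSet pods are skipped.
        ["kubectl", "drain", node, "--ignore-daemonsets"],
    ]


def prepare_node(node, dry_run=True):
    """Print the commands (dry run) or execute them in order."""
    for cmd in reboot_prep_commands(node):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True aborts the sequence if kubectl reports an error.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    prepare_node("worker-1")  # hypothetical node name, dry run only
```

After the reboot, `kubectl uncordon` makes the node schedulable again.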
Deploying an application pod on a worker node that hosts both Mayastor and the Prometheus exporter causes that node to restart. The issue is caused by a kernel bug: once the nexus disconnects, the entries under /host/sys/class/hwmon/ should be removed, but in this case they are not. (The issue was fixed via this kernel patch.)
Fix: Use kernel version 5.13 or later if deploying Mayastor in conjunction with the Prometheus metrics exporter.
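A node's kernel release can be compared against this 5.13 minimum before the exporter is deployed. The following is a rough sketch; the helper names are hypothetical, and it looks only at the leading major.minor numbers of the release string, so it cannot tell whether a distribution has backported the fix to an older kernel.

```python
import platform
import re

MIN_KERNEL = (5, 13)  # minimum version named in the fix above


def kernel_version(release):
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_is_new_enough(release):
    """Compare numerically, so that 5.9 sorts below 5.13."""
    return kernel_version(release) >= MIN_KERNEL


if __name__ == "__main__":
    release = platform.release()
    try:
        ok = kernel_is_new_enough(release)
    except ValueError as err:
        print(err)
    else:
        print(f"{release}: {'OK' if ok else 'older than 5.13'}")
```

The tuple comparison is deliberate: comparing the strings "5.9" and "5.13" directly would order them incorrectly.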