I've confirmed that the udev and kernel ACTION=change events on /dev/sda are identical across the failing and succeeding instances, apart from some timestamps (which I edited out of these two documents) and the ordering/naming of the DEVLINKS:
$ diff -u failing.sda_event success.sda_event
--- failing.sda_event 2019-07-26 10:25:13.088884853 -0400
+++ success.sda_event 2019-07-26 10:23:51.465073329 -0400
@@ -41,5 +41,5 @@
 ID_FS_TYPE=
 MAJOR=8
 MINOR=0
-DEVLINKS=/dev/disk/by-id/wwn-0x60022480d7a4872310ff8d8a6789eec9 /dev/disk/azure/root /dev/disk/by-path/acpi-VMBUS:01-vmbus-00000000000088990000000000000000-lun-0 /dev/disk/by-id/scsi-14d53465420202020d7a4872310ff41908c3b8d8a6789eec9 /dev/disk/cloud/azure_root /dev/disk/by-id/scsi-360022480d7a4872310ff8d8a6789eec9
+DEVLINKS=/dev/disk/by-path/acpi-VMBUS:01-vmbus-00000000000088990000000000000000-lun-0 /dev/disk/azure/root /dev/disk/by-id/scsi-360022480d7a4872310ff8d8a6789eec9 /dev/disk/by-id/wwn-0x60022480d7a4872310ff8d8a6789eec9 /dev/disk/by-id/scsi-14d53465420202020d7a4872310ff41908c3b8d8a6789eec9 /dev/disk/cloud/azure_root
 TAGS=:systemd:
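To check that the only remaining difference really is the DEVLINKS ordering, the two event dumps can be compared with the links sorted first. This is only a rough sketch: it assumes each dump is a series of KEY=VALUE lines like those in the diff above, and the function names and the `ignore` parameter are made up for illustration.

```python
def normalize_event(text, ignore=()):
    """Parse KEY=VALUE lines into a dict, sorting the DEVLINKS entries.

    `ignore` is a hypothetical convenience for dropping keys that are
    expected to differ between instances (e.g. timestamps).
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if "=" not in line:
            continue  # skip blank lines and anything that is not a property
        key, value = line.split("=", 1)
        if key in ignore:
            continue
        if key == "DEVLINKS":
            # DEVLINKS is a space-separated list; its order is not meaningful here
            value = " ".join(sorted(value.split()))
        props[key] = value
    return props


def differing_keys(event_a, event_b, ignore=()):
    """Return the sorted list of keys whose normalized values differ."""
    a = normalize_event(event_a, ignore)
    b = normalize_event(event_b, ignore)
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))
```

Running differing_keys() on the contents of failing.sda_event and success.sda_event should return an empty list if the link sets match and only their order differs.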
After the /dev/sda udev event is logged, each partition on /dev/sda {sda1, sda14, sda15} then gets its own ACTION=change udev event. One difference between the two instances is the order of these events: on the failing system it's (sda1, sda15, sda14), and on the succeeding system it's (sda15, sda14, sda1). This hints that a race is involved, since the timing differs, but it's far from conclusive. It's worth noting that the _kernel_ events (which are all emitted before even the sda udev event is processed) happen in the same order on each instance: sda, sda1, sda14, sda15.
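For anyone wanting to repeat the ordering comparison on other instances, here is a minimal sketch that pulls the per-source order of change events out of a log. It assumes the events were captured in the line format printed by `udevadm monitor` (for example "UDEV  [timestamp] change /devices/.../sda/sda1 (block)"), which may not match how the logs above were collected; the function name is hypothetical.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   KERNEL[123.456] change   /devices/.../block/sda/sda1 (block)
#   UDEV  [123.789] change   /devices/.../block/sda/sda1 (block)
EVENT_RE = re.compile(r"^(KERNEL|UDEV)\s*\[[\d.]+\]\s+(\w+)\s+(\S+)\s+\((\w+)\)")


def change_order(log_text):
    """Return {"KERNEL": [...], "UDEV": [...]} with device names in log order.

    Only ACTION=change events in the block subsystem are kept.
    """
    order = defaultdict(list)
    for line in log_text.splitlines():
        match = EVENT_RE.match(line)
        if not match:
            continue  # property lines, blank lines, monitor banner
        source, action, devpath, subsystem = match.groups()
        if action == "change" and subsystem == "block":
            # The last path component is the kernel device name, e.g. sda14
            order[source].append(devpath.rsplit("/", 1)[-1])
    return dict(order)
```

Comparing the "UDEV" lists from a failing and a succeeding boot would show the reordering described above, while the "KERNEL" lists should be the same on both.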