Bug
Resolution: Fixed
Major
Flounder
None
M3ULCB
Kingfisher M04
SSD WDC WDS512G1X0C-00ENX0
AGL/master, R-Car BSP 3.4.0
Writing to a partition on the NVMe device leads to a severe, unrecoverable kernel crash.
To reproduce:
- install the M.2 NVMe SSD on the Kingfisher board
- boot the AGL/master image
- log in as root
- run the following commands:
# partition the SSD device and create 1 big partition
fdisk /dev/nvme0n1
# format the partition
mkfs.ext4 /dev/nvme0n1p1
# mount the partition
mount /dev/nvme0n1p1 /mnt
# do some writes (command may have to be repeated)
while true; do date; dd if=/dev/zero of=/mnt/bar bs=4M count=500; sync; done
...
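The write loop above runs until the board crashes or it is interrupted. When scripting the reproduction, a bounded variant can be more convenient. The following is only a sketch and not part of the original report: the function name `write_stress` and its parameters are hypothetical, and it assumes the partition is already formatted and mounted on the target directory.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): repeat the
# dd + sync write pattern a fixed number of times.
# Usage: write_stress <target_dir> <iterations> [block_size] [block_count]
write_stress() {
    target_dir=$1
    iterations=$2
    block_size=${3:-4M}     # defaults match the command in the report
    block_count=${4:-500}
    i=1
    while [ "$i" -le "$iterations" ]; do
        date
        # Overwrite the same file on the mounted partition each time.
        dd if=/dev/zero of="$target_dir/bar" bs="$block_size" count="$block_count" || return 1
        sync
        i=$((i + 1))
    done
}

# Example, on the target after mounting /dev/nvme0n1p1 on /mnt:
#   write_stress /mnt 20
```

A non-zero return from `write_stress` means `dd` itself failed; a kernel crash like the one reported below would instead stop the loop without any return value.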
This sometimes leads to a kernel crash:
[ 193.429021] Unable to handle kernel paging request at virtual address dead000000000108
[ 193.436940] Mem abort info:
[ 193.439729] Exception class = DABT (current EL), IL = 32 bits
[ 193.445642] SET = 0, FnV = 0
[ 193.448689] EA = 0, S1PTW = 0
[ 193.451822] Data abort info:
[ 193.454696] ISV = 0, ISS = 0x00000044
[ 193.458524] CM = 0, WnR = 1
[ 193.461484] [dead000000000108] address between user and kernel address ranges
[ 193.468614] Internal error: Oops: 96000044 [#1] PREEMPT SMP
[ 193.474181] Modules linked in: rfcomm bnep crc32_ce crct10dif_ce nvme nvme_core btusb btrtl btbcm btintel pvrsrvkm(O) rcar_can can_dev btwilink bluetooth st_drv ecdh_generic rfkill vspm_if(O) vsp2(O) vspm(O) uvcs_drv(O) mmngrbuf(O) mmngr(O)
[ 193.495515] CPU: 1 PID: 4255 Comm: dd Tainted: G B O 4.14.0-yocto-standard #1
[ 193.503598] Hardware name: Renesas M3ULCB Kingfisher board based on r8a7796 (DT)
[ 193.510987] task: ffff8005e5568e00 task.stack: ffff000022978000
[ 193.516910] PC is at __rmqueue+0x3cc/0x4c8
[ 193.521001] LR is at get_page_from_freelist+0x5e8/0xa20
[ 193.526221] pc : [<ffff00000818e9ac>] lr : [<ffff00000818fef0>] pstate: a00001c5
[ 193.533608] sp : ffff00002297b850
[ 193.536916] x29: ffff00002297b850 x28: ffff8005fff7df90
[ 193.542224] x27: 0000000000000001 x26: 00007ffa00082040
[ 193.547531] x25: fffffffffffffef0 x24: ffff8005fff7dfc0
[ 193.552837] x23: ffff8005fff7df80 x22: 0000000000000001
[ 193.558144] x21: ffff8005fff7de80 x20: 0000000000000000
[ 193.563450] x19: 0000000000000010 x18: 0000000000000002
[ 193.568756] x17: 0000ffff96eab7a8 x16: ffff0000082108d8
[ 193.574062] x15: 0000000000000000 x14: 0000000000000000
[ 193.579368] x13: 0000000000000000 x12: 0000000000000000
[ 193.584674] x11: 0000000000000000 x10: 00000000ffffff80
[ 193.589981] x9 : ffff7e0000710020 x8 : dead000000000100
[ 193.595288] x7 : ffff8005fff7e3a0 x6 : dead000000000100
[ 193.600594] x5 : ffff8005fff7e3d0 x4 : 0000000000000410
[ 193.605900] x3 : 000000000000000a x2 : 000000000000000a
[ 193.611206] x1 : ffff8005fff7e3d0 x0 : ffff7e0000710000
[ 193.616514] Process dd (pid: 4255, stack limit = 0xffff000022978000)
[ 193.622860] Call trace:
[ 193.625301] Exception stack(0xffff00002297b710 to 0xffff00002297b850)
[ 193.631735] b700: ffff7e0000710000 ffff8005fff7e3d0
[ 193.639559] b720: 000000000000000a 000000000000000a 0000000000000410 ffff8005fff7e3d0
[ 193.647381] b740: dead000000000100 ffff8005fff7e3a0 dead000000000100 ffff7e0000710020
[ 193.655203] b760: 00000000ffffff80 0000000000000000 0000000000000000 0000000000000000
[ 193.663026] b780: 0000000000000000 0000000000000000 ffff0000082108d8 0000ffff96eab7a8
[ 193.670848] b7a0: 0000000000000002 0000000000000010 0000000000000000 ffff8005fff7de80
[ 193.678670] b7c0: 0000000000000001 ffff8005fff7df80 ffff8005fff7dfc0 fffffffffffffef0
[ 193.686492] b7e0: 00007ffa00082040 0000000000000001 ffff8005fff7df90 ffff00002297b850
[ 193.694314] b800: ffff00000818fef0 ffff00002297b850 ffff00000818e9ac 00000000a00001c5
[ 193.702136] b820: ffff8005fd850d80 ffff7e00002e5f00 0001000000000000 0000000000000028
[ 193.709957] b840: ffff00002297b850 ffff00000818e9ac
[ 193.714832] [<ffff00000818e9ac>] __rmqueue+0x3cc/0x4c8
[ 193.719965] [<ffff00000818fef0>] get_page_from_freelist+0x5e8/0xa20
[ 193.726227] [<ffff0000081908e8>] __alloc_pages_nodemask+0xd8/0xbf0
[ 193.732403] [<ffff0000081e3c7c>] alloc_pages_current+0x7c/0xe8
[ 193.738230] [<ffff000008186f58>] __page_cache_alloc+0x98/0xb8
[ 193.743970] [<ffff000008187020>] pagecache_get_page+0xa8/0x280
[ 193.749796] [<ffff00000818721c>] grab_cache_page_write_begin+0x24/0x40
[ 193.756318] [<ffff0000082b2ec0>] ext4_da_write_begin+0xb8/0x3b0
[ 193.762230] [<ffff000008186d48>] generic_perform_write+0x90/0x178
[ 193.768317] [<ffff000008189b20>] __generic_file_write_iter+0x100/0x1c8
[ 193.774839] [<ffff0000082a07e4>] ext4_file_write_iter+0x10c/0x408
[ 193.780928] [<ffff000008210444>] __vfs_write+0xac/0x118
[ 193.786146] [<ffff000008210688>] vfs_write+0xa0/0x190
[ 193.791191] [<ffff000008210920>] SyS_write+0x48/0xb0
[ 193.796148] Exception stack(0xffff00002297bec0 to 0xffff00002297c000)
[ 193.802582] bec0: 0000000000000001 0000ffff969ea000 0000000000400000 0000ffff96f3c000
[ 193.810405] bee0: 0000000000400000 0000000000000000 0000ffff969ea000 0000aaaad769fb00
[ 193.818226] bf00: 0000000000000040 0000ffff96f7b260 0000000000010080 0000000000000000
[ 193.826048] bf20: 0000000000000001 000000000000270f 0000000000002010 0000000000000000
[ 193.833870] bf40: 0000aaaad76bae38 0000ffff96eab7a8 0000000000000002 0000000000000001
[ 193.841692] bf60: 0000000000400000 0000aaaad76bb130 0000000000000000 0000ffff969ea000
[ 193.849513] bf80: 0000aaaad76bb000 0000000000000001 0000ffff96f7b280 00000000000000e5
[ 193.857335] bfa0: 0000000000400000 0000ffffe2c71b70 0000aaaad769ff94 0000ffffe2c71b70
[ 193.865157] bfc0: 0000ffff96eab7d0 0000000060000000 0000000000000001 0000000000000040
[ 193.872979] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 193.880804] [<ffff000008083808>] __sys_trace_return+0x0/0x4
[ 193.886371] Code: f1008120 54000740 a9401526 2a0303e2 (f90004c5)
[ 193.892462] ---[ end trace 01f5d11d2793b9dd ]---
[ 193.897155] note: dd[4255] exited with preempt_count 1
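For readers less familiar with arm64 oops output, the `Oops: 96000044` value in the crash above is the exception syndrome (ESR), and the `Mem abort info` lines are its decoded fields. The sketch below re-derives those fields with shell arithmetic. The function name `decode_esr` is hypothetical, and the bit positions (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for data aborts WnR in bit 6 and the fault status code in bits 5:0) are taken from the Arm architecture definition of ESR_ELx, not from this report.

```shell
#!/bin/sh
# Hypothetical helper: split an arm64 ESR value into the fields
# printed in the "Mem abort info" section of the oops.
decode_esr() {
    esr=$(($1))
    ec=$(( (esr >> 26) & 0x3f ))   # exception class
    il=$(( (esr >> 25) & 0x1 ))    # 1 = 32-bit instruction
    iss=$(( esr & 0x1ffffff ))     # instruction-specific syndrome
    wnr=$(( (iss >> 6) & 0x1 ))    # 1 = fault caused by a write
    dfsc=$(( iss & 0x3f ))         # data fault status code
    printf 'EC=0x%02x IL=%d ISS=0x%08x WnR=%d DFSC=0x%02x\n' \
        "$ec" "$il" "$iss" "$wnr" "$dfsc"
}

decode_esr 0x96000044
# prints: EC=0x25 IL=1 ISS=0x00000044 WnR=1 DFSC=0x04
```

EC 0x25 is a data abort taken from the current exception level, and WnR=1 marks a write access, which agrees with the `DABT (current EL)`, `ISS = 0x00000044` and `WnR = 1` lines in the log.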
In other runs, the NVMe controller goes down and is reset, and a host memory limit is exceeded:
[ 1122.421243] nvme nvme0: async event result 00010300
...
[ 1152.989240] nvme nvme0: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
[ 1153.254930] nvme nvme0: Shutdown timeout set to 60 seconds
[ 1153.260451] nvme nvme0: NPSS is invalid; not using APST
[ 1153.265721] nvme nvme0: min host memory (2105376 MiB) above limit (128 MiB).