general protection fault

Hi,

I ended up in the following situation this morning on my x86_64 VM running Harlequin (openSUSE 13.2) with all the latest updates:

Feb 27 03:36:44 vprod kernel: general protection fault: 0000 [#1] SMP
Feb 27 03:36:44 vprod kernel: Modules linked in: sha256_ssse3 sha256_generic dm_crypt algif_skcipher af_alg nfs fscache nfsd auth_rpcgss oid_registry iscsi_ibft iscsi_boot_sysfs nfs_acl lockd sunrpc af_packet hid_generic xfs libcrc32c coretemp uas usb_storage crct10dif_pclmul crc32_pclmul usbhid ghash_clmulni_intel aesni_intel vmwgfx aes_x86_64 lrw gf128mul ppdev glue_helper ablk_helper cryptd serio_raw vmxnet3 vmw_balloon pcspkr ttm drm_kms_helper vmw_pvscsi i2c_piix4 mptctl vmw_vmci drm shpchp processor battery ac button parport_pc parport dm_mod btrfs xor raid6_pq ata_generic crc32c_intel uhci_hcd ehci_pci ehci_hcd usbcore usb_common ata_piix sr_mod cdrom mptspi scsi_transport_spi mptscsih mptbase floppy sg
Feb 27 03:36:44 vprod kernel: CPU: 2 PID: 1385 Comm: nfsd Not tainted 3.16.7-7-default #1
Feb 27 03:36:44 vprod kernel: Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/14/2014
Feb 27 03:36:44 vprod kernel: task: ffff8808052f6350 ti: ffff8808047c0000 task.ti: ffff8808047c0000
Feb 27 03:36:44 vprod kernel: RIP: 0010:[<ffffffff811c9598>] [<ffffffff811c9598>] __d_lookup+0x68/0x150
Feb 27 03:36:44 vprod kernel: RSP: 0018:ffff8808047c3c50 EFLAGS: 00010202
Feb 27 03:36:44 vprod kernel: RAX: ffffc90001e89218 RBX: 796ca1b4977da057 RCX: 000000000000000a
Feb 27 03:36:44 vprod kernel: RDX: ffffc90000020000 RSI: ffff8808047c3d10 RDI: ffff88018d03a558
Feb 27 03:36:44 vprod kernel: RBP: ffff88018d03a558 R08: 0000000131400000 R09: fffffffecec04c50
Feb 27 03:36:44 vprod kernel: R10: 0000000000000002 R11: 00000000000003e8 R12: ffff8808047c3d10
Feb 27 03:36:44 vprod kernel: R13: 0000000000000002 R14: 00000000117ffffe R15: ffff8800aa92108c
Feb 27 03:36:44 vprod kernel: FS: 0000000000000000(0000) GS:ffff88083fc40000(0000) knlGS:0000000000000000
Feb 27 03:36:44 vprod kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 27 03:36:44 vprod kernel: CR2: 00007fb374d66528 CR3: 0000000131b00000 CR4: 00000000001407e0
Feb 27 03:36:44 vprod kernel: Stack:
Feb 27 03:36:44 vprod kernel: ffff8800aa92108c ffff880207865f40 0000000000d56a1e ffff8808047c3d10
Feb 27 03:36:44 vprod kernel: ffff88018d03a558 ffff8808047c3cf7 0000000000000000 ffff8800aa92108c
Feb 27 03:36:44 vprod kernel: ffffffff811c96a5 0000000000000000 ffff88018d03a558 ffff8808047c3d10
Feb 27 03:36:44 vprod kernel: Call Trace:
Feb 27 03:36:44 vprod kernel: [<ffffffff811c96a5>] d_lookup+0x25/0x40
Feb 27 03:36:44 vprod kernel: [<ffffffff811bafbb>] lookup_dcache+0x2b/0xc0
Feb 27 03:36:44 vprod kernel: [<ffffffff811bb06a>] __lookup_hash+0x1a/0x40
Feb 27 03:36:44 vprod kernel: [<ffffffff811bbd6d>] lookup_one_len+0xcd/0x120
Feb 27 03:36:44 vprod kernel: [<ffffffffa05ccfae>] nfsd_lookup_dentry+0x11e/0x490 [nfsd]
Feb 27 03:36:44 vprod kernel: [<ffffffffa05cd379>] nfsd_lookup+0x59/0x120 [nfsd]
Feb 27 03:36:44 vprod kernel: [<ffffffffa05dbb58>] nfsd4_proc_compound+0x4e8/0x7d0 [nfsd]
Feb 27 03:36:44 vprod kernel: [<ffffffffa05c8d32>] nfsd_dispatch+0xb2/0x200 [nfsd]
Feb 27 03:36:44 vprod kernel: [<ffffffffa0570afb>] svc_process_common+0x41b/0x670 [sunrpc]
Feb 27 03:36:44 vprod kernel: [<ffffffffa0570e5c>] svc_process+0x10c/0x160 [sunrpc]
Feb 27 03:36:44 vprod kernel: [<ffffffffa05c86ef>] nfsd+0xbf/0x130 [nfsd]
Feb 27 03:36:44 vprod kernel: [<ffffffff8107c4ad>] kthread+0xbd/0xe0
Feb 27 03:36:44 vprod kernel: [<ffffffff815d0dbc>] ret_from_fork+0x7c/0xb0
Feb 27 03:36:44 vprod kernel: Code: 48 c1 e8 06 44 01 f0 69 c0 01 00 37 9e d3 e8 48 8d 04 c2 48 8b 18 48 83 e3 fe 75 0f eb 35 0f 1f 44 00 00 48 8b 1b 48 85 db 74 28 <44> 39 73 18 75 f2 4c 8d 7b 50 4c 89 ff e8 b6 73 40 00 48 39 6b
Feb 27 03:36:44 vprod kernel: RIP [<ffffffff811c9598>] __d_lookup+0x68/0x150
Feb 27 03:36:44 vprod kernel: RSP <ffff8808047c3c50>
Feb 27 03:36:44 vprod kernel: ---[ end trace 9181a4ed2dcbebbd ]---

This is the first time it has happened; the VM is almost 2 months old. The NFS server is rev 1.3.0-4.2.1.
I ended up having to reboot the VM.
Any ideas (other than to file a bug)?

Thanks

Igor

vSphere 5.5? Workstation? Fusion?

You need to

  1. Describe your setup in detail: VMware products, the HostOS, hardware, and other loads at the time of the GPF.
  2. Inspect at least a few (a dozen or more?) lines logged before a GPF/Halt type of error, both to verify that all was well with no minor issue leading up to the GPF, and to verify where in the process the error occurred.

All of the above are essential both for anyone in these Forums to help, and also should you need to report an issue (the issue has to be reproducible).
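
If it helps, pulling the lines that precede the fault out of a large log can be scripted. The following is only a rough sketch in Python: the function name `context_before`, the default of 50 lines, and the idea of passing the log path on the command line are my own assumptions, not anything your distribution provides.

```python
import sys
from collections import deque
from typing import Iterable, List


def context_before(lines: Iterable[str],
                   marker: str = "general protection fault",
                   before: int = 50) -> List[str]:
    """Return up to `before` lines preceding the first line that
    contains `marker`, followed by the matching line itself.
    Returns an empty list if the marker never appears."""
    window = deque(maxlen=before)  # keeps only the most recent lines
    for line in lines:
        if marker in line:
            return list(window) + [line]
        window.append(line)
    return []


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python gpf_context.py <path-to-log-file>
    with open(sys.argv[1], errors="replace") as log:
        for line in context_before(log):
            print(line, end="")
```

Run it against whichever file holds your kernel messages and post the output along with the trace.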

Also, I assume from your description this is when a Guest (13.2) boots.

More than likely this is a VMware or hardware issue unrelated to openSUSE at this point. In fact, it’s also more than likely a Host platform issue, not a Guest issue.

TSU

The same stack appears on a Debian 8 machine running on an HP server without virtualization, kernel 3.16.0-4-amd64.

My previous comment applies equally to your situation.
GPFs are very rare nowadays and are most often the result of a hardware issue. I would also recommend running a memory test or re-seating the memory (after cleaning the contacts with a pencil eraser).

Also, depending on the size and model of your server, you may have many memory slots. If your memory sticks aren’t all exactly identical (i.e., bought from the same lot), then you may need to carefully inspect the specs of each stick and your HP machine documentation to determine the exact order in which the sticks should be installed (and even then nothing is guaranteed if the memory sticks are dissimilar).
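
One detail in the first trace in this thread fits the picture of a corrupted pointer, whatever corrupted it. If I read the marked bytes in the Code: line correctly, `<44> 39 73 18` is `cmp %r14d,0x18(%rbx)`, and RBX holds 796ca1b4977da057. On x86-64, a memory access through a non-canonical address raises a general protection fault rather than a page fault. A small Python sketch to check the register values, assuming 48-bit virtual addresses (the function name `is_canonical` is mine):

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63 down to (va_bits - 1) are all equal, i.e. the
    address is a sign extension of its low va_bits bits.
    va_bits=48 is an assumption (4-level paging)."""
    top = addr >> (va_bits - 1)                  # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1     # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


# Register values copied from the trace posted at the top of the thread.
registers = {
    "RAX": 0xffffc90001e89218,
    "RBX": 0x796ca1b4977da057,
    "RBP": 0xffff88018d03a558,
}

for name, value in registers.items():
    state = "canonical" if is_canonical(value) else "non-canonical"
    print(f"{name} {value:#018x} {state}")
```

RBX is the only one of these that comes out non-canonical. That says the pointer being followed was garbage; it does not by itself say whether bad memory or a software bug put it there.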

TSU