VPP-178

VPP stuck at boot due to SVM deadlock


    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: Medium

      Sometimes, after VPP has crashed, it cannot be restarted.
      It starts to boot and then hangs partway through startup.

      Here are some logs. Notice this part:
      svm_map_region:580: region /global_vm mutex held by dead pid 3464, tag 4, force unlock
      svm_map_region:588: recovery: attempt to re-lock region
      svm_map_region:595: recovery: attempt svm_data_region_map
      svm_map_region:603: unlock and continue

      VPP detects that something is wrong, but it apparently fails to recover and gives the user no indication that recovery failed.
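For readers unfamiliar with the "mutex held by dead pid" message: one common way to decide that a recorded lock owner no longer exists is to probe its pid with signal 0. The sketch below illustrates that general technique only. It is an assumption about the approach, not a copy of svm.c, and `owner_is_dead` is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not a VPP function): returns 1 if no process
 * with the given pid exists, 0 if the process still appears to exist. */
int owner_is_dead(pid_t pid)
{
    assert(pid > 0);

    /* Signal 0 delivers nothing; the kernel only performs the
     * existence and permission checks. */
    if (kill(pid, 0) == 0)
        return 0;

    /* ESRCH: no such process. EPERM would mean the process exists
     * but we are not allowed to signal it, so it is not dead. */
    return errno == ESRCH;
}
```

A caller would pass the pid stored next to the lock, for example `owner_is_dead(3464)` for the log above. Note that pid reuse can make a dead owner look alive, so a check like this is a heuristic rather than a guarantee.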

      Here are the full logs.

      Running: sudo gdb -x /tmp/vpp.sh.gdbinit --args /home/ppfister/vpp-dev/build/open-vpp-mirror/build-root/install-vpp-native/vpp/bin/vpp cpu { workers 1 } unix { interactive cli-history-limit 500 } api-trace { on } dpdk { socket-mem 1024,1024 coremask 3 no-pci }

      GNU gdb (Ubuntu 7.11-0ubuntu1) 7.11
      Copyright (C) 2016 Free Software Foundation, Inc.
      License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law. Type "show copying"
      and "show warranty" for details.
      This GDB was configured as "x86_64-linux-gnu".
      Type "show configuration" for configuration details.
      For bug reporting instructions, please see:
      <http://www.gnu.org/software/gdb/bugs/>.
      Find the GDB manual and other documentation resources online at:
      <http://www.gnu.org/software/gdb/documentation/>.
      For help, type "help".
      Type "apropos word" to search for commands related to "word"...
      Reading symbols from /home/ppfister/vpp-dev/build/open-vpp-mirror/build-root/install-vpp-native/vpp/bin/vpp...done.
      [Thread debugging using libthread_db enabled]
      Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      vlib_plugin_early_init:206: plugin path /usr/lib/vpp_plugins
      svm_map_region:580: region /global_vm mutex held by dead pid 3464, tag 4, force unlock
      svm_map_region:588: recovery: attempt to re-lock region
      svm_map_region:595: recovery: attempt svm_data_region_map
      svm_map_region:603: unlock and continue
      EAL: Detected lcore 0 as core 0 on socket 0
      EAL: Detected lcore 1 as core 1 on socket 0
      EAL: Detected lcore 2 as core 2 on socket 0
      EAL: Detected lcore 3 as core 3 on socket 0
      EAL: Detected lcore 4 as core 4 on socket 0
      EAL: Detected lcore 5 as core 5 on socket 0
      EAL: Detected lcore 6 as core 0 on socket 1
      EAL: Detected lcore 7 as core 1 on socket 1
      EAL: Detected lcore 8 as core 2 on socket 1
      EAL: Detected lcore 9 as core 3 on socket 1
      EAL: Detected lcore 10 as core 4 on socket 1
      EAL: Detected lcore 11 as core 5 on socket 1
      EAL: Detected lcore 12 as core 0 on socket 0
      EAL: Detected lcore 13 as core 1 on socket 0
      EAL: Detected lcore 14 as core 2 on socket 0
      EAL: Detected lcore 15 as core 3 on socket 0
      EAL: Detected lcore 16 as core 4 on socket 0
      EAL: Detected lcore 17 as core 5 on socket 0
      EAL: Detected lcore 18 as core 0 on socket 1
      EAL: Detected lcore 19 as core 1 on socket 1
      EAL: Detected lcore 20 as core 2 on socket 1
      EAL: Detected lcore 21 as core 3 on socket 1
      EAL: Detected lcore 22 as core 4 on socket 1
      EAL: Detected lcore 23 as core 5 on socket 1
      EAL: Support maximum 256 logical core(s) by configuration.
      EAL: Detected 24 lcore(s)
      EAL: No free hugepages reported in hugepages-1048576kB
      EAL: Setting up physically contiguous memory...
      EAL: Ask a virtual area of 0xa800000 bytes
      EAL: Virtual area found at 0x7fff8c400000 (size = 0xa800000)
      EAL: Ask a virtual area of 0x200000 bytes
      EAL: Virtual area found at 0x7fff8c000000 (size = 0x200000)
      EAL: Ask a virtual area of 0x3e800000 bytes
      EAL: Virtual area found at 0x7fff4d600000 (size = 0x3e800000)
      EAL: Ask a virtual area of 0x200000 bytes
      EAL: Virtual area found at 0x7fff4d200000 (size = 0x200000)
      EAL: Ask a virtual area of 0x1b6c00000 bytes
      EAL: Virtual area found at 0x7ffd96400000 (size = 0x1b6c00000)
      EAL: Ask a virtual area of 0x1ff800000 bytes
      EAL: Virtual area found at 0x7ffb96a00000 (size = 0x1ff800000)
      EAL: Ask a virtual area of 0x400000 bytes
      EAL: Virtual area found at 0x7ffb96400000 (size = 0x400000)
      EAL: Ask a virtual area of 0x200000 bytes
      EAL: Virtual area found at 0x7ffb96000000 (size = 0x200000)
      EAL: Ask a virtual area of 0x200000 bytes
      EAL: Virtual area found at 0x7ffb95c00000 (size = 0x200000)
      EAL: Requesting 512 pages of size 2MB from socket 0
      EAL: Requesting 512 pages of size 2MB from socket 1
      EAL: TSC frequency is ~3391783 KHz
      EAL: Master lcore 0 is ready (tid=f7fe08c0;cpuset=[0])
      EAL: lcore 1 is ready (tid=9d09e700;cpuset=[1])
      DPDK physical memory layout:
      Segment 0: phys:0x2ac00000, len:176160768, virt:0x7fff8c400000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
      Segment 1: phys:0x35600000, len:2097152, virt:0x7fff8c000000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
      Segment 2: phys:0x36400000, len:895483904, virt:0x7fff4d600000, socket_id:0, hugepage_sz:2097152, nchannel:0, nrank:0
      Segment 3: phys:0x1e2cc00000, len:1073741824, virt:0x7ffb96a00000, socket_id:1, hugepage_sz:2097152, nchannel:0, nrank:0

      Thread 1 "vpp_main" received signal SIGINT, Interrupt.
      __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
      135 ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S: No such file or directory.
      (gdb) bt
      #0 __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
      #1 0x00007ffff62d1dfd in _GI__pthread_mutex_lock (mutex=mutex@entry=0x30444008) at ../nptl/pthread_mutex_lock.c:80
      #2 0x00007ffff6e7ea26 in region_lock (rp=rp@entry=0x30444000, tag=tag@entry=2)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../svm/svm.c:62
      #3 0x00007ffff6e7f946 in svm_map_region (a=a@entry=0x7fffc5477e60) at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../svm/svm.c:590
      #4 0x00007ffff6e800b7 in svm_region_find_or_create (a=a@entry=0x7fffc5477e60)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../svm/svm.c:724
      #5 0x00007ffff79b0c34 in vl_map_shmem (region_name=0x5b1b28 "/vpe-api", is_vlib=is_vlib@entry=1)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../vlib-api/vlibmemory/memory_shared.c:235
      #6 0x00007ffff79b70df in memory_api_init (region_name=<optimized out>)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../vlib-api/vlibmemory/memory_vlib.c:310
      #7 memclnt_process (vm=0x8d4080 <vlib_global_main>, node=0x7fffc546f000, f=<optimized out>)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../vlib-api/vlibmemory/memory_vlib.c:365
      #8 0x00007ffff7541436 in vlib_process_bootstrap (_a=<optimized out>)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../vlib/vlib/main.c:1177
      #9 0x00007ffff68167d0 in clib_calljmp () at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../vppinfra/vppinfra/longjmp.S:110
      #10 0x00007fffc59a0e30 in ?? ()
      #11 0x00007ffff75423e9 in vlib_process_startup (f=0x0, p=0x7fffc546f000, vm=0x8d4080 <vlib_global_main>)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../vlib/vlib/main.c:1201
      #12 dispatch_process (vm=0x8d4080 <vlib_global_main>, p=0x7fffc546f000, last_time_stamp=878648214215779, f=0x0)
      at /home/ppfister/vpp-dev/build/open-vpp-mirror/build-data/../vlib/vlib/main.c:1246
      #13 0x0000000000000004 in ?? ()
      #14 0x0000000000000000 in ?? ()
      (gdb)
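The backtrace shows vpp_main blocked indefinitely in pthread_mutex_lock, reached from region_lock (svm.c:62) via svm_map_region (svm.c:590) while mapping "/vpe-api". A possible way to turn such a silent hang into a visible error is to bound the wait with `pthread_mutex_timedlock`. This is a minimal sketch of that idea, not the VPP implementation: `lock_with_deadline`, the timeout parameter, and the message text are all hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical helper (not a VPP function): try to take `m` for at most
 * `timeout_sec` seconds. Returns 0 on success, ETIMEDOUT if the lock
 * could not be taken in time, or another errno-style code on failure. */
int lock_with_deadline(pthread_mutex_t *m, int timeout_sec)
{
    struct timespec deadline;
    int rv;

    assert(m != NULL && timeout_sec >= 0);

    /* pthread_mutex_timedlock takes an absolute CLOCK_REALTIME deadline. */
    if (clock_gettime(CLOCK_REALTIME, &deadline) != 0)
        return errno;
    deadline.tv_sec += timeout_sec;

    rv = pthread_mutex_timedlock(m, &deadline);
    if (rv == ETIMEDOUT) {
        /* Illustrative message only; VPP prints nothing like this. */
        fprintf(stderr,
                "lock not acquired after %d s; shared region may be stale\n",
                timeout_sec);
    }
    return rv;
}
```

With a helper like this, the caller could report the failure and exit, or ask the user to clean up the stale shared-memory region, instead of blocking forever in startup.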

            Assignee: Unassigned
            Reporter: Pierre Pfister (ppfister)
            Votes: 0
            Watchers: 2