VPP-907

VPP with worker threads crashes on 4K VXLAN/BD setup


    • Type: Bug
    • Resolution: Done
    • Priority: Medium
    • 17.07
    • None
    • L2
    • None

      VPP running with 2 worker threads may crash on the performance testbed when it is configured with 4K bridge domains (BDs) and 4K VXLAN tunnels. Whether the crash occurs depends on how traffic is started in the L2-to-VXLAN encap direction. This is the backtrace from the core file:

      [Thread debugging using libthread_db enabled]
      Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
      Core was generated by `vpp -c /etc/vpp/startup.conf'.
      Program terminated with signal SIGABRT, Aborted.
      #0  0x00007f7c31002c37 in read_alias_file (fname=<optimized out>,
          fname_len=<optimized out>) at localealias.c:335
      335    localealias.c: No such file or directory.
      (gdb) bt
      #0  0x00007f7c31002c37 in read_alias_file (fname=<optimized out>,
          fname_len=<optimized out>) at localealias.c:335
      #1  0x00007f7c32b43634 in unix_signal_handler (signum=11, si=0x7f7bf18e2570,
          uc=0x7f7bf18e2440) at /scratch/loj/vpp1707/build-data/../src/vlib/unix/main.c:118
      #2  <signal handler called>
      #3  0x00007f7c32ad0665 in vlib_get_frame_no_check (vm=0x0, frame_index=4294967295)
          at /scratch/loj/vpp1707/build-data/../src/vlib/node_funcs.h:221
      #4  0x00007f7c32ad078d in vlib_get_frame (vm=0x7f7bf12b7d1c, frame_index=4294967295)
          at /scratch/loj/vpp1707/build-data/../src/vlib/node_funcs.h:241
      #5  0x00007f7c32ad2b2f in vlib_put_next_frame (vm=0x7f7bf12b7d1c, r=0x7f7bf5fc6e9c,
          next_index=0, n_vectors_left=0)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:472
      #6  0x00007f7c32046548 in l2output_node_inline (vm=0x7f7bf12b7d1c, node=0x7f7bf5fc6e9c,
          frame=0x7f7bf3a05600, do_trace=0)
          at /scratch/loj/vpp1707/build-data/../src/vnet/l2/l2_output.c:441
      #7  0x00007f7c320465e3 in l2output_node_fn (vm=0x7f7bf12b7d1c, node=0x7f7bf5fc6e9c,
          frame=0x7f7bf3a05600) at /scratch/loj/vpp1707/build-data/../src/vnet/l2/l2_output.c:453
      #8  0x00007f7c32ad40ff in dispatch_node (vm=0x7f7bf12b7d1c, node=0x7f7bf5fc6e9c,
          type=VLIB_NODE_TYPE_INTERNAL, dispatch_state=VLIB_NODE_STATE_POLLING,
          frame=0x7f7bf3a05600, last_time_stamp=11817859168407334)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:1016
      #9  0x00007f7c32ad46b8 in dispatch_pending_node (vm=0x7f7bf12b7d1c, pending_frame_index=5,
          last_time_stamp=11817859168407334)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:1166
      #10 0x00007f7c32ad6734 in vlib_main_or_worker_loop (vm=0x7f7bf12b7d1c, is_main=0)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:1625
      #11 0x00007f7c32ad6828 in vlib_worker_loop (vm=0x7f7bf12b7d1c)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:1650
      #12 0x00007f7c32b2021c in vlib_worker_thread_fn (arg=0x7f7bf0725ac0)
          at /scratch/loj/vpp1707/build-data/../src/vlib/threads.c:1378
      #13 0x00007f7c31809190 in clib_calljmp ()
          at /scratch/loj/vpp1707/build-data/../src/vppinfra/longjmp.S:110
      #14 0x00007f7b219fcd40 in ?? ()
      #15 0x00007f7c32b1b753 in vlib_worker_thread_bootstrap_fn (arg=0x7f7bf0725ac0)
          at /scratch/loj/vpp1707/build-data/../src/vlib/threads.c:464
      Backtrace stopped: previous frame inner to this frame (corrupt stack?)
      (gdb) frame 6
      #6  0x00007f7c32046548 in l2output_node_inline (vm=0x7f7bf12b7d1c, node=0x7f7bf5fc6e9c,
          frame=0x7f7bf3a05600, do_trace=0)
          at /scratch/loj/vpp1707/build-data/../src/vnet/l2/l2_output.c:441
      (gdb) p cached_sw_if_index
      $1 = 2564
      (gdb) p cached_next_index
      $2 = 0
      (gdb) down
      #5  0x00007f7c32ad2b2f in vlib_put_next_frame (vm=0x7f7bf12b7d1c, r=0x7f7bf5fc6e9c,
          next_index=0, n_vectors_left=0)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:472
      (gdb) down
      #4  0x00007f7c32ad078d in vlib_get_frame (vm=0x7f7bf12b7d1c, frame_index=4294967295)
          at /scratch/loj/vpp1707/build-data/../src/vlib/node_funcs.h:241
      (gdb) up
      #5  0x00007f7c32ad2b2f in vlib_put_next_frame (vm=0x7f7bf12b7d1c, r=0x7f7bf5fc6e9c,
          next_index=0, n_vectors_left=0)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:472
      (gdb) up
      #6  0x00007f7c32046548 in l2output_node_inline (vm=0x7f7bf12b7d1c, node=0x7f7bf5fc6e9c,
          frame=0x7f7bf3a05600, do_trace=0)
          at /scratch/loj/vpp1707/build-data/../src/vnet/l2/l2_output.c:441
      (gdb) p *node
      $3 = {cacheline0 = 0x7f7bf5fc6e9c "ye\004\062|\177",
        function = 0x7f7c32046579 <l2output_node_fn>, errors = 0x7f7bf0acfd50,
        clocks_since_last_overflow = 0, max_clock = 763318, max_clock_n = 256,
        calls_since_last_overflow = 0, vectors_since_last_overflow = 0, next_frame_index = 870,
        node_index = 292, input_main_loops_per_call = 0,
        main_loop_count_last_dispatch = 2747018781, main_loop_vector_stats = {1024, 0},
        flags = 0, state = 0, n_next_nodes = 9, cached_next_index = 0, thread_index = 2,
        runtime_data = 0x7f7bf5fc6ee2 ""}
      (gdb) down
      #5  0x00007f7c32ad2b2f in vlib_put_next_frame (vm=0x7f7bf12b7d1c, r=0x7f7bf5fc6e9c,
          next_index=0, n_vectors_left=0)
          at /scratch/loj/vpp1707/build-data/../src/vlib/main.c:472
      (gdb) p *nf
      $4 = {frame_index = 4294967295, node_runtime_index = 281, flags = 0,
        vectors_since_last_overflow = 0}
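
      One way to read the session above: the frame_index value 4294967295, printed both as the argument to vlib_get_frame and in *nf, is the all-ones 32-bit value (0xFFFFFFFF). Such a value commonly serves as a "no index" sentinel, which would mean l2output_node_inline reached vlib_put_next_frame for a next frame that had no frame attached. The C sketch below only illustrates that reading. The names next_frame_sketch_t, INVALID_FRAME_INDEX, next_frame_has_frame and frame_lookup_checked are hypothetical and are not VPP APIs, and the guard shown is not claimed to be the fix applied for this issue.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel for "no frame attached": all ones in 32 bits,
 * which prints as 4294967295 when shown as an unsigned decimal. */
#define INVALID_FRAME_INDEX ((uint32_t) ~0u)

/* Hypothetical stand-in mirroring the fields gdb printed for *nf. */
typedef struct
{
  uint32_t frame_index;
  uint32_t node_runtime_index;
  uint32_t flags;
  uint32_t vectors_since_last_overflow;
} next_frame_sketch_t;

/* True only when the next frame refers to a real frame. */
static bool
next_frame_has_frame (const next_frame_sketch_t *nf)
{
  return nf != NULL && nf->frame_index != INVALID_FRAME_INDEX;
}

/* Look up a slot in a frame pool, returning NULL instead of
 * dereferencing when the index is the sentinel or out of range. */
static const uint64_t *
frame_lookup_checked (const uint64_t *pool, size_t pool_len,
                      const next_frame_sketch_t *nf)
{
  if (!next_frame_has_frame (nf) || nf->frame_index >= pool_len)
    return NULL;
  return &pool[nf->frame_index];
}
```

      With the values from the session (frame_index = 4294967295, node_runtime_index = 281), next_frame_has_frame returns false and frame_lookup_checked returns NULL, whereas an unchecked lookup would compute an address far outside the pool.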

            lojohn John Lo