CSIT-1928

ipsec swasync: VPP can become unresponsive


    • Type: Bug
    • Resolution: Unresolved
    • Priority: High

      This seems to happen when there are not enough crypto workers for the load, so mostly in 2c and sometimes in 3c tests, and only with 1518b frames (imix passes). 3n-alt seems to be affected more heavily (both algorithms fail), 3n-icx less (aes256gcm passes).

      So far I think CSIT is doing nothing wrong, but VPP has changed its behavior.
      First, change [0] caused the crypto-dispatch node to run on the VPP main thread as well (not only on workers as before).
      There is a crypto_sw_scheduler_set_worker API message, but it cannot affect the VPP main thread (argument worker_index=0 already refers to the first worker).
      Then change [1] aimed to improve performance, but made the stalling even worse, especially on 3n-alt.
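
To illustrate why the API message cannot reach the main thread, here is a minimal sketch of the index mapping. It assumes that thread index 0 is vpp_main and that worker_index N refers to thread N + 1; the function names are hypothetical and are not part of VPP or CSIT.

```python
def thread_index_for_worker(worker_index: int) -> int:
    """Map an API worker_index to a thread index.

    Assumption: thread 0 is vpp_main, so worker_index 0 is thread 1.
    """
    if worker_index < 0:
        raise ValueError("worker_index must be non-negative")
    return worker_index + 1


def reachable_threads(worker_count: int) -> set:
    """Return the thread indices that some valid worker_index can address."""
    return {thread_index_for_worker(i) for i in range(worker_count)}
```

Under this assumption, no valid worker_index maps to thread 0, so crypto work on the main thread cannot be switched off through this message.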

      To be fair, crypto-dispatch now starts in adaptive mode, so it starts polling only under load. And unix-epoll-input should be "polling" (checking every 1024 loop cycles), so the VPP main thread should still be somewhat responsive (tests fail only if a vppctl command does not respond within 10 minutes).
      Perhaps VPP has two bugs: one where the VPP main thread does crypto work, and another where VPP stops completely if the VPP main thread does any polling and CPU-heavy work.
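
The responsiveness criterion mentioned above can be approximated outside of CSIT with a small timeout check. This is only a sketch, not the CSIT implementation: the helper name is hypothetical, and the 600 second limit is taken from the 10 minute figure in this report.

```python
import subprocess
import sys


def command_responds(argv, timeout_s):
    """Return True if the command exits within timeout_s seconds.

    The exit code is ignored; only the time to completion matters here.
    Raises FileNotFoundError if the executable is not installed.
    """
    try:
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # The child process is killed by subprocess.run on timeout.
        return False
    return True


# Example (needs a running VPP, hence left as a comment):
# command_responds(["vppctl", "show", "version"], 600)
```

Running such a check in a loop while the traffic generator is active may help narrow down the moment when the main thread stops answering.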

      Some logs:
      [2] shows a recent 3n-alt run, where VPP can get stuck some time after show run (on 3n-icx, if it fails, it hangs already on clear runtime).
      [3] is a good run showing no crypto-dispatch on vpp_main before [0].

      Trending [4] shows that [0] is a big regression for PDR and a small regression for MRR. The impact of [1] is negligible.

      [0] https://gerrit.fd.io/r/c/vpp/+/38453
      [1] https://gerrit.fd.io/r/c/vpp/+/38926
      [2] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-alt/260/log.html.gz#s1-s1-s1-s1-s13-t1-k2-k14-k9-k10-k1-k1-k1-k12
      [3] https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-vpp-perf-mrr-daily-master-3n-icx/272/log.html.gz#s1-s1-s1-s1-s12-t1-k2-k14-k9-k10-k1-k1-k1-k12
      [4] https://csit.fd.io/trending/#eNrtVkFqwzAQfI17KQuWZNW-9NDU_yjKehMLbFmVlDTO6-u4AcWHQigmvegiHWa1M8wwIB8GRx-eutdMbrJyk_FSN9ORibfn6TpaC8KAxhOwPN8Tt4wqluMnqOMO0I02DMAkq7bAESi02hbaesIqmM5_KT8aBI8tNYeOHEzoVnkCbQIo8ly-7LGH3rkLI3-_MDaHsKCPiG3HiPwqKs4rRyo--NEa0UD-huhvyuO2nVM9eX2muHJ2JU7g5HMEOS6VhNHeoFdDynqe-J9QbJNCWYRyNWSFUERqyv2hiAc1RaSmrBXKek0pUlPuD6V4UFOK1JS1QolNkfWTGVw__8Jk_Q0aJc1K

            vrpolak Vratko Polak