Bug
Resolution: Unresolved
Medium
None
None
This is an old issue that was fixed some time ago; this ticket exists only so the information is gathered in one place that can be linked to.
Trending has shown that some vhost tests, which were previously quite stable, started to show a two-band structure not related to the testbed. Closer examination showed that the performance is determined by which interface is handled by which worker.
Vhost tests run a vswitch with two physical interfaces and two virtual interfaces. If there are two VPP workers (1 core with HT on, or 2 cores with HT off), each worker reads from one physical and one virtual interface. Imagining two bridge domains (even though some tests use l2xc), a worker either handles one domain in both directions, or one direction in each of the two domains. For some reason, in vhost tests these two options lead to different performance. It is not clear what mechanism explains the difference; it somehow depends on the app forwarding between the two virtual interfaces, as the performance difference was present only for VPP in the VM, not for testpmd. Memif tests also allow both assignment possibilities, but there the performance is the same even for VPP in the container.
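The two assignment options above can be sketched as data. This is a hypothetical illustration (interface names and the worker/domain layout are assumptions, not the actual CSIT topology code); a worker polling an interface handles the traffic received on it, so which interfaces a worker polls determines which directions of which bridge domain it forwards.

```python
# Hypothetical sketch of the two worker-to-interface assignments in a vhost
# test with two bridge domains (BD1: phy1<->virt1, BD2: phy2<->virt2) and
# two VPP workers, each polling one physical and one virtual interface.

# Option A: each worker handles one bridge domain in both directions.
option_a = {
    "worker0": ["phy1", "virt1"],  # BD1, both directions
    "worker1": ["phy2", "virt2"],  # BD2, both directions
}

# Option B: each worker handles one direction in each of the two domains.
option_b = {
    "worker0": ["phy1", "virt2"],  # phy1->virt1 in BD1, virt2->phy2 in BD2
    "worker1": ["phy2", "virt1"],  # phy2->virt2 in BD2, virt1->phy1 in BD1
}

# In both options every worker still polls exactly one physical and one
# virtual interface, which is why either assignment can occur in practice.
for option in (option_a, option_b):
    for interfaces in option.values():
        assert sum(1 for i in interfaces if i.startswith("phy")) == 1
        assert sum(1 for i in interfaces if i.startswith("virt")) == 1
```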
The two-band structure appeared as a result of the migration from Python 2 to Python 3.
The NodePath library, used for computing the path the traffic follows, started giving non-deterministic results. It stored structured items in sets, and Python 3 probably derives their hashes from memory addresses (which are not deterministic across runs), so iteration order varied from run to run.
We fixed the issue by making the ordering deterministic (using lists instead of sets) [0], which at that time happened to select the more performant assignment (one bridge domain in both directions).
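A minimal sketch of the underlying mechanism and the fix (the `Interface` class is hypothetical, not the actual NodePath code): objects that define no `__hash__` fall back to an id-based hash, so a set of them iterates in an order tied to memory addresses, while a list preserves insertion order deterministically.

```python
# Hypothetical stand-in for a structured item stored by path computation.
class Interface:
    def __init__(self, name):
        self.name = name
    # No __hash__/__eq__ defined: Python falls back to an id()-based hash,
    # i.e. the object's memory address, which varies between runs -- so the
    # iteration order of a set of these objects is not reproducible.

ifaces = [Interface("phy1"), Interface("virt1"),
          Interface("phy2"), Interface("virt2")]

as_set = set(ifaces)    # iteration order depends on memory addresses
as_list = list(ifaces)  # iteration order is exactly the insertion order

# The list-based ordering is deterministic across runs:
print([i.name for i in as_list])  # always ['phy1', 'virt1', 'phy2', 'virt2']
# The set-based ordering may differ from one run to the next:
print([i.name for i in as_set])
```

This is why switching from sets to lists made the computed path, and hence the worker-to-interface assignment, deterministic.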