- Bug
- Resolution: Done
- Medium
In Contiv-VPP (VPP 19.04.1, but the same issue is present in 19.01), we use endpoint-dependent NAT, configured as shown below. It works fine in single-thread mode, but fails as soon as worker threads are enabled.
The issue can be easily reproduced by enabling multi-threading in Contiv-VPP, e.g. by adding this into the VPP startup config:
cpu { main-core 0 corelist-workers 1-3 }
With this config, the CoreDNS pods (deployed automatically as part of the k8s control plane) cannot communicate with the k8s API (virtual IP 10.96.0.1, NATed on VPP). Moreover, the vswitch VPP crashes after packet trace is enabled:
$ kubectl get pods --all-namespaces
NAMESPACE     NAME                      READY   STATUS             RESTARTS   AGE    IP          NODE      NOMINATED NODE   READINESS GATES
kube-system   contiv-vswitch-n7b67      1/1     Running            2          4m1s   10.0.2.15   lubuntu   <none>           <none>
kube-system   coredns-fb8b8dccf-pppkm   0/1     CrashLoopBackOff   4          4m1s   10.1.1.2    lubuntu   <none>           <none>
kube-system   coredns-fb8b8dccf-wfjt6   0/1     CrashLoopBackOff   3          4m1s   10.1.1.3    lubuntu   <none>           <none>
Error log from CoreDNS:
E0605 08:46:10.086500 1 reflector.go:134] github.com/coredns/coredns/plugin/kubernetes/controller.go:322: Failed to list *v1.Namespace: Get https://10.96.0.1:443/api/v1/namespaces?limit=500&resourceVersion=0: dial tcp 10.96.0.1:443: i/o timeout
The same works fine in single-thread mode.
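The timeout seen by CoreDNS can be checked without CoreDNS by attempting a plain TCP connect to the service VIP from inside a pod's network namespace. The following is only a minimal sketch, assuming Python 3 is available there; the helper name `can_connect` is hypothetical and not part of Contiv-VPP or VPP.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP handshake to host:port completes within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts (socket.timeout is an OSError subclass),
        # connection refusals and unreachable-network errors.
        return False
```

Calling `can_connect("10.96.0.1", 443)` would be expected to return `False` in the failing multi-thread setup and `True` in single-thread mode, matching the `i/o timeout` in the log above.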
NAT config on VPP:
vpp# sh nat44 interfaces
NAT44 interfaces:
 tap0 in out
 loop2 in out
 tap1 in out
 tap2 in out
vpp# sh nat44 static mappings
NAT44 static mappings:
 tcp local 10.0.2.15:6443 external 10.96.0.1:443 vrf 0 self-twice-nat out2in-only
 tcp local 10.0.2.15:12379 external 192.168.16.1:32379 vrf 0 self-twice-nat out2in-only
 tcp local 10.0.2.15:12379 external 10.0.2.15:32379 vrf 0 self-twice-nat out2in-only
 tcp local 10.0.2.15:12379 external 10.111.130.0:12379 vrf 0 self-twice-nat out2in-only
 tcp external 10.96.0.10:9153 self-twice-nat out2in-only
  local 10.1.1.2:9153 vrf 1 probability 1
  local 10.1.1.3:9153 vrf 1 probability 1
 tcp external 10.96.0.10:53 self-twice-nat out2in-only
  local 10.1.1.2:53 vrf 1 probability 1
  local 10.1.1.3:53 vrf 1 probability 1
 udp external 10.96.0.10:53 self-twice-nat out2in-only
  local 10.1.1.2:53 vrf 1 probability 1
  local 10.1.1.3:53 vrf 1 probability 1
vpp# sh nat44 addresses
NAT44 pool addresses:
NAT44 twice-nat pool addresses:
10.1.1.254
  tenant VRF independent
  0 busy udp ports
  0 busy tcp ports
  0 busy icmp ports
vpp# sh inter addr
local0 (dn):
loop0 (up):
  L3 192.168.16.1/24
loop1 (up):
  L3 10.1.1.1/24 ip4 table-id 1 fib-idx 1
loop2 (up):
  L2 bridge bd-id 1 idx 1 shg 1 bvi
  L3 192.168.30.1/24 ip4 table-id 1 fib-idx 1
tap0 (up):
  L3 172.30.1.1/24
tap1 (up):
  unnumbered, use loop1
  L3 10.1.1.1/24 ip4 table-id 1 fib-idx 1
tap2 (up):
  unnumbered, use loop1
  L3 10.1.1.1/24 ip4 table-id 1 fib-idx 1
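To make explicit which static mapping the failing connection should hit, the mapping list can be searched for the external address taken from the CoreDNS error. This is a rough sketch, not a VPP tool: the regex and the function name `lookup_local` are assumptions, and it handles only the single-backend `local ... external ...` form, not the load-balanced entries with several `local` backends.

```python
import re

# Matches the single-backend form of a NAT44 static mapping, e.g.
#   tcp local 10.0.2.15:6443 external 10.96.0.1:443 vrf 0 self-twice-nat out2in-only
MAPPING_RE = re.compile(
    r"^\s*(?P<proto>tcp|udp)\s+local\s+(?P<local>\S+)"
    r"\s+external\s+(?P<external>\S+)\s+vrf\s+(?P<vrf>\d+)"
)


def lookup_local(output, proto, external):
    """Return the local ip:port that `external` is translated to, or None."""
    for line in output.splitlines():
        m = MAPPING_RE.match(line)
        if m and m["proto"] == proto and m["external"] == external:
            return m["local"]
    return None


# Two mappings copied from the output above, one per line.
SAMPLE = """\
tcp local 10.0.2.15:6443 external 10.96.0.1:443 vrf 0 self-twice-nat out2in-only
tcp local 10.0.2.15:12379 external 192.168.16.1:32379 vrf 0 self-twice-nat out2in-only
"""

print(lookup_local(SAMPLE, "tcp", "10.96.0.1:443"))  # prints 10.0.2.15:6443
```

So a connection from a CoreDNS pod to 10.96.0.1:443 is expected to be translated to 10.0.2.15:6443, with the source rewritten to the twice-NAT pool address when the `self-twice-nat` condition applies.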
Startup config:
nat {
  endpoint-dependent
  translation hash buckets 1048576
  translation hash memory 268435456
  user hash buckets 1024
  max translations per user 10000
}