Changes to reduce packet loss:
- Delayed, reference-counted call to vlib_worker_thread_node_runtime_update()
- Allocation of clone nodes as a single block instead of multiple separate allocations
(These features ended up being merged along with the code for VPP-970.)
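
The two changes listed above can be illustrated with a minimal, self-contained sketch. This is not VPP's code: every name prefixed with demo_ is hypothetical, the node struct is a stand-in rather than vlib_node_t, and "reference counted" is read here as "count the pending requests and run the expensive update once at a safe point", which is an assumption about the intent. The sketch is single-threaded; a real data plane would also need the worker barrier or equivalent synchronization.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a graph node (assumption, not vlib_node_t). */
typedef struct {
    uint32_t index;
    uint64_t counters[4];
} demo_node_t;

/* Clone n nodes into ONE contiguous block: a single malloc instead of n.
 * Returns NULL on allocation failure. The caller releases the whole
 * block with a single free(). */
static demo_node_t *
demo_clone_nodes(demo_node_t *const *src, size_t n)
{
    demo_node_t *block = malloc(n * sizeof(*block));
    if (block == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        memcpy(&block[i], src[i], sizeof(*block));
    return block;
}

/* State for a deferred, counted update (hypothetical). */
typedef struct {
    uint32_t pending;  /* requests recorded since the last flush */
    uint32_t rebuilds; /* number of times the expensive update ran */
} demo_update_state_t;

/* Record that an update is needed; do no expensive work yet. */
static void
demo_request_update(demo_update_state_t *s)
{
    s->pending++;
}

/* Called at a safe point. Runs the expensive update at most once, no
 * matter how many requests were recorded. Returns 1 if it ran. */
static int
demo_flush_updates(demo_update_state_t *s)
{
    if (s->pending == 0)
        return 0;
    s->pending = 0;
    s->rebuilds++; /* placeholder for the real runtime update */
    return 1;
}
```

In this sketch, three calls to demo_request_update() followed by one demo_flush_updates() trigger a single rebuild, and the cloned nodes sit back to back in memory, so the allocator is entered once per clone set rather than once per node.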