If a test case fails because of an infrastructure issue, keep the VMs alive for further troubleshooting instead of destroying them at the end of the test case.
Ideally, an "infrastructure" issue could be detected by the following indications:
- all test cases failed,
- a consecutive list of test cases, from a certain (random) test case until the last test case, failed,
- >50% of test cases failed,
- test case failures are intermittent, i.e. when the test is repeated with the same candidate VPP build, the test succeeds.
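The indications above could be combined into a simple check. The following is only a minimal sketch: the function name, the `passed_on_rerun` flag, and the `min_trailing` parameter (which stops a single failing last test from counting as a trailing run) are assumptions for illustration, not part of any existing test framework.

```python
from typing import Sequence


def looks_like_infrastructure_issue(
    results: Sequence[bool],
    passed_on_rerun: bool = False,
    min_trailing: int = 2,  # assumption: a trailing run needs at least 2 failures
) -> bool:
    """Heuristically decide whether a test run hit an infrastructure issue.

    results[i] is True if test case i passed, in execution order.
    passed_on_rerun is True if repeating the run with the same candidate
    build succeeded (hypothetical input supplied by the caller).
    """
    if not results:
        return False
    failures = [not ok for ok in results]
    n_failed = sum(failures)
    if n_failed == 0:
        return False

    # Indication 1: all test cases failed.
    if n_failed == len(results):
        return True

    # Indication 2: every test case from the first failure to the last failed.
    first_failure = failures.index(True)
    trailing = failures[first_failure:]
    if all(trailing) and len(trailing) >= min_trailing:
        return True

    # Indication 3: more than 50% of test cases failed.
    if n_failed / len(results) > 0.5:
        return True

    # Indication 4: failures are intermittent (the rerun succeeded).
    return passed_on_rerun
```

For example, `looks_like_infrastructure_issue([True, True, False, False])` returns `True` because every test case from the third one onwards failed, while a single isolated failure in the middle of the run is flagged only if the rerun succeeded.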
It may be difficult to identify such an "infrastructure" issue at this point, so an alternative implementation would be to keep the VMs of *ALL* failed tests (where any critical test failed) and allow a waiting period of 24-36 hours. During that period we can manually flag VMs to be kept forever; VMs that are not flagged are deleted automatically once the period expires.
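The retention rule of this alternative could look roughly like the sketch below. The `RetainedVM` record, the `keep_forever` flag, and the 36-hour default (the upper end of the 24-36 hour range) are hypothetical names and values chosen for illustration. Actually destroying the VM is left to whatever tooling manages the test bed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: use the upper end of the 24-36 hour waiting period.
DEFAULT_RETENTION = timedelta(hours=36)


@dataclass
class RetainedVM:
    """Hypothetical record for a VM kept alive after a failed test."""
    name: str
    failed_at: datetime          # when the failed test finished
    keep_forever: bool = False   # set manually during the waiting period


def should_delete(
    vm: RetainedVM,
    now: datetime,
    retention: timedelta = DEFAULT_RETENTION,
) -> bool:
    """Return True if the VM was not flagged and its waiting period expired."""
    if vm.keep_forever:
        return False
    return now - vm.failed_at >= retention


def expired_vms(vms: list[RetainedVM], now: datetime) -> list[str]:
    """Names of VMs that a periodic cleanup job could delete."""
    return [vm.name for vm in vms if should_delete(vm, now)]
```

A periodic cleanup job could call `expired_vms` and pass the resulting names to the actual VM teardown step.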