Implement a set of commands that produce a "health check" (or "system status report") for each SUT (system under test) and TG (traffic generator). The report should be collected on every SUT and TG, perhaps routinely at the beginning and end of a test suite setup, and also after a failed test case.
The report should include items such as:
- attempting to ping/ssh to the TG and each SUT,
- capturing CPU, memory, disk status,
- capturing the last X lines of syslog,
- checking for any evidence of crashed processes (VPP or others),
- capturing the list of currently running processes,
- capturing any other log files deemed relevant.
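
A minimal Python sketch of how such a report could be collected is shown below. It is an illustration under stated assumptions, not a proposed final implementation: the function names (`build_health_commands`, `run_local`, `collect_health_report`), the choice of commands, the syslog path `/var/log/syslog`, and the core file pattern `/tmp/*core*` are all hypothetical and would need to be adapted to the actual SUT and TG images. Reachability checks (ping/ssh) are left out because they run from the test host rather than on the node itself.

```python
import subprocess


def build_health_commands(syslog_lines=100):
    """Return a name -> shell command mapping for the health check.

    The commands and paths here are assumptions for illustration only.
    """
    return {
        "cpu": "uptime",
        "memory": "free -m",
        "disk": "df -h",
        "syslog_tail": f"tail -n {syslog_lines} /var/log/syslog",
        "processes": "ps aux",
        # Hypothetical location for core files left by crashed processes.
        "core_files": "ls -l /tmp/*core* 2>/dev/null",
    }


def run_local(command, timeout=30):
    """Run a shell command locally; return (return_code, combined output)."""
    try:
        done = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return -1, f"timed out after {timeout}s"
    return done.returncode, done.stdout + done.stderr


def collect_health_report(runner=run_local, syslog_lines=100):
    """Run every health command through `runner` and gather the results.

    A failing command is recorded with its return code instead of
    aborting, so one missing log file does not hide the other results.
    """
    report = {}
    for name, command in build_health_commands(syslog_lines).items():
        code, output = runner(command)
        report[name] = {"command": command, "rc": code, "output": output}
    return report
```

The `runner` argument is injectable so that the same command set could be executed over SSH on each SUT and TG by passing a different callable with the same `(command) -> (rc, output)` contract; how that remote runner is implemented is left open here.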