We want to track the "unstable build loop" for flaky tests so that we can act upon issues before they block Marathon builds.
A. The idea is to connect the "unstable build loop" to DataDog to analyse and aggregate the test results. In DataDog we can define and monitor thresholds for the flakiness of the test runs, based on:
- the test results (on test method level)
- the time of the build
When a threshold is exceeded, DataDog will create an alert and send a notification to PagerDuty / On-Call.
B. Following the approach of https://github.com/mesosphere/marathon-jenkins-stats, we could export this data directly into Postgres (see attached example TSV file).
- Every test run (Jenkins Job Run) from the Jenkins "unstable build loop" will be exported.
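
A rough sketch of what such an export could look like, under stated assumptions: the column names, the `test_runs` table and both function names are hypothetical, and the real layout of the attached TSV file may differ. The insert statement uses `%s` placeholders, which psycopg2 accepts. The cursor is passed in so the sketch does not depend on a particular driver.

```python
import csv
import io

# Hypothetical table and columns; adjust to the real TSV layout.
INSERT_SQL = (
    "INSERT INTO test_runs (job_run, test_method, status, duration_seconds) "
    "VALUES (%s, %s, %s, %s)"
)


def parse_tsv(text):
    """Parse TSV text with a header row into a list of typed tuples."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for record in reader:
        rows.append((
            int(record["job_run"]),
            record["test_method"],
            record["status"],
            float(record["duration_seconds"]),
        ))
    return rows


def export_rows(cursor, rows):
    """Insert all rows using a DB-API cursor that supports %s placeholders."""
    cursor.executemany(INSERT_SQL, rows)
```

With psycopg2, `cursor` would come from `connection.cursor()`, followed by `connection.commit()` after the export.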