  1. DC/OS
  2. DCOS_OSS-4665

Show precise latency measurements for Admin Router responses in Grafana


    • Type: Task
    • Status: Resolved
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: DC/OS 1.13.0
    • Component/s: adminrouter, metrics
    • Labels:


      Once DCOS_OSS-4596 is done, latency measurements for Admin Router responses will be available in Grafana.
      These measurements use an arithmetic mean of the latency over the last minute, provided by nginx-module-vts.
      This does not show us the actual latency of individual requests, nor how latency changes over intervals shorter than one minute.

      We certainly want to reduce the scrape interval to get more precise measurements.
      We must be careful not to make the scrape interval too small, otherwise the next scrape will start before the previous one has finished.
      Be aware that a shorter interval means collecting more data; implement some cleanup/rotation if that is necessary.
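
The trade-off above can be sketched with some rough arithmetic. This is only an illustration: the series count, the scrape duration, and the safety margin are hypothetical values, not measurements from DC/OS.

```python
SECONDS_PER_DAY = 86_400


def samples_per_day(series_count: int, scrape_interval_s: float) -> int:
    """Estimate how many samples are stored per day for a given scrape interval."""
    return int(series_count * SECONDS_PER_DAY / scrape_interval_s)


def interval_is_safe(scrape_interval_s: float, scrape_duration_s: float,
                     margin: float = 2.0) -> bool:
    """Return True if a scrape comfortably finishes before the next one starts.

    `margin` is an assumed safety factor, not a value from the ticket.
    """
    return scrape_interval_s >= margin * scrape_duration_s


# Hypothetical numbers: 100 time series, each scrape taking about 2 seconds.
print(samples_per_day(100, 60))   # one-minute interval
print(samples_per_day(100, 10))   # ten-second interval: six times the data
print(interval_is_safe(10, 2))    # enough headroom
print(interval_is_safe(1, 2))     # next scrape would start too early
```

Going from a 60-second to a 10-second interval multiplies the stored samples by six, which is why cleanup/rotation may become necessary.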

      We should also choose what to do with that data:

      • Stick with the arithmetic mean system provided by nginx-module-vts
      • Change to a weighted moving average, also provided by nginx-module-vts
      • Do our own maths to show a metric other than the arithmetic mean or weighted moving average
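
To make the difference between these options concrete, here is a minimal sketch comparing an arithmetic mean with a linearly weighted moving average on made-up latencies. The weighting scheme is an assumption for illustration; the exact weighting used by nginx-module-vts may differ.

```python
def arithmetic_mean(samples):
    """Plain average: every sample counts equally."""
    return sum(samples) / len(samples)


def linear_weighted_average(samples):
    """Weighted moving average with weights 1..n, so newer samples count more.

    Assumed weighting, chosen only for illustration.
    """
    weights = range(1, len(samples) + 1)
    return sum(w * s for w, s in zip(weights, samples)) / sum(weights)


# Hypothetical latencies in milliseconds, oldest first, with one slow request.
latencies_ms = [10, 10, 10, 10, 500]

mean_ms = arithmetic_mean(latencies_ms)
wma_ms = linear_weighted_average(latencies_ms)
worst_ms = max(latencies_ms)

print(mean_ms, wma_ms, worst_ms)
```

Both averages land well below the single slow request, which is the motivation for considering a different metric such as a histogram.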

      Acceptance Criteria
      • Export Prometheus histograms that display high latency
      • Metrics are exported and displayed in corresponding dashboards
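
As a rough illustration of why a histogram exposes high latency where an average does not, the sketch below counts observations into cumulative buckets, in the same spirit as Prometheus histogram `le` buckets. The bucket bounds and latencies are hypothetical, and this is plain Python rather than the exporter used in DC/OS.

```python
import math


def cumulative_bucket_counts(latencies_s, upper_bounds_s):
    """Count observations less than or equal to each upper bound.

    A final +Inf bucket is always added, so it holds the total count.
    """
    bounds = sorted(upper_bounds_s) + [math.inf]
    return {b: sum(1 for v in latencies_s if v <= b) for b in bounds}


# Hypothetical latencies in seconds, with one slow request.
latencies_s = [0.01, 0.01, 0.01, 0.01, 0.5]
counts = cumulative_bucket_counts(latencies_s, [0.05, 0.1, 0.25, 1.0])

for bound, count in counts.items():
    print(f"le={bound}: {count}")
```

The jump in count between the 0.25 and 1.0 buckets shows that one request took longer than 250 ms, which a dashboard can surface directly.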


              • Assignee:
                jonathangiddy Jonathan Giddy
                adamdangoor Adam Dangoor
                Security Team
                Adam Dangoor, Jonathan Giddy, Mergebot, Tim Weidner
              • Watchers:
                4

