Details

    • Sprint:
      Observability Team Sprint 37, Observability Team Sprint 38
    • Story Points:
      3

      Description

      COPS ticket: https://jira.mesosphere.com/browse/COPS-4333

      The metrics API intermittently (but reproducibly) returns HTTP 204 responses on the /containers endpoint, which causes the CLI command `dcos task metrics summary` to print a "No metrics found" message.

      It looks like the timestamp on metrics coming from telegraf is not actually set. dcos-metrics uses this timestamp field to determine the age of a metric, and it is currently configured to expire cache entries after 2m. The "janitor" cleanup process runs every minute. Because the timestamp is 0, it is interpreted as the Unix epoch (1970-01-01T00:00:00Z), so the computed age is always greater than 2m. Every janitor run therefore clears the cache, which produces 204s each time this happens.


            People

            • Assignee:
              gracedo Grace Do
              Reporter:
              gracedo Grace Do
              Team:
              Observability Team
              Watchers:
              Grace Do, Lisa Gunn, Mergebot, Sergei Vavilov

              Dates

              • Created:
                Updated:
                Resolved: