  1. DC/OS
  2. DCOS_OSS-4761

Telegraf drops statsd metrics at scale

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: High
    • Resolution: Done
    • Affects Version/s: DC/OS 1.13
    • Fix Version/s: DC/OS 1.13.0
    • Component/s: dcos-telegraf
    • Labels:
      None
    • Sprint:
      Observability Team Sprint 40
    • Story Points:
      1

      Description

      Telegraf's statsd plugin on a leading DC/OS master dropped incoming UDP messages during a scale test:

      E! Error: statsd message queue full. We have dropped 70000 messages so far. You may want to increase allowed_pending_messages in the config

      allowed_pending_messages is currently set to 10000. We should raise it to a value that is less likely to cause dropped metrics at scale. With the value raised to 100000, no more metrics were dropped, and Telegraf's memory consumption did not appear to increase significantly.
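
      As a rough illustration of the proposed change, the statsd input section of a Telegraf configuration might look like the sketch below. Only allowed_pending_messages and its value come from this issue; the service_address shown is an assumed placeholder and may not match what DC/OS actually configures.

      ```toml
      # Sketch only: statsd input with a larger pending-message queue.
      [[inputs.statsd]]
        # Assumed placeholder listen address; not taken from this issue.
        service_address = ":8125"
        # Number of received messages that may wait to be parsed.
        # Messages that arrive while this queue is full are dropped.
        allowed_pending_messages = 100000
      ```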

      We should also make this change for the dcos_statsd input plugin. It currently hardcodes allowed_pending_messages to 10000.

            People

            • Assignee: Branden Rolston
            • Reporter: Branden Rolston
            • Team: Observability Team
            • Watchers: Branden Rolston, Mergebot
