DCOS_OSS-4091

Telegraf doesn't free up TCP port 61091

    Details

      Description

      It looks like Telegraf does not release TCP port 61091 when it crashes. On restart it cannot bind the port again, so it fails to start.

      From https://jira.mesosphere.com/browse/SOAK-119?focusedCommentId=175549&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-175549:

      Some info that could be useful dating from this morning:

      /opt/mesosphere/bin/telegraf --config /opt/mesosphere/etc/telegraf/telegraf.conf --config-directory /opt/mesosphere/etc/telegraf/telegraf.d/
      2018-09-10T11:59:39Z I! Starting Telegraf v1.7.0~ccb5eb8c
      2018-09-10T11:59:39Z I! Loaded inputs: inputs.disk inputs.swap inputs.net inputs.processes inputs.system inputs.cpu inputs.mem inputs.dcos_containers inputs.dcos_statsd
      2018-09-10T11:59:39Z I! Loaded aggregators:
      2018-09-10T11:59:39Z I! Loaded processors: dcos_metadata
      2018-09-10T11:59:39Z I! Loaded outputs: prometheus_client dcos_metrics
      2018-09-10T11:59:39Z I! Tags enabled: dcos_cluster_id=$DCOS_CLUSTER_ID dcos_cluster_name=soak112s host=int-mountvolumeagent1-soak112s.testing.mesosphe.re
      2018-09-10T11:59:39Z I! Agent Config: Interval:10s, Quiet:false, Hostname:"int-mountvolumeagent1-soak112s.testing.mesosphe.re", Flush Interval:10s
      INFO[0000] Starting HTTP producer garbage collection service  producer=http
      INFO[0000] http producer serving requests on tcp socket: :0  producer=http
      2018-09-10T11:59:39Z E! Error creating prometheus metric endpoint, err: listen tcp :61091: bind: address already in use
      

      And

      $ netstat -tulpn | grep 61091
      tcp6       0      0 :::61091                :::*                    LISTEN      9772/telegraf
      

      We can see that this port is hardcoded here: https://github.com/dcos/dcos/blob/225b085d6097a29784ee244fd92b1d360d072ab4/gen/dcos-config.yaml#L1321
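
The "address already in use" error in the log above can be reproduced with a small probe that tries to bind the same port. This is only a diagnostic sketch and is not part of Telegraf or DC/OS: the helper name `port_in_use` and the default host `127.0.0.1` are assumptions made for illustration, and it assumes a Linux host like the agent in the report.

```python
import errno
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding a TCP socket to (host, port) fails with EADDRINUSE.

    Hypothetical helper for illustration; it mirrors the bind step that
    fails in the Telegraf log, but it is not taken from Telegraf itself.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # No SO_REUSEADDR here, so an existing listener makes bind() fail.
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Any other error (for example, permission denied) is unexpected.
            raise
        return False
```

Calling `port_in_use(61091)` on the affected agent would be expected to return `True` for as long as the process shown by `netstat` keeps listening on the port.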

            People

            • Assignee: Philip Norman
            • Reporter: Branden Rolston
            • Team: Cluster Ops Team
            • Watchers: Branden Rolston, Daniel Baker, Lee Hambley (Inactive), Philip Norman
