DC/OS / DCOS_OSS-3595

dcos-net Fails to Recurse Upstream Resolvers


    • Type: Bug
    • Status: Resolved
    • Priority: High
    • Resolution: Done
    • Affects Version/s: DC/OS 1.11.1
    • Fix Version/s: None
    • Component/s: dcos-net, spartan
    • Labels:


      We have two DC/OS clusters, one in each of our AWS accounts. After upgrading to 1.11.0 in development and 1.11.1 in production, we noticed that our services were occasionally taking much longer (up to several seconds) to respond.

      After a day of digging, one of our engineers found that, every few seconds, DNS lookups to the upstream resolvers in the private subnets of the VPCs were failing to resolve:

      [root@dcos-agent]# dig +retry=0 @ in a <redacted>.us-east-1.<redacted>.b5l
      ; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.3 <<>> +retry=0 @ in a <redacted>.us-east-1.<redacted>.b5l
      ; (1 server found)
      ;; global options: +cmd
      ;; connection timed out; no servers could be reached

      This happens both for the private zone in the VPC (example request above) and for publicly resolvable domains:

      [root@dcos-agent centos]# dig +retry=0 @ in a google.com
      ; <<>> DiG 9.9.4-RedHat-9.9.4-29.el7_2.3 <<>> +retry=0 @ in a google.com
      ; (1 server found)
      ;; global options: +cmd
      ;; connection timed out; no servers could be reached
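Because the failure is intermittent, a single dig run can easily miss it. The following is a minimal sketch of a probe that repeats the same kind of query (one UDP A-record lookup, no retries) and reports how often no reply arrives. It is not part of DC/OS. The function names, the 5-second timeout, and the one-second interval are assumptions made for illustration, and the server address has to be supplied by the caller because it is redacted in this report.

```python
import socket
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet for an A record in class IN."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe_once(server: str, name: str, timeout: float = 5.0, port: int = 53) -> bool:
    """Send one UDP query without retrying; True if any reply arrives in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, port))
            sock.recvfrom(512)  # the reply is not parsed, only its arrival counts
            return True
        except OSError:  # covers timeouts and unreachable-network errors
            return False


def summarize(results):
    """Return (failures, total, failure_ratio) for a list of booleans."""
    total = len(results)
    failures = results.count(False)
    return failures, total, (failures / total if total else 0.0)


def run_probe(server, name, attempts=60, interval=1.0, probe=probe_once):
    """Probe repeatedly and summarize how many attempts got no reply."""
    results = []
    for _ in range(attempts):
        results.append(probe(server, name))
        time.sleep(interval)
    return summarize(results)
```

Calling `run_probe(server, "google.com")` with the address dig was pointed at returns the number of unanswered queries, the number of attempts, and the failure ratio. Running it against the local resolver and again directly against one of the upstream resolvers may help show on which hop the timeouts occur.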

      I've verified our private resolvers are configured properly in dcos-dns.json:

        "upstream_resolvers": ["<redacted>", "<redacted>", "<redacted>"],
        "udp_port": 53,
        "tcp_port": 53,
        "bind_ip_blacklist": [""],
        "forward_zones": {}

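As a cross-check on the configuration above, here is a rough sketch of a sanity check for the resolver settings. It assumes the fragment is wrapped in braces to form a complete JSON object, that each upstream resolver should be a plain IP address, and that both ports should be integers in the valid range. The function name `check_dns_config` is hypothetical, and these rules are illustrative rather than the validation dcos-net itself performs.

```python
import ipaddress
import json


def check_dns_config(text: str) -> list:
    """Return a list of human-readable problems found in the config text."""
    problems = []
    config = json.loads(text)

    resolvers = config.get("upstream_resolvers", [])
    if not resolvers:
        problems.append("upstream_resolvers is empty or missing")
    for entry in resolvers:
        try:
            # str() makes non-string entries fail instead of being read as integers.
            ipaddress.ip_address(str(entry))
        except ValueError:
            problems.append(f"upstream resolver is not an IP address: {entry!r}")

    for key in ("udp_port", "tcp_port"):
        port = config.get(key)
        if not isinstance(port, int) or not 0 < port < 65536:
            problems.append(f"{key} is not a valid port: {port!r}")

    return problems
```

An empty list means none of these basic checks failed. That only rules out a malformed file. It does not show that the resolvers are reachable from the node.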
      On my leading master and on the agent where I ran the dig queries, the systemd unit dcos-net.service has no logs in the journal. I'm running the default logging configuration for DC/OS, and I'm not sure where those logs are sent by default. They would be handy for debugging this further.




            • Assignee:
              sergeyurbanovich Sergey Urbanovich
              jeffmalnick Jeff Malnick
              Networking Team
              Deepak Goel, Jack Angers, Jeff Malnick, Judith Malnick (Inactive), Sergey Urbanovich


              • Created: