Details

      Description

      There is a COPS ticket in which the UI and the CLI/API report different states. I'll add details below on why they differ, but investigating it led to the discovery of a bug when using pods with persistent volumes (PV).

      DCOS 1.11

      Here is the setup. The bug appears to occur when a pod is updated: a pod is already deployed, and then it is updated. It doesn't matter whether the original pod used a PV, as long as the updated pod does. I've put together a reproduction with that in mind.

       

      Step 1: dcos marathon pod add works.json

      Wait for the deployment to finish (it is useful to watch the DCOS UI, which illustrates the failure).

      Step 2 (after the UI indicates "running"):

      dcos marathon pod update sleep < works-3.json

      The file is called works-3.json because in isolation (not as an update) this pod definition works. As part of an update, here is what you should see:

      1. The DCOS UI shows the service in deployment forever.
      2. The Mesos UI often shows 3 deployments for 1 pod (this is inconsistent).
      3. curl DCOS at `service/marathon/v2/groups?_timestamp=1517957446428&embed=group.groups&embed=group.apps&embed=group.pods&embed=group.apps.deployments&embed=group.apps.counts&embed=group.apps.tasks&embed=group.apps.taskStats&embed=group.apps.lastTaskFailure`. You will get tasks in Pending.
      4. `dcos marathon pod show sleep` will also show pending tasks.
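To make check 4 easier to repeat, here is a minimal Python sketch that tallies instance states from the JSON printed by `dcos marathon pod show sleep`. It assumes the output is a pod status object with an `instances` array whose entries carry a `status` string; those field names and the `STABLE` value are my assumptions and are not confirmed by the attached logs. The helper names are hypothetical.

```python
from collections import Counter


def summarize_instances(pod_status):
    """Count pod instances by their reported status string.

    Assumes pod_status is a dict with an "instances" list and that each
    instance has a "status" field (assumed field names).
    """
    instances = pod_status.get("instances", [])
    return Counter(inst.get("status", "UNKNOWN") for inst in instances)


def has_unsettled_instances(pod_status, settled_state="STABLE"):
    """Return True if any instance reports a state other than settled_state."""
    counts = summarize_instances(pod_status)
    return any(state != settled_state for state in counts)


# Hand-written sample for illustration only, not real output from the cluster.
sample = {
    "id": "/sleep",
    "instances": [{"status": "PENDING"}, {"status": "STABLE"}],
}
print(dict(summarize_instances(sample)))
print(has_unsettled_instances(sample))
```

Saving the CLI output to a file and passing `json.load(...)` of that file to `summarize_instances` would give a quick view of how many instances remain pending after the update.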

       

      Attached the JSON files.

      Attached an image of the Mesos UI with 6 tasks (3 deployments of 1 pod with 2 containers; it should be 2 tasks).

      Attached marathon.log.gz, which is a log from the start of Marathon.

      Attached marathon2.log, which is a Marathon log covering just the last window of testing.

       

      The UI uses the /v2/groups endpoint with embedded elements to determine whether a pod is fully deployed: v2/groups?_timestamp=1517957446428&embed=group.groups&embed=group.apps&embed=group.pods&embed=group.apps.deployments&embed=group.apps.counts&embed=group.apps.tasks&embed=group.apps.taskStats&embed=group.apps.lastTaskFailure

      This should return the tasks as running. It seems related; if we need a second JIRA, let me know.
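For anyone scripting the same request the UI makes, the following Python sketch rebuilds the groups URL with its repeated `embed` parameters. The path and embed values are copied from the query above; the cluster base URL is a placeholder, and the function name is hypothetical. Authentication is not handled here.

```python
from urllib.parse import urlencode

# Embed values taken from the query string the UI sends.
EMBEDS = [
    "group.groups",
    "group.apps",
    "group.pods",
    "group.apps.deployments",
    "group.apps.counts",
    "group.apps.tasks",
    "group.apps.taskStats",
    "group.apps.lastTaskFailure",
]


def groups_url(cluster_url, timestamp_ms=None):
    """Build the /v2/groups URL with one embed parameter per entry.

    cluster_url is a placeholder for the DCOS cluster address (assumption).
    timestamp_ms mirrors the _timestamp parameter seen in the UI request.
    """
    params = []
    if timestamp_ms is not None:
        params.append(("_timestamp", str(timestamp_ms)))
    params.extend(("embed", value) for value in EMBEDS)
    base = cluster_url.rstrip("/") + "/service/marathon/v2/groups"
    return base + "?" + urlencode(params)


print(groups_url("https://cluster.example", 1517957446428))
```

The resulting URL can be passed to curl or any HTTP client to compare what the UI sees with what `dcos marathon pod show sleep` reports.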

       

        Attachments

        1. marathon.log.gz
          374 kB
        2. marathon2.log
          576 kB
        3. Screen Shot 2018-02-06 at 5.19.16 PM.png
          145 kB
        4. works.json
          0.9 kB
        5. works-3.json
          1 kB

            People

            • Assignee:
              ivanchernetsky Ivan Chernetsky
              Reporter:
              ken Ken Sipe
              Team:
              Orchestration Team
              Watchers:
              Ivan Chernetsky, Ken Sipe, Matthias Eichstedt
