  1. Marathon
  2. MARATHON-8314

Better document the dangers of canceling deployments

    Details

    • Type: Task
    • Status: Resolved
    • Priority: High
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: DC/OS 1.12.0
    • Component/s: Docs
    • Labels:

      Description

      Scope of this ticket is to better document unexpected edge cases that occur when canceling deployments.

      Please see COPS-3012 for more information, discussion, and examples. Description from that customer issue:

      The customer is deploying sets of Marathon apps via group deployment, where one or more of the apps depend on other apps.

      Sometimes, one of the dependencies will fail (e.g., because of an invalid image tag), so they have to go in and manually suspend, fix, and start the dependency.

      When they do this, the dependent app is reported as updated, but it has not actually been updated.

      For example, deploying the following:

      /a/1, using image alpine:3.3
      /a/2, using image alpine:3.3
      

      where /a/2 is dependent on /a/1.

      Now, try to do a group update with the following:

      /a/1, using image alpine:invalid
      /a/2, using image alpine:3.6
      

      (again, with the same dependency).
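
      As an illustration only, the two group definitions above could be written out as Marathon group payloads roughly like this. The ticket does not include the customer's actual definitions, so the `cmd`, `cpus`, `mem`, and `instances` values and the helper names `make_app` and `make_group` are assumptions made for this sketch. Payloads of this shape would typically be submitted to Marathon's `/v2/groups` endpoint.

```python
import json


def make_app(app_id, image, dependencies=()):
    """Build a minimal Marathon app definition.

    The cmd and resource values are placeholders, not taken from the ticket.
    """
    app = {
        "id": app_id,
        "cmd": "sleep 3600",
        "cpus": 0.1,
        "mem": 32,
        "instances": 1,
        "container": {"type": "DOCKER", "docker": {"image": image}},
    }
    if dependencies:
        # Marathon deploys the listed apps before this one.
        app["dependencies"] = list(dependencies)
    return app


def make_group(image_a1, image_a2):
    """Group /a with two apps, where /a/2 depends on /a/1."""
    return {
        "id": "/a",
        "apps": [
            make_app("/a/1", image_a1),
            make_app("/a/2", image_a2, dependencies=["/a/1"]),
        ],
    }


# Initial deployment: both apps use a valid image.
initial = make_group("alpine:3.3", "alpine:3.3")

# Group update: /a/1 gets an invalid tag, /a/2 gets a new valid tag.
update = make_group("alpine:invalid", "alpine:3.6")

if __name__ == "__main__":
    print(json.dumps(update, indent=2))
```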

      /a/1 will fail, as expected, and /a/2 will hang, as expected.

      However, if you suspend /a/1 (in order to fix it, for example), then /a/2 will be displayed as if the upgrade to the target version has completed. This shows up both in the API and in the DC/OS and Marathon UIs. If /a/1 is then fixed and redeployed, /a/2 will not be restarted, because Marathon assumes its target version is already fulfilled. This can only be worked around by, for example, adding a label to /a/2 so that Marathon restarts all of its tasks.
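
      A minimal sketch of the label workaround, assuming the app definition is handled as a plain dictionary before being sent back to Marathon. The helper `with_restart_label` and the label key `redeploy-trigger` are hypothetical names chosen for this example; the ticket only states that adding a label makes Marathon restart the tasks, so any key whose value changes should have the same effect.

```python
import copy
import time


def with_restart_label(app, label_key="redeploy-trigger", value=None):
    """Return a copy of the app definition with an added or changed label.

    The label key is a hypothetical name; the value defaults to the
    current Unix timestamp so that it differs on each call.
    """
    new_app = copy.deepcopy(app)
    labels = new_app.setdefault("labels", {})
    labels[label_key] = value if value is not None else str(int(time.time()))
    return new_app


# Placeholder definition for /a/2; only id, image, and dependency come from the ticket.
app_a2 = {
    "id": "/a/2",
    "container": {"type": "DOCKER", "docker": {"image": "alpine:3.6"}},
    "dependencies": ["/a/1"],
}

patched = with_restart_label(app_a2, value="1")

if __name__ == "__main__":
    print(patched["labels"])
```

      The original definition is left untouched, so the patched copy can be compared against it before it is submitted.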

            People

            • Assignee: nikitamelkozerov Nikita Melkozerov (Inactive)
            • Reporter: matthias.eichstedt Matthias Eichstedt
            • Team: Orchestration Team
            • Watchers: Aleksey Dukhovniy, Justin Lee (Inactive), Matthias Eichstedt
            • Reviewers: Matthias Eichstedt, Nikita Melkozerov (Inactive)