Details

    • Type: Bug
    • Status: Resolved
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Mesos Integration
    • Labels:

      Description

      With the current setting, Marathon retries killing a task after 5 seconds if it has not received a confirmation within that time (https://github.com/mesosphere/marathon/blob/a35f1c9f4f0dec7501de55efa72742510afe6d1c/src/main/scala/mesosphere/marathon/upgrade/StoppingBehavior.scala#L76).
      When killing a large number of tasks, it is very unlikely that all of them are killed within 5 seconds, so Marathon creates significant additional load on the Mesos master by re-sending kill requests for the many outstanding tasks over and over again. We should consider making the retry interval adapt to the number of outstanding tasks to be killed, or even partitioning the tasks to be killed.
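
      The two ideas above (an adaptive retry interval and partitioning the kills) could look roughly like the sketch below. It is an illustration only, written in Python rather than Marathon's Scala, and it is not Marathon's actual implementation. The names `adaptive_retry_delay` and `partition`, and the constants `MAX_RETRY_SECONDS` and `TASKS_PER_INTERVAL`, are hypothetical; only the 5 second base interval comes from this issue.

```python
from typing import Iterator, List, Sequence

BASE_RETRY_SECONDS = 5.0   # fixed retry interval described in this issue
MAX_RETRY_SECONDS = 60.0   # hypothetical cap so retries never stall indefinitely
TASKS_PER_INTERVAL = 100   # hypothetical: kills assumed to complete per base interval


def adaptive_retry_delay(outstanding: int) -> float:
    """Return a retry delay that grows with the number of outstanding kills."""
    if outstanding <= 0:
        return BASE_RETRY_SECONDS
    # Ceiling division: how many base intervals the backlog is assumed to need.
    intervals = -(-outstanding // TASKS_PER_INTERVAL)
    return min(BASE_RETRY_SECONDS * intervals, MAX_RETRY_SECONDS)


def partition(task_ids: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield the task ids in consecutive batches of at most batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(task_ids), batch_size):
        yield list(task_ids[start:start + batch_size])
```

      With these assumed constants, a backlog of 250 tasks would wait 15 seconds before the next retry instead of 5, and kill requests would be issued one batch at a time rather than for all outstanding tasks at once.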

            People

            • Assignee: Unassigned
            • Reporter: Jörg Schad (Inactive)
            • Team: Orchestration Team
            • Watchers: Matthias Eichstedt