MARATHON-8272

Bump Mesos requirement and remove resident task failure status surrogate that is no longer required

    Details

      Description

      Before Mesos 1.4.0, an agent came back up with a new agent ID after a reboot. As a result, unreachable tasks on that node never became terminal, which blocked the recovery of resident tasks indefinitely. In MARATHON-2311 we added a mechanism that treats an offer containing reservations for a task as an indication that the task is gone.

      This is fixed in Mesos 1.4.0. I have confirmed that terminal statuses are now received for unreachable tasks when rebooted hosts come back online, so the MARATHON-2311 mechanism is no longer required.

      Acceptance criteria

      Resident tasks are restarted for rebooted agents

      Given resident task A running on agent X
      When I stop agent X
      Then the task should be reported as UNREACHABLE
      When I fail over the master and then reboot agent X
      Then I should get a terminal task status update for task A when agent X comes back online
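
      The scenario above could be checked with a small helper along these lines. This is only a sketch: the helper name `first_terminal_after_unreachable` is hypothetical, the set of terminal states is my reading of the Mesos `TaskState` enum and should be verified against the Mesos version in use, and `TASK_FAILED` in the usage example is chosen arbitrarily, since this ticket does not say which terminal status Mesos sends.

```python
from typing import Iterable, Optional

# Assumption: states treated as terminal, based on the Mesos TaskState enum.
# TASK_UNREACHABLE is deliberately absent because it is not terminal.
TERMINAL_STATES = frozenset({
    "TASK_FINISHED",
    "TASK_FAILED",
    "TASK_KILLED",
    "TASK_ERROR",
    "TASK_LOST",
    "TASK_DROPPED",
    "TASK_GONE",
    "TASK_GONE_BY_OPERATOR",
})


def first_terminal_after_unreachable(states: Iterable[str]) -> Optional[str]:
    """Return the first terminal state seen after TASK_UNREACHABLE, or None.

    `states` is the ordered sequence of status updates received for one task.
    """
    seen_unreachable = False
    for state in states:
        if state == "TASK_UNREACHABLE":
            seen_unreachable = True
        elif seen_unreachable and state in TERMINAL_STATES:
            return state
    return None


# Hypothetical update sequence for task A: agent X stops, then is rebooted.
updates = ["TASK_RUNNING", "TASK_UNREACHABLE", "TASK_FAILED"]
terminal = first_terminal_after_unreachable(updates)
```

      A `None` result would mean the task is still stuck as unreachable, which is the pre-1.4.0 behaviour described in the ticket.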

      Mesos minimum version is bumped to at least 1.4.0

      Given a Mesos master running 1.3.x
      When I start Marathon
      Then Marathon should crash after connecting to Mesos and discovering that it is not 1.4.0 or later
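
      A minimal sketch of the version gate described above, assuming the master reports its version as a dotted string such as "1.3.2". The names `parse_version`, `check_mesos_version`, and `UnsupportedMesosVersion` are hypothetical and are not Marathon's actual API. Marathon itself is not written in Python, so this only illustrates the comparison logic.

```python
import re

# Minimum supported Mesos version, per this ticket.
MIN_MESOS_VERSION = (1, 4, 0)


class UnsupportedMesosVersion(Exception):
    """Raised when the connected Mesos master is older than the minimum."""


def parse_version(version: str) -> tuple:
    """Parse the leading 'major.minor.patch' of a version string into ints."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Mesos version: {version!r}")
    return tuple(int(part) for part in match.groups())


def check_mesos_version(reported: str) -> None:
    """Fail fast if the reported version is below MIN_MESOS_VERSION."""
    # Tuples compare element by element, so (1, 10, 0) > (1, 4, 0) as intended.
    if parse_version(reported) < MIN_MESOS_VERSION:
        required = ".".join(str(part) for part in MIN_MESOS_VERSION)
        raise UnsupportedMesosVersion(
            f"Mesos {reported} is not supported; {required} or later is required"
        )
```

      Comparing parsed integer tuples rather than raw strings avoids ordering "1.10.0" before "1.4.0".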

              People

              • Assignee:
                tharper Tim Harper
                Reporter:
                tharper Tim Harper
                Team:
                Orchestration Team
                Watchers:
                Benjamin Mahler, Matthias Eichstedt, Meng Zhu, Tim Harper, Vinod Kone