Marathon / MARATHON-8272

Bump Mesos requirement and remove resident task failure status surrogate that is no longer required



      Before Mesos 1.4.0, an agent came back from a reboot with a different agent ID. As a result, unreachable tasks on that node never became terminal, which blocked the recovery of resident tasks indefinitely. In MARATHON-2311 we added a mechanism that treats an offer containing the reservations for a task as an indication that the task is gone.

      Mesos 1.4.0 fixes this, and I have confirmed that terminal statuses are now received for unreachable tasks when rebooted hosts come back online. The surrogate from MARATHON-2311 is therefore no longer required.

      Acceptance criteria

      Resident tasks are restarted for rebooted agents

      Given resident task A running on agent X
      When I stop agent X
      Then the task should be reported as UNREACHABLE
      When I fail over the master and then reboot agent X
      Then I should get a terminal task status update for task A when agent X comes back online
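
      The scenario above can be sketched as a simplified model of when a resident task stops being blocked. This is an illustration only, not Marathon's implementation: the state names follow the Mesos TaskState enum, the set is a non-exhaustive subset, and `can_relaunch_resident_task` is a hypothetical helper. The ticket does not say which terminal state Mesos reports, so TASK_FAILED in the usage lines is just a stand-in.

```python
# Simplified model of the acceptance scenario. State names follow the Mesos
# TaskState enum; this subset and the helper below are assumptions made for
# illustration, not Marathon code.
TERMINAL_STATES = frozenset({
    "TASK_FINISHED",
    "TASK_FAILED",
    "TASK_KILLED",
    "TASK_ERROR",
    "TASK_DROPPED",
    "TASK_GONE",
    "TASK_GONE_BY_OPERATOR",
})


def can_relaunch_resident_task(status_updates):
    """Return True once any terminal status has been seen for the task.

    TASK_UNREACHABLE is deliberately not terminal: while it is the latest
    status, the resident task stays blocked.
    """
    return any(state in TERMINAL_STATES for state in status_updates)


# Agent X stopped: task A is only UNREACHABLE, so recovery is still blocked.
blocked = can_relaunch_resident_task(["TASK_RUNNING", "TASK_UNREACHABLE"])

# Agent X rebooted and came back: a terminal update arrives (stand-in state).
recovered = can_relaunch_resident_task(
    ["TASK_RUNNING", "TASK_UNREACHABLE", "TASK_FAILED"]
)
```

      With the terminal update arriving from Mesos itself, the reservation-offer surrogate is not needed to unblock the task in this model.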

      Mesos minimum version is bumped to at least 1.4.0

      Given a Mesos master running 1.3.x
      When I start Marathon
      Then Marathon should crash after connecting to Mesos and discovering that the Mesos version is earlier than 1.4.0
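
      A minimal sketch of the version gate described above, assuming the master reports its version as a dotted string such as "1.3.2". The names `parse_version` and `require_minimum_mesos_version` are hypothetical, and raising an exception stands in for however Marathon actually terminates.

```python
import re

# Minimum Mesos version required by this ticket.
MINIMUM_MESOS_VERSION = (1, 4, 0)


def parse_version(version):
    """Parse the leading 'major.minor.patch' of a version string into ints.

    Any suffix (for example '-rc1') is ignored. Comparing integer tuples
    avoids the string-comparison trap where '1.10.0' sorts before '1.9.0'.
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Mesos version: {version!r}")
    return tuple(int(part) for part in match.groups())


def require_minimum_mesos_version(reported, minimum=MINIMUM_MESOS_VERSION):
    """Raise RuntimeError if the reported Mesos version is below the minimum.

    In this sketch the caller is assumed to let the error end the process.
    """
    if parse_version(reported) < minimum:
        wanted = ".".join(str(n) for n in minimum)
        raise RuntimeError(
            f"Mesos {reported} is not supported; {wanted} or later is required"
        )
```

      Calling `require_minimum_mesos_version("1.3.2")` raises, while `"1.4.0"` and later versions pass silently.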





              • Assignee:
                tharper Tim Harper
                Orchestration Team
                Benjamin Mahler, Matthias Eichstedt, Meng Zhu, Tim Harper, Vinod Kone
              • Watchers:
                5


                • Created: