  1. Marathon
  2. MARATHON-1988

Unable to re-deploy when constraint is unfulfilled

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Medium
    • Resolution: Cannot Reproduce
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Deployments
    • Labels:

      Description

      Similar to MGI-2810.

      My use case is to provide an edge router on all nodes. To do this I use a UNIQUE hostname constraint and set the number of instances to the number of nodes.

      In one of my tests I only use a single VM, but leave the number of instances at 3 (for example), so the UNIQUE constraint can never be fully satisfied. If one of the containers dies during deployment, it will not be restarted until the constraint is satisfied, i.e. never.
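      The arithmetic behind "never" can be sketched as follows. This is a simplified model and not Marathon's scheduler: it assumes the only limit is the UNIQUE hostname constraint (at most one task per distinct host) and ignores CPU and memory offers. The function name and the host name "vm-1" are made up for illustration.

```python
def placeable_instances(requested: int, hosts: list[str]) -> tuple[int, int]:
    """Return (placed, pending) for an app constrained by ["hostname", "UNIQUE"].

    Simplified model: resources are ignored, only the constraint is applied.
    """
    # UNIQUE on hostname allows at most one task per distinct host.
    capacity = len(set(hosts))
    placed = min(requested, capacity)
    return placed, requested - placed


if __name__ == "__main__":
    # One VM, three requested instances: two can never be placed.
    print(placeable_instances(3, ["vm-1"]))
```

      With a single host, two of the three requested instances stay pending, so a deployment that waits for the full instance count cannot finish on its own.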

      To reproduce, create a Mesos/Marathon cluster (tested with 15.3) with fewer than three worker nodes. Use the following JSON:

      {
        "id": "/example",
        "groups": [
          {
            "id": "/example/constraint",
            "apps": [
              {
                "id": "/example/constraint/test",
                "cpus": 0.1,
                "mem": 128.0,
                "cmd": "sleep 9999999",
                "container": {
                  "type": "DOCKER",
                  "docker": {
                    "image": "alpine",
                    "network": "BRIDGE"
                  }
                },
                "instances": 3,
                "constraints": [
                  [
                    "hostname",
                    "UNIQUE"
                  ]
                ]
              },
              {
                "id": "/example/constraint/fail",
                "cpus": 0.1,
                "mem": 128.0,
                "cmd": "sleep 1 ; exit 0",
                "container": {
                  "type": "DOCKER",
                  "docker": {
                    "image": "alpine",
                    "network": "BRIDGE"
                  }
                }
              }
            ]
          }
        ]
      }
      

      Then create the group (and the two apps it contains) with POST /v2/groups.
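      A minimal sketch of that POST using only the Python standard library. The base URL (http://localhost:8080) is an assumption and must be adjusted for the cluster in question; the helper names are illustrative and not part of Marathon.

```python
import json
import urllib.request

# Assumption: Marathon is reachable at this address; adjust for your cluster.
MARATHON_URL = "http://localhost:8080"


def build_group_request(group: dict, base_url: str = MARATHON_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /v2/groups request for the given group definition."""
    body = json.dumps(group).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v2/groups",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_group(group: dict, base_url: str = MARATHON_URL) -> dict:
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(build_group_request(group, base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

      To use it, load the JSON above into a dict (for example with json.load on a saved file) and pass it to create_group.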

      Then drill down to view the application. You will see that the "fail" task is stuck waiting to be deployed, even though nothing prevents it from being redeployed.
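      The same check can be done without the UI by inspecting the JSON returned by GET /v2/apps/{appId}. The sketch below assumes the response carries the fields "instances", "tasksRunning" and "deployments" under an "app" key; verify these against the Marathon version in use. The helper names and the sample payload are made up for illustration and are not output captured from a real cluster.

```python
def missing_instances(app_response: dict) -> int:
    """Number of requested instances that are not currently running."""
    # Accept either the full response ({"app": {...}}) or the bare app object.
    app = app_response.get("app", app_response)
    return max(0, app.get("instances", 0) - app.get("tasksRunning", 0))


def looks_stuck(app_response: dict) -> bool:
    """True if the app is short of instances while a deployment is still open."""
    app = app_response.get("app", app_response)
    return missing_instances(app_response) > 0 and bool(app.get("deployments"))


if __name__ == "__main__":
    # Hypothetical payload shaped like the "fail" app in this report.
    sample = {
        "app": {
            "id": "/example/constraint/fail",
            "instances": 1,
            "tasksRunning": 0,
            "deployments": [{"id": "hypothetical-deployment-id"}],
        }
    }
    print(missing_instances(sample), looks_stuck(sample))
```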

            People

            • Assignee: Unassigned
            • Reporter: philwinder
            • Team: Orchestration Team
            • Watchers: Jason Gilanfarr (Inactive), Matthias Eichstedt