Marathon / MARATHON-8174

Remove timeout for root-group changes

    Details

    • Type: Task
    • Status: Resolved
    • Priority: Medium
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: RI-1, DC/OS 1.11.3
    • Component/s: None
    • Labels:

      Description

      Currently, Marathon times out all deployment-creation requests after the ZooKeeper timeout setting (30 seconds by default). This is an issue when Marathon deployments get backed up: even though we time out the HTTP request, we still schedule the work to process that deployment.

      Most clients are likely to retry a deployment in the event of a timeout. Each retry schedules yet another deployment, putting even more load on an already burdened deployment queue.

      As such, we propose removing server-side timeouts for deployment creation. To do this reasonably, we need to modify our HTTP layer so that it does not block threads while a deployment takes, in worst-case scenarios, multiple minutes to return. It will then be on the customer to increase the proxy timeout and the read timeout of associated clients. We should also make the proxying logic asynchronous for similar reasons.
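
The retry amplification described above can be illustrated with a small simulation. This is only a sketch in Python using `asyncio`, not Marathon's actual (Scala) code: `DeploymentQueue`, `create_with_timeout`, and the two `request_*` functions are hypothetical names, and the short sleep values are stand-ins for a slow deployment and the 30-second timeout.

```python
import asyncio

PROCESSING_SECONDS = 0.05  # assumed stand-in for a slow, backed-up deployment
SERVER_TIMEOUT = 0.02      # assumed stand-in for the 30-second server-side timeout


class DeploymentQueue:
    """Hypothetical queue that processes one deployment at a time."""

    def __init__(self):
        self.scheduled = 0
        self._lock = asyncio.Lock()

    async def deploy(self, name):
        self.scheduled += 1  # the work is scheduled as soon as the request arrives
        async with self._lock:
            await asyncio.sleep(PROCESSING_SECONDS)  # non-blocking wait, no thread is held
        return f"{name} deployed"


async def create_with_timeout(queue, name):
    """Times out the request but leaves the scheduled work running."""
    work = asyncio.ensure_future(queue.deploy(name))
    try:
        # shield() keeps the deployment alive when the request itself times out
        return await asyncio.wait_for(asyncio.shield(work), SERVER_TIMEOUT)
    except asyncio.TimeoutError:
        return None


async def request_with_server_timeout(attempts=3):
    """A client that retries whenever the server reports a timeout."""
    queue = DeploymentQueue()
    for _ in range(attempts):
        result = await create_with_timeout(queue, "app")
        if result is not None:
            return result, queue.scheduled
    return "gave up", queue.scheduled


async def request_without_server_timeout():
    """The proposed behaviour: the request waits until the deployment returns."""
    queue = DeploymentQueue()
    result = await queue.deploy("app")
    return result, queue.scheduled


if __name__ == "__main__":
    print(asyncio.run(request_with_server_timeout()))
    print(asyncio.run(request_without_server_timeout()))
```

In this simulation, the retrying client never sees a result, yet three deployments are scheduled for one logical request. Without the server-side timeout, a single deployment is scheduled and the caller receives its result, at the cost of a longer wait on the client side.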

              People

              • Assignee: Tim Harper (tharper)
              • Reporter: Tim Harper (tharper)
              • Team: Orchestration Team
              • Watchers: Tim Harper