  1. Marathon
  2. MARATHON-6896

Asynchronously call sys.exit() to avoid deadlock due to the JVM shutdown hooks

    Details

    • Type: Task
    • Status: Resolved
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:

      Description

      I finally managed to figure out why, when suicide() was called by disconnected(), Marathon (i.e. the JVM) would not exit and required a kill -9 to get rid of the JVM process.

      The issue was that suicide() called sys.exit(), which initiates the JVM shutdown sequence. Part of that sequence is running the shutdown hooks, and sys.exit() blocks until all of them have run and finished. One of the shutdown hooks shuts down Marathon via the mesosphere.chaos.App trait (which is used in Main.scala). It stops all the services started with mesosphere.chaos.App.run(). Main.scala runs two services: HttpService and MarathonSchedulerService. Thus, the shutdown hook calls MarathonSchedulerService.shutDown(), which then calls MarathonSchedulerService.triggerShutdown(). You will notice that triggerShutdown() tries to stop the driver (i.e. MesosSchedulerDriver). But it can't stop the driver, because the driver thread is currently blocked in sys.exit() inside the suicide() method. Hence we have a deadlock.
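
      The circular wait described above can be modelled in a few lines. This is only a toy sketch, not Marathon code: the object name DeadlockModel, the method hookSeesDriverBlocked, and the latch are hypothetical stand-ins. The latch plays the role of the JVM waiting for shutdown hooks inside sys.exit(), and the bounded join() plays the role of the hook waiting for the driver thread to stop.

```scala
import java.util.concurrent.CountDownLatch

// Toy model of the cycle (all names here are hypothetical, not Marathon's):
//   driver thread -> blocked in "sys.exit()" until the hooks are done
//   shutdown hook -> waiting for the driver thread to stop
object DeadlockModel {

  /** Returns true if the "hook" still sees the driver thread alive after waiting `waitMs`. */
  def hookSeesDriverBlocked(waitMs: Long): Boolean = {
    // Stands in for "all shutdown hooks have finished".
    val hooksDone = new CountDownLatch(1)

    // Stands in for the driver thread that called suicide() -> sys.exit():
    // it cannot proceed until the hooks are done.
    val driver = new Thread(new Runnable {
      def run(): Unit = hooksDone.await()
    }, "driver")
    driver.start()

    // Stands in for the hook stopping the driver. The wait is bounded only so
    // that this model terminates; an unbounded wait would never return.
    driver.join(waitMs)
    val stillBlocked = driver.isAlive

    // Break the cycle artificially so the model can clean up.
    hooksDone.countDown()
    driver.join()
    stillBlocked
  }
}
```

      In this model the hook cannot observe the driver finishing, because the driver is itself waiting for the hook. With an unbounded wait on both sides, neither would ever proceed.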

      The simple solution is to make the sys.exit() call in its own thread so that it doesn't block the driver thread. Everything then shuts down cleanly.
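
      A minimal sketch of that idea follows. It is an illustration under stated assumptions, not the actual change made in Marathon: the object name AsyncExit, the method exitAsync, the thread name, and the exit code used in the usage note are all hypothetical. The exit action is a parameter only so that the sketch can be exercised without terminating the JVM.

```scala
// Hypothetical helper, not Marathon's API: runs the exit call on its own
// thread so the calling thread (e.g. the driver thread) is never blocked
// while the JVM runs its shutdown hooks.
object AsyncExit {

  /**
   * Starts a new thread that calls `exit(code)` and returns immediately.
   * `exit` defaults to sys.exit.
   */
  def exitAsync(code: Int, exit: Int => Unit = c => sys.exit(c)): Thread = {
    val t = new Thread(new Runnable {
      def run(): Unit = exit(code)
    }, "async-exit")
    t.start()
    t
  }
}
```

      Under these assumptions, suicide() would call AsyncExit.exitAsync(9) in place of sys.exit(9) and return, leaving the driver thread free to be stopped by the shutdown hook.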

            People

            • Assignee:
              Unassigned
            • Reporter:
              GitHub_marc-barry Marc Barry (Inactive)
            • Team:
              Orchestration Team
            • Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved: