Details

    • Sprint:
      Marathon Sprint 1.10-3, Marathon Sprint 1.10-4, Marathon Sprint 1.10-5, Marathon Sprint 1.10-6, Marathon Sprint 1.10-7, Marathon Sprint 1.10-8
    • Story Points:
      3

      Description

      Now that we've migrated to Curator's LeaderLatch for leader election, our leader election code appears to be doing WAY more than necessary. Some things I would like to see fixed:

      • Remove all locks. As an example, LeaderLatch blocks, and this led to a deadlock. With the amount of thread blocking we're doing, that should be no surprise.
      • AnythingBase is almost always an anti-pattern. Get rid of ElectionServiceBase. If we need a common interface, define it. Prefer to put common behavior in helper methods rather than inherit it, etc.
      • Perform blocking IO in a thread pool designed for blocking IO. Don't block the global fork join pool.
      • Rip out state-machine-ish logic. When you get offered leadership, great, you're the leader. If you lose, then stand by. If you transition from having leadership to not having leadership, CRASH. Curator's LeaderLatch handles perpetually trying to obtain leadership in the event that the leader goes missing. It does not need to be "restarted" periodically.
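
To illustrate the point about blocking IO, here is a minimal sketch of running a blocking call on a dedicated pool instead of the global fork join pool. It uses only the Scala and Java standard libraries. The names BlockingPoolSketch, blockingEc and blockingCall are hypothetical and do not come from our code base, and the cached thread pool is just one reasonable choice, not a recommendation for specific sizing.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BlockingPoolSketch {
  // Dedicated pool for blocking calls (assumption: a cached pool that grows on demand).
  // Blocking here does not starve the threads of ExecutionContext.global.
  val blockingEc = ExecutionContext.fromExecutorService(Executors.newCachedThreadPool())

  // Hypothetical stand-in for a blocking call, e.g. waiting on ZooKeeper.
  // Returns the name of the thread that executed it.
  def blockingCall(): String = {
    Thread.sleep(100)
    Thread.currentThread().getName
  }

  def main(args: Array[String]): Unit = {
    // Pass the dedicated context explicitly so the global pool is never used.
    val result: Future[String] = Future(blockingCall())(blockingEc)

    // Awaiting on the main thread is only for the demo; real code would compose the Future.
    val threadName = Await.result(result, 5.seconds)
    println(s"blocking call ran on: $threadName")

    blockingEc.shutdown()
  }
}
```

Passing the execution context explicitly, rather than importing an implicit one, makes it hard to land on the global pool by accident.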

      The following Scala script illustrates how concise our leader election module should be:

      https://gist.github.com/timcharper/22a1bca65e9a8268225dcfb97420cdf7
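
The gist is not reproduced here. As a rough sketch of the same idea (not the gist's actual contents), the following shows a LeaderLatch with a listener that starts leader duties when elected and crashes the process when leadership is lost. It assumes Apache Curator is on the classpath and a ZooKeeper instance is reachable. The connection string, latch path, participant id and the object name LeaderElectionSketch are all hypothetical.

```scala
import java.util.concurrent.{CountDownLatch, Executors}
import org.apache.curator.framework.CuratorFrameworkFactory
import org.apache.curator.framework.recipes.leader.{LeaderLatch, LeaderLatchListener}
import org.apache.curator.retry.ExponentialBackoffRetry

object LeaderElectionSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical connection string, latch path and participant id.
    val client = CuratorFrameworkFactory.newClient(
      "localhost:2181", new ExponentialBackoffRetry(1000, 3))
    val latch = new LeaderLatch(client, "/example/leader", "instance-1")

    // Run listener callbacks on their own thread, not on a shared pool.
    val callbackExecutor = Executors.newSingleThreadExecutor()

    latch.addListener(new LeaderLatchListener {
      // Called when this instance acquires leadership.
      override def isLeader(): Unit =
        println("elected: starting leader duties")

      // Called when this instance loses leadership it previously held.
      // No recovery logic: crash and let a supervisor restart the process.
      override def notLeader(): Unit = {
        System.err.println("lost leadership: crashing")
        Runtime.getRuntime.halt(1)
      }
    }, callbackExecutor)

    client.start()
    latch.start() // the latch keeps contending for leadership on its own

    // Keep the main thread parked; standby instances simply wait here.
    new CountDownLatch(1).await()
  }
}
```

There are no locks, no base class and no restart timer in this sketch. The only state is whatever the latch tracks internally.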

              People

              • Assignee:
                ivanchernetsky Ivan Chernetsky
                Reporter:
                tharper Tim Harper
                Team:
                Orchestration Team
                Watchers:
                Jason Gilanfarr (Inactive), Matthias Eichstedt, tgermain, Viktor Harutyunyan
