MARATHON-7989

Validation errors on root group in large clusters can cause Marathon to crash

    Details

      Description

      If an invalid app definition is posted that would make the root group invalid (for example, an application's ID matches the ID of an existing group), the entire root group is rendered as a string in the logs. This happens because we log the validation failure, and the failure references the root group, so the group and everything it contains are rendered along with it.
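      The effect can be illustrated with a small sketch. Marathon itself is written in Scala; the Python below is only a language-neutral illustration, and the Group and ValidationFailure classes are hypothetical stand-ins, not Marathon's real types. It shows how the size of a logged failure grows with the size of the group it references.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Hypothetical stand-in for a group; not Marathon's real class."""
    id: str
    apps: list = field(default_factory=list)
    groups: list = field(default_factory=list)


@dataclass
class ValidationFailure:
    """Hypothetical failure record that keeps a reference to the invalid value."""
    message: str
    value: object


def build_root(app_count: int) -> Group:
    """Build a flat root group containing app_count app IDs."""
    return Group("/", apps=[f"/app-{i}" for i in range(app_count)])


def logged_size(app_count: int) -> int:
    """Return the length of the log line produced for a failure on the root group."""
    failure = ValidationFailure(
        "app ID conflicts with an existing group", build_root(app_count)
    )
    # Formatting the failure also formats the whole group it references.
    return len(f"Validation failed: {failure}")


small = logged_size(10)
large = logged_size(10_000)
print(small, large)
```

      In this sketch the log line grows roughly linearly with the number of apps in the root group, which is the behaviour described above.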

      When the root group is large, this causes Marathon to crash: the resulting resource starvation leads to a timeout with the ZooKeeper master.

      Acceptance criteria:

      As a user,
      when I post an invalid app definition that collides with an existing group,
      then the log error message should output only a summary of the root group (and be less than several hundred bytes)
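
      One way to meet the acceptance criteria is to log a bounded summary of the group instead of its full string form. The sketch below is an assumption-laden illustration in Python (Marathon itself is written in Scala): the Group class, the summarize function, and the 300-byte budget are all hypothetical, since the ticket only asks for "less than several hundred bytes".

```python
from dataclasses import dataclass, field

# Assumed budget; the ticket only says "less than several hundred bytes".
MAX_SUMMARY_BYTES = 300


@dataclass
class Group:
    """Hypothetical stand-in for a group; not Marathon's real class."""
    id: str
    apps: list = field(default_factory=list)
    groups: list = field(default_factory=list)


def count_apps(group: Group) -> int:
    """Count apps in this group and all nested groups."""
    return len(group.apps) + sum(count_apps(g) for g in group.groups)


def count_groups(group: Group) -> int:
    """Count nested groups at every depth (the group itself is not counted)."""
    return len(group.groups) + sum(count_groups(g) for g in group.groups)


def summarize(group: Group) -> str:
    """Render only the ID and counts, capped at MAX_SUMMARY_BYTES bytes of UTF-8."""
    text = (
        f"Group(id={group.id!r}, apps={count_apps(group)}, "
        f"groups={count_groups(group)})"
    )
    encoded = text.encode("utf-8")
    if len(encoded) > MAX_SUMMARY_BYTES:
        # Leave room for the ellipsis; drop any partially cut character.
        head = encoded[: MAX_SUMMARY_BYTES - 3].decode("utf-8", errors="ignore")
        text = head + "..."
    return text


root = Group(
    "/",
    apps=[f"/app-{i}" for i in range(10_000)],
    groups=[Group("/prod", apps=["/prod/a", "/prod/b"])],
)
print(summarize(root))
```

      The summary length depends only on the ID and two counters, so it stays small no matter how many apps the root group holds; the explicit cap guards against unusually long IDs.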

            People

            • Assignee:
              tharper Tim Harper
              Reporter:
              tharper Tim Harper
              Team:
              Orchestration Team