• Type: Task
    • Status: Resolved
    • Priority: High
    • Resolution: Done
    • Affects Version/s: DC/OS 1.10.0
    • Fix Version/s: DC/OS 1.10.0
    • Component/s: dcos-vagrant
    • Labels:


      DCOS_OSS-1467 revealed a race in how dcos-vagrant defines resources for agent nodes, which results in test flakiness. Quoting the issue:

      * `Tests pass` scenario:
        * port definitions are initially defined in /opt/mesosphere/etc/mesos-slave-public
        * the script honours MESOS_RESOURCES variable settings from previous files and does not override them, so the port definitions are left intact
        *, when run AFTER, also honours MESOS_RESOURCES settings and does not override them; the memory resources definition is stored in /var/lib/dcos/mesos-resources
        * port definitions are preserved!
      * `Tests fail` scenario:
        * port definitions are initially defined in /opt/mesosphere/etc/mesos-slave-public
        *, when run BEFORE, overrides settings from /opt/mesosphere/etc/mesos-slave-public and creates the /var/lib/dcos/mesos-slave-common file
        * /var/lib/dcos/mesos-slave-common has the highest priority when the script runs, and it does not contain the port definitions from /opt/mesosphere/etc/mesos-slave-public
        * just appends disk resources to the data from /var/lib/dcos/mesos-slave-common file and creates /var/lib/dcos/mesos-resources file which now has the highest priority
        * the resulting file does not contain proper port definitions
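The two orderings can be illustrated with a toy model. This is only a sketch of the precedence described above, assuming that the highest-priority file that exists wins. The helper names (`effective_resources`, `append_disk`) and all resource values are hypothetical and do not come from dcos-vagrant or DC/OS.

```python
# Files in descending priority, as described in the scenarios above.
PRIORITY = [
    "/var/lib/dcos/mesos-resources",
    "/var/lib/dcos/mesos-slave-common",
    "/opt/mesosphere/etc/mesos-slave-public",
]
PUBLIC = "/opt/mesosphere/etc/mesos-slave-public"
COMMON = "/var/lib/dcos/mesos-slave-common"


def effective_resources(files):
    """Return the resources from the highest-priority file that exists."""
    for path in PRIORITY:
        if path in files:
            return files[path]
    return {}


def append_disk(files):
    """Hypothetical stand-in for the step that appends disk resources."""
    current = dict(effective_resources(files))
    current["disk"] = 10240  # illustrative value
    files["/var/lib/dcos/mesos-resources"] = current


# "Tests pass" ordering: nothing has shadowed the public file yet.
passing = {PUBLIC: {"ports": "illustrative-port-ranges"}}
append_disk(passing)

# "Tests fail" ordering: mesos-slave-common was written first, without ports.
failing = {PUBLIC: {"ports": "illustrative-port-ranges"}}
failing[COMMON] = {"mem": 2048}  # illustrative value
append_disk(failing)

passing_has_ports = "ports" in effective_resources(passing)
failing_has_ports = "ports" in effective_resources(failing)
```

In this model the port definitions survive only when no higher-priority file without them exists at the time the disk resources are appended.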

      We need to make sure that no matter when the `` runs (before or after), it does not overwrite the port definitions from /opt/mesosphere/etc/mesos-slave-public. So far this was achieved by waiting for `dcos-diagnostics --check` to finish; unfortunately, this no longer works. While debugging DCOS_OSS-1467 it was revealed that `dcos-diagnostics` can give false positives (return with non-zero exit code), even though the installation process has not finished. This seems like a bug and will be dealt with in a separate issue.
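One possible way to make the result independent of ordering is to always start from the definitions in the public file and overlay the extra resources by name. The sketch below is an assumption, not the fix adopted for this issue: `build_resources` is a hypothetical helper, the values are illustrative, and the JSON layout follows the Mesos JSON resource format as commonly used in MESOS_RESOURCES. Merging by name alone ignores roles, which a real implementation might need to handle.

```python
import json


def build_resources(public_json, extra_json):
    """Merge two MESOS_RESOURCES-style JSON lists by resource name.

    Entries from public_json are kept unless extra_json defines a
    resource with the same name, so port definitions are preserved.
    """
    merged = {}
    for source in (public_json, extra_json):
        for resource in json.loads(source):
            merged[resource["name"]] = resource
    return json.dumps(list(merged.values()))


# Illustrative inputs only; real port ranges and memory sizes will differ.
public = json.dumps([
    {"name": "ports", "type": "RANGES",
     "ranges": {"range": [{"begin": 1, "end": 21}]}},
])
extra = json.dumps([
    {"name": "mem", "type": "SCALAR", "scalar": {"value": 2048}},
])

result = build_resources(public, extra)
names = [r["name"] for r in json.loads(result)]
```

Because the merge is keyed by name, applying it again to its own output with the same extra resources leaves the result unchanged, so the step can run safely more than once.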

      Please let me know if I managed to provide sufficient information to solve this issue. If not, do not hesitate to drop me a line or check DCOS_OSS-1467 for more details.






              • Assignee:
                karl Karl Isenberg (Inactive)
                prozlach Pawel Rozlach
                Jan-Philip Gehrcke, Karl Isenberg (Inactive), Pawel Rozlach
              • Watchers:
                3


                • Created: