OPENDJ-6561: UndeliverableException + NPE while using dsconfig on RS

    Details

    • Type: Bug
    • Status: Done
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.0.0
    • Fix Version/s: 7.0.0
    • Component/s: config, replication
    • Labels:

      Description

      Found with rev 4f2c5b9c0e0841a55b45adb09bd6c3e080621e31

      We have a stress test that installs a split DS/RS topology.
      It then runs modrate on both DSs.
      At the end, once the changelogs are stable on both sides, we try to modify the purge delay on each RS. This works for the first RS but fails for the second:

      /external/testuser/jenkins/workspace/OpenDJ-7.0.x/replication/replication_split_dsrs/results/20190821-130148/replication_split_DSRS/Modify/RS1/opendj/bin/dsconfig -h comte.internal.forgerock.com -p 4445 -D "uid=admin" -w password -X set-replication-server-prop --provider-name "Multimaster Synchronization" --set "replication-purge-delay:120s" -n
      
      /external/testuser/jenkins/workspace/OpenDJ-7.0.x/replication/replication_split_dsrs/results/20190821-130148/replication_split_DSRS/Modify/RS2/opendj/bin/dsconfig -h comte.internal.forgerock.com -p 4447 -D "uid=admin" -w password -X set-replication-server-prop --provider-name "Multimaster Synchronization" --set "replication-purge-delay:120s" -n 	
      WARN 	ERROR:
      -- rc --
      returned 80, expected to be in [0]
      
      
      The Replication Server could not be modified due to a communications problem:
      Undefined
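
The step that fails can be sketched as a small Python helper, for anyone who wants to replay it outside the test suite. This is only an illustration: the function names `purge_delay_command` and `set_purge_delay` are hypothetical and not part of the test framework, while the dsconfig arguments are copied from the commands in this report.

```python
import subprocess


def purge_delay_command(dsconfig, host, admin_port, delay="120s"):
    """Build the dsconfig argument list that sets the RS purge delay.

    The flags mirror the commands quoted in this report; the helper
    itself is a hypothetical name used for illustration only.
    """
    return [
        dsconfig,
        "-h", host,
        "-p", str(admin_port),
        "-D", "uid=admin",
        "-w", "password",
        "-X",
        "set-replication-server-prop",
        "--provider-name", "Multimaster Synchronization",
        "--set", f"replication-purge-delay:{delay}",
        "-n",
    ]


def set_purge_delay(dsconfig, host, admin_port, delay="120s",
                    runner=subprocess.run):
    """Run dsconfig and return (return_code, combined_output).

    `runner` is injectable so the call can be exercised without a server.
    """
    result = runner(
        purge_delay_command(dsconfig, host, admin_port, delay),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout + result.stderr
```

Called once per RS admin port (4445, then 4447 in this run), the second call is the one expected to return 80. That value matches the LDAP result code `other`, which is consistent with the "Other:" prefix in the RS log.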
      
      

      The log of the second RS shows:

      (ERROR) [21/Aug/2019:14:15:15 +0200] category=org.opends.messages.external severity=WARNING msgID=1 msg=GRIZZLY0011: Uncaught exception on thread Thread[Administration Connector 0.0.0.0:4447(1) SelectorRunner,5,main] exception=UndeliverableException: Other: The Directory Server encountered an unexpected error while attempting to add the client request to the work queue: NullPointerException(Topology.java:222) (RxJavaPlugins.java:366 FlowableCreate.java:271 LdapClientConnection.java:747 LdapClientConnection.java:709 LdapClientConnection.java:565 ModifyRequestImpl.java:54 LdapClientConnection.java:565 LdapClientConnection.java:545 LdapClientConnection.java:508 FlowableCreate.java:72 Flowable.java:13234 FlowableDoOnLifecycle.java:38 Flowable.java:13234 Flowable.java:13183 FlowableLift.java:49 Flowable.java:13234 FlowableOnErrorNext.java:39 Flowable.java:13234 Flowable.java:13180 FlowableLift.java:49 Flowable.java:13234 ...)
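
To spot this failure quickly in long RS logs, the wrapping exception and the underlying cause can be pulled out of such a line. A minimal sketch follows. The regular expressions are assumptions derived from the single log line above, not from any documented log format, and `summarize` is a hypothetical helper name.

```python
import re

# Both patterns are assumptions based on the one log line in this report.
WRAPPER_RE = re.compile(r"exception=(?P<wrapper>\w+):")
CAUSE_RE = re.compile(
    r"(?P<cause>\w+(?:Exception|Error))\((?P<file>\w+\.java):(?P<line>\d+)\)"
)


def summarize(log_line):
    """Return the wrapping exception, the root cause and its source location."""
    wrapper = WRAPPER_RE.search(log_line)
    cause = CAUSE_RE.search(log_line)
    return {
        "wrapper": wrapper.group("wrapper") if wrapper else None,
        "cause": cause.group("cause") if cause else None,
        "location": (
            f"{cause.group('file')}:{cause.group('line')}" if cause else None
        ),
    }
```

Applied to the line above, this separates the `UndeliverableException` reported by the Grizzly thread from the `NullPointerException` raised at `Topology.java:222`.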
      

      This happens regularly with this test:

      ./run-pybot.py -v -c stress -s replication_split_DSRS OpenDJ
      

      on a lab machine, with the following configuration:

      def cfg = config.stressConfig() + [
          OPENDJ_VERSION: "7.0.0-SNAPSHOT",
          STRESS_DURATION: "3600",
          STRESS_NUM_USERS: "100000",
          //SLACK_CHANNEL: "#engineering-ds-notifs",
      ]
      cfg["TIMEOUT"] = (cfg["STRESS_DURATION"] as Integer) * 3
      cfg["TIMEOUT_UNIT"] = "SECONDS"
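
For reference, the derived timeout can be checked with a short Python mirror of the Groovy snippet above. This is only a sketch: `with_timeout` is a hypothetical name, and the empty dict stands in for whatever `config.stressConfig()` returns, which this report does not show.

```python
def with_timeout(base_config, overrides, factor=3):
    """Merge overrides into the base config and derive the job timeout."""
    cfg = {**base_config, **overrides}
    # STRESS_DURATION is stored as a string, so convert before multiplying.
    cfg["TIMEOUT"] = int(cfg["STRESS_DURATION"]) * factor
    cfg["TIMEOUT_UNIT"] = "SECONDS"
    return cfg


cfg = with_timeout(
    {},  # placeholder for config.stressConfig(), contents unknown here
    {
        "OPENDJ_VERSION": "7.0.0-SNAPSHOT",
        "STRESS_DURATION": "3600",
        "STRESS_NUM_USERS": "100000",
    },
)
print(cfg["TIMEOUT"], cfg["TIMEOUT_UNIT"])  # 10800 SECONDS
```

With a one-hour stress duration, the job is therefore allowed three hours in total.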
      

      See http://jenkins-fr.internal.forgerock.com:8080/job/OpenDJ-7.0.x/job/replication/job/replication_split_dsrs/

            People

            • Assignee: Fabio Pistolesi (fabiop)
            • Reporter: carole forel (cforel)
            • Dev Assignee: Fabio Pistolesi
            • QA Assignee: carole forel