OpenDJ / OPENDJ-1611

File-based changelog: worker threads blocked in pendingChanges

    Details

    • Type: Bug
    • Status: Done
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.8.0
    • Fix Version/s: 3.0.0, 2.8.0
    • Component/s: replication
    • Labels:
      None

      Description

      Found using trunk (2.7.0 revision 11150).

      The topology is 4 servers (2 DS, m1 and m4, and 2 RS, m2 and m3), in which each RS uses the file-based changelog.

      In this scenario we run 2 clients doing modify operations, one against each DS (client1 => m1 and client2 => m4).
      At the beginning, the throughput for each client is around 8000 mod/s.
      After a few minutes, the throughput of one client collapses:

      ...
      Mod rate = 8448.4/s
      Mod rate = 8531.0/s
      Mod rate = 4600.6/s
      Mod rate = 0.0/s
      Mod rate = 9.6/s
      Mod rate = 23.0/s
      Mod rate = 0.0/s
      Mod rate = 60.4/s
      Mod rate = 23.8/s
      ...
      

      A jstack dump of the corresponding DS shows that some worker threads are BLOCKED.

      "Worker Thread 48" #185 prio=5 os_prio=0 tid=0x00007fb4f9780000 nid=0x8537 waiting for monitor entry [0x00007fb3a94d3000]
         java.lang.Thread.State: BLOCKED (on object monitor)
      	at org.opends.server.replication.plugin.PendingChanges.putLocalOperation(PendingChanges.java:132)
      	- waiting to lock <0x00000000e1f377f8> (a org.opends.server.replication.plugin.PendingChanges)
      	at org.opends.server.replication.plugin.LDAPReplicationDomain.generateCSN(LDAPReplicationDomain.java:2546)
      	at org.opends.server.replication.plugin.LDAPReplicationDomain.handleConflictResolution(LDAPReplicationDomain.java:1956)
      	at org.opends.server.replication.plugin.MultimasterReplication.handleConflictResolution(MultimasterReplication.java:436)
      	at org.opends.server.workflowelement.localbackend.LocalBackendModifyOperation.handleConflictResolution(LocalBackendModifyOperation.java:1916)
      	at org.opends.server.workflowelement.localbackend.LocalBackendModifyOperation.processModify(LocalBackendModifyOperation.java:432)
      	at org.opends.server.workflowelement.localbackend.LocalBackendModifyOperation.processLocalModify(LocalBackendModifyOperation.java:269)
      	at org.opends.server.workflowelement.localbackend.LocalBackendWorkflowElement.execute(LocalBackendWorkflowElement.java:671)
      	at org.opends.server.core.WorkflowImpl.execute(WorkflowImpl.java:196)
      	at org.opends.server.core.WorkflowTopologyNode.execute(WorkflowTopologyNode.java:99)
      	at org.opends.server.core.ModifyOperationBasis.run(ModifyOperationBasis.java:393)
      	at org.opends.server.extensions.TraditionalWorkerThread.run(TraditionalWorkerThread.java:164)
      

      => see the attached jstack output
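To see how many threads are queued on the same monitor, the dump can be summarized by lock. The following is a minimal sketch, not part of OpenDJ: the class name `BlockedThreadSummary` and its method are hypothetical, and it only assumes the standard jstack line format shown in the trace above (`- waiting to lock <address> (a class)`).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts threads waiting on each monitor in a jstack dump.
public class BlockedThreadSummary {

    // Matches lines such as:
    //   - waiting to lock <0x00000000e1f377f8> (a org.opends...PendingChanges)
    private static final Pattern WAITING =
        Pattern.compile("- waiting to lock <(0x[0-9a-f]+)> \\(a ([\\w.$]+)\\)");

    // Returns "class@address" -> number of threads waiting to lock that object.
    public static Map<String, Integer> countWaiters(String dump) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = WAITING.matcher(dump);
        while (m.find()) {
            counts.merge(m.group(2) + "@" + m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java BlockedThreadSummary <jstack-file>");
            return;
        }
        String dump = new String(Files.readAllBytes(Paths.get(args[0])));
        countWaiters(dump).forEach(
            (lock, n) -> System.out.println(n + " thread(s) waiting on " + lock));
    }
}
```

A large count for the `PendingChanges` monitor would suggest that most worker threads are serialized behind whichever thread currently holds that lock.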

      Moreover, the errors log of each RS contains the following message:

      [28/Oct/2014:11:24:08 +0100] category=SYNC severity=NOTICE msgID=-1 msg=Rejecting append to log '/local/testuser/Installs/m3/changelogDb/2.domain/13866.server' for record: [Record [0000014956461afd362a00026827:ModifyMsg content:  protocolVersion: 8 dn: uid=user_4920,dc=europe,dc=com csn: 0000014956461afd362a00026827 uniqueId: d403fe01-9bc7-3602-8fbf-81fdaf6657fe assuredFlag: false assuredMode: SAFE_DATA_MODE safeDataLevel: 1 size: 179]], last key appended: [000001495648243c362a000d4353]
      

      => this message is logged more than 10000 times in 10 minutes
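The two keys in the message above can be compared directly. The sketch below is illustrative only and is not OpenDJ code: the class `CsnOrderCheck` is hypothetical, and it assumes the printed CSN is 16 hex digits of timestamp in milliseconds, 4 hex digits of server id and 8 hex digits of sequence number. The `13866.server` file name in the log path is consistent with that layout. It also assumes, as the message wording suggests, that the log only accepts keys strictly greater than the last key appended.

```java
// Hypothetical helper: decodes the CSN strings printed in the changelog message
// and checks whether a new key respects increasing key order.
public class CsnOrderCheck {

    // Assumed layout: 16 hex digits of timestamp in milliseconds.
    public static long timestampMillis(String csn) {
        return Long.parseLong(csn.substring(0, 16), 16);
    }

    // Assumed layout: next 4 hex digits are the server id.
    public static int serverId(String csn) {
        return Integer.parseInt(csn.substring(16, 20), 16);
    }

    // Assumed layout: last 8 hex digits are the sequence number.
    public static long seqnum(String csn) {
        return Long.parseLong(csn.substring(20, 28), 16);
    }

    // Fixed-width lowercase hex strings sort lexicographically in numeric order.
    // Assumption: an append is accepted only if the key is strictly greater.
    public static boolean keepsKeyOrder(String lastKeyAppended, String newKey) {
        return newKey.compareTo(lastKeyAppended) > 0;
    }

    public static void main(String[] args) {
        String rejected = "0000014956461afd362a00026827";
        String lastKey  = "000001495648243c362a000d4353";
        System.out.println("server id:        " + serverId(rejected));
        System.out.println("keeps key order:  " + keepsKeyOrder(lastKey, rejected));
        System.out.println("age vs last (ms): "
            + (timestampMillis(lastKey) - timestampMillis(rejected)));
    }
}
```

Under these assumptions, both keys decode to server id 13866, and the rejected record is about 133 seconds older than the last key appended, so it breaks the increasing key order of the log file.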

              People

              • Assignee:
                nicolas.capponi@forgerock.com Nicolas Capponi
                Reporter:
                csovant Christophe Sovant
                Dev Assignee:
                Nicolas Capponi