  1. OpenDJ
  2. OPENDJ-6897

Backport OPENDJ-6778: Proxy server mishandles abandon requests

    Details

    • Type: Bug
    • Status: Done
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 6.5.0, 7.0.0
    • Fix Version/s: 6.5.3
    • Component/s: proxy
    • Story Points: 0.5

      Description

      When a proxy server receives an abandon request, it appears to leave one worker thread blocked waiting for a result. Abandon requests never have a result, so the worker thread blocks forever.

      If enough abandon requests are received, all of the proxy's worker threads become blocked and the work queue fills up, because no threads are left to service it.
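The failure mode described above can be sketched outside the server with a plain java.util.concurrent thread pool. This is only an illustration under stated assumptions, not the proxy's code: the fixed pool stands in for the proxy's work queue, a `LinkedBlockingQueue.take()` on a queue nobody writes to stands in for the wait on an abandon "result", and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the proxy's worker pool; not OpenDJ code.
final class WorkerStarvationSketch {

    /**
     * Submits `abandons` tasks that wait for a result nobody will ever send,
     * then `requests` ordinary tasks, and returns how many ordinary tasks
     * completed within `waitMillis`.
     */
    static int completedRequests(int workers, int abandons, int requests, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch parked = new CountDownLatch(Math.min(workers, abandons));
        CountDownLatch done = new CountDownLatch(requests);
        AtomicInteger completed = new AtomicInteger();
        try {
            for (int i = 0; i < abandons; i++) {
                // Nothing is ever put on this queue, like a result that never arrives.
                LinkedBlockingQueue<Object> results = new LinkedBlockingQueue<>();
                pool.execute(() -> {
                    parked.countDown();
                    try {
                        results.take(); // blocks until the pool is shut down
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            parked.await(); // every worker that can pick up an "abandon" is now inside its task
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    completed.incrementAndGet();
                    done.countDown();
                });
            }
            done.await(waitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow(); // interrupts the blocked workers so the JVM can exit
        }
        return completed.get();
    }
}
```

With two workers and two blocking tasks, `completedRequests(2, 2, 5, 200)` returns 0 because every worker is parked and the five ordinary tasks stay queued. With a third worker, `completedRequests(3, 2, 5, 2000)` should return 5, which mirrors the case where one worker thread is still alive.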

      Steps to reproduce

      Create 2 replicating DJ servers with 2000 test users under dc=example,dc=com. Add a proxy user, and update the ACI:

      dn: dc=example,dc=com
      changetype: modify
      add: aci
      aci: (targetattr="*")(version 3.0; acl "Allow apps proxied auth"; allow(all, proxy)(userdn = "ldap:///cn=*,ou=Apps,dc=example,dc=com");)
      aci: (targetcontrol="2.16.840.1.113730.3.4.3")(version 3.0; acl "allow psearch"; allow(read)(userdn="ldap:///uid=user.0,ou=people,dc=example,dc=com");)
      
      dn: ou=Apps,dc=example,dc=com
      changetype: add
      objectClass: organizationalunit
      objectClass: top
      ou: Apps
      
      dn: cn=Proxy,ou=Apps,dc=example,dc=com
      changetype: add
      objectClass: top
      objectClass: applicationProcess
      objectClass: simpleSecurityObject
      cn: Proxy
      userPassword: password
      ds-privilege-name: proxied-auth
      

      Create a proxy server using these two replicas. Update the proxy's global access control policy so that all authenticated users can use the psearch control:

      dsconfig set-global-access-control-policy-prop \
                --policy-name Authenticated\ access\ all\ entries \
                --add allowed-control:\* \
                --hostname localhost \
                --port 4444 \
                --bindDn cn=Directory\ Manager \
                --trustAll \
                --bindPassword password \
                --no-prompt
      

      Run the code in [^App.java] several times. It binds as user.0, starts a psearch, and then abandons the psearch. (It builds with the 6.5.0 SDK.)

      Get a jstack of the proxy server, and inspect the Worker Threads.

      Search the cn=work queue,cn=monitor entry of the proxy server using the admin connector:

      bin/ldapsearch -h localhost -p 4444 -Z -X -D "cn=directory manager" -w password -b "cn=work queue,cn=monitor" -s base "(&)"
      

      Expected results

      All worker threads will be TIMED_WAITING, that is, waiting for more work. The work queue will have ds-mon-requests-in-queue: 0
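For repeated test runs, the queue check can be scripted. The following is a minimal sketch that assumes the ldapsearch output has been captured as text; the class and method names are hypothetical, and only the ds-mon-requests-in-queue attribute named above is read.

```java
import java.util.OptionalInt;

// Hypothetical helper for reading the work queue monitor entry; not part of OpenDJ.
final class WorkQueueMonitorSketch {

    private static final String ATTRIBUTE = "ds-mon-requests-in-queue:";

    /** Returns the queued request count from LDIF-style text, or empty if the attribute is absent. */
    static OptionalInt requestsInQueue(String ldif) {
        for (String raw : ldif.split("\\R")) {
            String line = raw.trim();
            // LDAP attribute names are case-insensitive, so compare ignoring case.
            if (line.regionMatches(true, 0, ATTRIBUTE, 0, ATTRIBUTE.length())) {
                return OptionalInt.of(Integer.parseInt(line.substring(ATTRIBUTE.length()).trim()));
            }
        }
        return OptionalInt.empty();
    }
}
```

A healthy proxy would yield a value of 0 here; any other value after the test client has exited suggests requests are waiting for a worker.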

      Actual results

      Some (or all) worker threads will be WAITING, and stuck inside AbandonOperation.run()

      "Worker Thread 2" #45 prio=5 os_prio=31 tid=0x00007f8dec229800 nid=0x6707 waiting on condition [0x000070000fafe000]
         java.lang.Thread.State: WAITING (parking)
      	at sun.misc.Unsafe.park(Native Method)
      	- parking to wait for  <0x000000076d9751a8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
      	at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
      	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
      	at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
      	at io.reactivex.internal.operators.flowable.FlowableBlockingSubscribe.subscribe(FlowableBlockingSubscribe.java:61)
      	at io.reactivex.internal.operators.flowable.FlowableBlockingSubscribe.subscribe(FlowableBlockingSubscribe.java:109)
      	at io.reactivex.Flowable.blockingSubscribe(Flowable.java:5805)
      	at org.opends.server.core.AbandonOperation.run(AbandonOperation.java:132)
      	at org.opends.server.extensions.TraditionalWorkQueue$WorkerThread.run(TraditionalWorkQueue.java:462)
      

      The proxy server's work queue will have a non-zero number of requests.
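Inspecting a large thread dump by eye is error-prone, so the check can be automated. The sketch below assumes the jstack output is available as a string in the format shown in the trace above; the class and method names are hypothetical. It lists worker threads that are in state WAITING with AbandonOperation.run on their stack.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for scanning a jstack dump; not part of OpenDJ.
final class WorkerThreadDumpSketch {

    /** Returns the names of "Worker Thread" entries that are WAITING inside AbandonOperation.run. */
    static List<String> stuckInAbandon(String dump) {
        List<String> stuck = new ArrayList<>();
        String name = null;       // current worker thread, or null for any other thread
        boolean waiting = false;  // current thread is in state WAITING (not TIMED_WAITING)
        boolean reported = false; // current thread has already been added to the result
        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            if (line.startsWith("\"")) {
                // Thread header line: the thread name is the first quoted token.
                int end = line.indexOf('"', 1);
                String candidate = end > 0 ? line.substring(1, end) : "";
                name = candidate.startsWith("Worker Thread") ? candidate : null;
                waiting = false;
                reported = false;
            } else if (name != null && line.startsWith("java.lang.Thread.State:")) {
                waiting = line.startsWith("java.lang.Thread.State: WAITING");
            } else if (name != null && waiting && !reported
                    && line.startsWith("at org.opends.server.core.AbandonOperation.run(")) {
                stuck.add(name);
                reported = true;
            }
        }
        return stuck;
    }
}
```

Applied to the trace above, the method would return a list containing "Worker Thread 2". An empty list corresponds to the expected results.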

      A single test run sometimes leaves one proxy server worker thread alive while the others are blocked, but running the code a couple of times is usually enough to wedge the server completely. When the proxy is stopped, it logs the following for each blocked worker thread:

      [08/Nov/2019:13:47:26 +0000] category=CORE severity=ERROR msgID=140 msg=An uncaught exception during processing for thread Worker Thread 2 has caused it to terminate abnormally. The stack trace for that exception is: OnErrorNotImplementedException (Functions.java:704 Functions.java:701 LambdaSubscriber.java:79 FlowableBlockingSubscribe.java:73 FlowableBlockingSubscribe.java:109 Flowable.java:5805 AbandonOperation.java:132 TraditionalWorkQueue.java:462)
      [08/Nov/2019:13:47:26 +0000] category=CORE severity=NOTICE msgID=139 msg=The Directory Server has sent an alert notification generated by class org.opends.server.api.DirectoryThread (alert type org.opends.server.UncaughtException, alert ID org.opends.messages.core-140): An uncaught exception during processing for thread Worker Thread 2 has caused it to terminate abnormally. The stack trace for that exception is: OnErrorNotImplementedException (Functions.java:704 Functions.java:701 LambdaSubscriber.java:79 FlowableBlockingSubscribe.java:73 FlowableBlockingSubscribe.java:109 Flowable.java:5805 AbandonOperation.java:132 TraditionalWorkQueue.java:462)
      

      It would be interesting to see whether omitting the psearch and just abandoning an arbitrary message ID also causes the problem.

      Note that the fix for OPENDJ-6512 does not resolve this problem.

              People

              • Assignee: cjr Chris Ridd
              • Reporter: cjr Chris Ridd
              • Dev Assignee: Cedric Tran-Xuan
              • QA Assignee: carole forel