  1. OpenDJ
  2. OPENDJ-1555

Unresponsive persistent searches on cn=changelog can block the whole server


    • Type: Bug
    • Status: Dev backlog
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.8.0
    • Fix Version/s: None
    • Component/s: replication
    • Labels:


      Matthew Swift:

      It is possible for the RS reader thread to become blocked when psearch clients fail to keep up (e.g. suspending, with CTRL-Z, an ldapsearch persistent search against cn=changelog). If an RS reader thread becomes blocked, the entire topology is impacted.

      As long as we decouple write IO (psearches) from read IO (ServerReader), I don't mind what the solution is. I'd be happy if we just used a single thread, similar to the ChangeNumberIndexer, which polls for new changes and notifies cookie-based psearches. In other words, incoming updates are pushed to the replica DBs and:

      • the ChangeNumberIndexer is notified and possibly indexes the change, in which case it also notifies non-cookie-based psearches
      • the "cookie psearch notification" thread is notified and notifies all cookie-based psearches.
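
The decoupling described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CookiePsearchNotifier`, `PsearchSubscriber`, `changePublished` and `dispatch` are hypothetical names, not OpenDJ classes or methods, and changes are modelled as plain strings. The reader thread only enqueues the change and returns; a single notifier thread fans it out to bounded per-psearch backlogs, so a stalled client can never block read IO.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of a "cookie psearch notification" thread; not OpenDJ code. */
public class CookiePsearchNotifier implements Runnable {

    /** One cookie-based psearch with a bounded backlog of pending results. */
    public static final class PsearchSubscriber {
        private final BlockingQueue<String> backlog;
        private volatile boolean closed;

        public PsearchSubscriber(int capacity) {
            this.backlog = new ArrayBlockingQueue<>(capacity);
        }

        /** Never blocks: a full backlog marks the psearch closed and returns false. */
        public boolean offer(String change) {
            if (closed) {
                return false;
            }
            if (!backlog.offer(change)) {
                closed = true;
                return false;
            }
            return true;
        }

        public boolean isClosed() { return closed; }
        public int backlogSize() { return backlog.size(); }
        /** Called by whatever thread writes results to the client socket. */
        public String poll() { return backlog.poll(); }
    }

    // Unbounded hand-off queue: add() never blocks the reader thread.
    private final BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
    private final List<PsearchSubscriber> subscribers = new CopyOnWriteArrayList<>();

    public void register(PsearchSubscriber subscriber) {
        subscribers.add(subscriber);
    }

    /** Called from the read side once the change has been pushed to the replica DB. */
    public void changePublished(String change) {
        incoming.add(change);
    }

    /** Fans one change out to every psearch, dropping those that cannot keep up. */
    public void dispatch(String change) {
        for (PsearchSubscriber s : subscribers) {
            if (!s.offer(change)) {
                subscribers.remove(s); // safe: the iterator works on a snapshot
            }
        }
    }

    @Override
    public void run() {
        try {
            while (true) {
                dispatch(incoming.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the flag and exit
        }
    }
}
```

In this sketch the actual socket writes would happen on a separate writer that drains each subscriber's backlog with `poll()`; closing on a full backlog is one possible policy, chosen here only to keep the example short.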

      If a psearch client is too slow to consume the results being sent to it, the server should forcibly close it. To detect this, we should combine a socket timeout with a check on the trend of the backlog size.
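
One way to combine the two signals mentioned above is sketched here. Everything in it is an assumption made for illustration: `BacklogTrendMonitor`, `recordWrite`, `shouldClose` and the two thresholds are hypothetical, the "socket timeout" is modelled as the time since the last completed write rather than a real socket option, and the trend check is a simple count of consecutive samples in which the backlog grew.

```java
/** Hypothetical sketch of slow-psearch detection; not OpenDJ code. */
public class BacklogTrendMonitor {
    private final int maxGrowthStreak;      // consecutive growing samples tolerated
    private final long writeTimeoutMillis;  // max time without a completed write
    private int lastBacklog;
    private int growthStreak;
    private long lastWriteMillis;

    public BacklogTrendMonitor(int maxGrowthStreak, long writeTimeoutMillis, long nowMillis) {
        this.maxGrowthStreak = maxGrowthStreak;
        this.writeTimeoutMillis = writeTimeoutMillis;
        this.lastWriteMillis = nowMillis;
    }

    /** Call whenever a result has been fully written to the client socket. */
    public void recordWrite(long nowMillis) {
        lastWriteMillis = nowMillis;
    }

    /** Records one backlog sample and reports whether the psearch should be closed. */
    public boolean shouldClose(int backlogSize, long nowMillis) {
        if (backlogSize > lastBacklog) {
            growthStreak++;          // backlog is still growing
        } else if (backlogSize < lastBacklog) {
            growthStreak = 0;        // client is catching up
        }                            // unchanged backlog leaves the streak as is
        lastBacklog = backlogSize;

        boolean writeStalled =
            backlogSize > 0 && nowMillis - lastWriteMillis >= writeTimeoutMillis;
        boolean backlogKeepsGrowing = growthStreak >= maxGrowthStreak;
        return writeStalled || backlogKeepsGrowing;
    }
}
```

Passing the clock value in as a parameter keeps the sketch deterministic; a periodic task could call `shouldClose(backlogSize, System.currentTimeMillis())` for each open psearch and close the connection when it returns true.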


              • Assignee:
                JnRouvignac Jean-Noël Rouvignac


                • Created: