OPENDJ-3224: Infinite loop reading replication changelog if a CSN appears more than once

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Won't Fix
    • Affects Version/s: 3.0.0
    • Fix Version/s: 6.0.0
    • Component/s: replication

      Description

      If the same CSN appears twice in a replication changelog, the server can enter an infinite loop and never finish reading the changelog.

      A typical stack trace on 3.5.0 looks like this:

      "Replication server RS(14047) writing to Replication server RS(32308) for domain "cn=admin data" at opendj2.example.com/192.168.56.4:28989" prio=10 tid=0x00007fce6401a000 nid=0x5ac6 runnable [0x00007fcf5cecd000]
         java.lang.Thread.State: RUNNABLE
          at java.io.RandomAccessFile.readBytes0(Native Method)
          at java.io.RandomAccessFile.readBytes(RandomAccessFile.java:350)
          at java.io.RandomAccessFile.read(RandomAccessFile.java:385)
          at java.io.RandomAccessFile.readFully(RandomAccessFile.java:444)
          at java.io.RandomAccessFile.readFully(RandomAccessFile.java:424)
          at org.opends.server.replication.server.changelog.file.BlockLogReader.positionToRecordFromBlockStart(BlockLogReader.java:269)
          at org.opends.server.replication.server.changelog.file.BlockLogReader.readRecord(BlockLogReader.java:242)
          at org.opends.server.replication.server.changelog.file.BlockLogReader.searchClosestBlockStartToKey(BlockLogReader.java:399)
          at org.opends.server.replication.server.changelog.file.BlockLogReader.seekToRecord(BlockLogReader.java:158)
          at org.opends.server.replication.server.changelog.file.LogFile$LogFileCursor.positionTo(LogFile.java:634)
          at org.opends.server.replication.server.changelog.file.Log$InternalLogCursor.positionTo(Log.java:1286)
          at org.opends.server.replication.server.changelog.file.Log$AbortableLogCursor.positionTo(Log.java:1537)
          at org.opends.server.replication.server.changelog.file.FileReplicaDBCursor.nextWhenCursorIsExhaustedOrNotCorrectlyPositionned(FileReplicaDBCursor.java:117)
          at org.opends.server.replication.server.changelog.file.FileReplicaDBCursor.next(FileReplicaDBCursor.java:111)
          at org.opends.server.replication.server.changelog.file.ReplicaCursor.next(ReplicaCursor.java:111)
          at org.opends.server.replication.server.changelog.file.CompositeDBCursor.addCursor(CompositeDBCursor.java:170)
          at org.opends.server.replication.server.changelog.file.CompositeDBCursor.next(CompositeDBCursor.java:110)
          at org.opends.server.replication.server.changelog.file.DomainDBCursor.next(DomainDBCursor.java:32)
          at org.opends.server.replication.server.MessageHandler.fillLateQueue(MessageHandler.java:363)
          at org.opends.server.replication.server.MessageHandler.getNextMessage(MessageHandler.java:261)
          at org.opends.server.replication.server.ServerHandler.take(ServerHandler.java:927)
          at org.opends.server.replication.server.ServerWriter.run(ServerWriter.java:94)
      

      I believe the main underlying issue is that it should not be possible for the same CSN to appear more than once in a single changelog file. I do not currently have a way to reproduce the duplication itself; however, it has been observed in at least one production environment.

      The infinite loop can be reproduced by hex-editing a changelog file by hand so that one record's CSN matches the CSN of the previous record, then starting the server.
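
      The invariant described above (no CSN repeated within one changelog file) can be illustrated with a minimal sketch. It assumes the CSNs have already been extracted from the file into a list of comparable keys whose ordering matches CSN order; parsing the changelog format is not shown. The class and method names are hypothetical and are not part of OpenDJ.

```java
import java.util.List;

// Hypothetical helper, not part of OpenDJ: checks that a sequence of keys
// (standing in for the CSNs of one changelog file) is strictly increasing.
final class DuplicateCsnCheck {

    /**
     * Returns the index of the first key that is not strictly greater than
     * its predecessor, or -1 if the whole sequence is strictly increasing.
     */
    static <T extends Comparable<? super T>> int firstNonIncreasing(List<T> keys) {
        for (int i = 1; i < keys.size(); i++) {
            // A result of 0 means a duplicate; a negative result means out of order.
            if (keys.get(i).compareTo(keys.get(i - 1)) <= 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Placeholder values, not real CSNs: the third entry repeats the second.
        List<String> keys = List.of("0001", "0002", "0002", "0003");
        System.out.println(firstNonIncreasing(keys)); // prints 2
    }
}
```

      A check like this only detects the condition; it says nothing about how the duplicate was written in the first place.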

              People

              • Assignee: Unassigned
              • Reporter: Ian Packer (ian.packer, inactive)