  1. OpenDJ
  2. OPENDJ-4185

Changelog not populated with new changes if an RS+DS goes down and replication fails to catch up when it's restarted

    Details


      Description

      The changelogDb is not populated with new changes if the only other RS+DS is lost through a hardware outage or kill -9. When the lost RS+DS is subsequently restarted, replication fails to fully catch up.

      Note: when the fix for OPENDJ-4041 is applied, replication does catch up.

      Testcase:

      1. Set up two DS+RS servers in multi-master replication (MMR).
      2. Add 100 entries and check firstChangeNumber and lastChangeNumber.
      3. Kill Master 2 with kill -9.
      4. Add 100 entries to Master 1 and check firstChangeNumber and lastChangeNumber again.

      At this point the entries added in step 4 are not written to the changelogDb. This is a problem for applications that rely on the changelog, such as IDM.
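
The check in steps 2 and 4 can be scripted. The following is a minimal sketch, not part of the original report: the helper names `change_number` and `changelog_advanced` are hypothetical, and the parsing assumes the root DSE search prints one `attribute: value` pair per line, as in the transcripts in this ticket.

```shell
#!/bin/sh
# change_number ATTR
# Reads ldapsearch output on stdin and prints the value of ATTR
# (for example lastChangeNumber). Leading whitespace is ignored.
change_number() {
  awk -v attr="$1" '
    { sub(/^[ \t]+/, "") }                 # tolerate indented transcripts
    index($0, attr ":") == 1 {             # line starts with "ATTR:"
      sub(/^[^:]*:[ \t]*/, "")             # drop the attribute name
      print
      exit
    }
  '
}

# changelog_advanced BEFORE AFTER EXPECTED
# Succeeds only if lastChangeNumber grew by exactly EXPECTED changes.
changelog_advanced() {
  [ $(( $2 - $1 )) -eq "$3" ]
}
```

Piping the ldapsearch command shown in this ticket into `change_number lastChangeNumber` before and after step 4 gives the two values to compare. With the values reported here (100 before, 100 after, 100 entries added), `changelog_advanced 100 100 100` fails, which is the symptom described.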

      Data: 

      After step 2, the initial 100 entries appear in both servers' changelogDb, and changelogdbdump.jar shows 100 keys from the changenumberindex.

       

      opendj; bin/$ date; ./ldapsearch --port 1389 --bindDN "cn=Directory Manager" --bindPassword password --baseDN "" --searchScope base '(objectClass=*)' firstChangeNumber lastChangeNumber
      Thu Jul 27 22:41:50 MDT 2017
      dn:
      firstChangeNumber: 1
      lastChangeNumber: 100
      
      opendj; changelogDb/$ ./dumpchangelog
      [170727-224153] Final CL Key - key=100 | value=changeNumber=100 csn=0000015d877f9ec466d300000064 baseDN=dc=example,dc=com
      
      Thu Jul 27 22:41:17 MDT 2017
      Suffix DN         : Server                    : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
      ------------------:---------------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
      dc=example,dc=com : opendj.forgerock.com:4444 : 1100    : true                : 26323 : 2856  : 8989        : 0        :              : true
      dc=example,dc=com : opendj.forgerock.com:5444 : 1100    : true                : 29152 : 11145 : 9989        : 0        :              : true
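
The csn value in the dump output can be decoded to see which server originated a change. The sketch below assumes the 28-hex-digit layout that the data in this ticket is consistent with: 16 digits of timestamp in milliseconds, 4 digits of server ID, and 8 digits of sequence number. The function names are hypothetical, and the layout should be treated as an assumption rather than a documented contract.

```shell
#!/bin/sh
# Each helper takes a 28-hex-digit CSN string as its only argument.

# Timestamp in milliseconds since the epoch (hex digits 1-16).
csn_millis() {
  echo "$(( 0x$(printf '%s' "$1" | cut -c1-16) ))"
}

# ID of the server that originated the change (hex digits 17-20).
csn_server_id() {
  echo "$(( 0x$(printf '%s' "$1" | cut -c17-20) ))"
}

# Per-server sequence number of the change (hex digits 21-28).
csn_seqnum() {
  echo "$(( 0x$(printf '%s' "$1" | cut -c21-28) ))"
}
```

For the CSN above, `csn_server_id 0000015d877f9ec466d300000064` prints 26323, which matches the DS ID of the server on port 4444 in the status table, and `csn_seqnum` prints 100.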
      

      After killing Master 2 and adding 100 more entries to Master 1 in step 4, the changelogDb has no new entries, and changelogdbdump.jar still shows 100 keys from the changenumberindex.

       

       

      opendj; bin/$ date; ./ldapsearch --port 1389 --bindDN "cn=Directory Manager" --bindPassword password --baseDN "" --searchScope base '(objectClass=*)' firstChangeNumber lastChangeNumber
      Thu Jul 27 22:48:11 MDT 2017
      dn:
      firstChangeNumber: 1
      lastChangeNumber: 100
      
      opendj; changelogDb/$ ./dumpchangelog
      [170727-224821] Final CL Key - key=100 | value=changeNumber=100 csn=0000015d877f9ec466d300000064 baseDN=dc=example,dc=com
      

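      If you restart Master 2, the changelogDb on both masters is populated, but replication does not catch up, just as in OPENDJ-4041. In the status output below, the server on port 4444 reports 1200 entries while the server on port 5444 reports only 1147.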

      opendj; bin/$ date; ./ldapsearch --port 1389 --bindDN "cn=Directory Manager" --bindPassword password --baseDN "" --searchScope base '(objectClass=*)' firstChangeNumber lastChangeNumber
      Thu Jul 27 22:51:33 MDT 2017
      dn:
      firstChangeNumber: 1
      lastChangeNumber: 200
      
      opendj; changelogDb/$ ./dumpchangelog
      [170727-225157] Final CL Key - key=200 | value=changeNumber=200 csn=0000015d87845fff66d3000000c8 baseDN=dc=example,dc=com
      opendj; bin/$ date; ./dsreplication status --adminUID admin --adminPassword password --hostname opendj.forgerock.com --port 4444 --trustAll
      Thu Jul 27 22:51:13 MDT 2017
      Suffix DN         : Server                    : Entries : Replication enabled : DS ID : RS ID : RS Port (1) : M.C. (2) : A.O.M.C. (3) : Security (4)
      ------------------:---------------------------:---------:---------------------:-------:-------:-------------:----------:--------------:-------------
      dc=example,dc=com : opendj.forgerock.com:4444 : 1200    : true                : 26323 : 2856  : 8989        : 0        :              : true
      dc=example,dc=com : opendj.forgerock.com:5444 : 1147    : true                : 29152 : 11145 : 9989        : 0        :              : true
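
The entry-count gap in the status table can be extracted automatically when polling for catch-up. This is a rough sketch with hypothetical function names; it assumes the column layout shown in this ticket (columns separated by a colon followed by at least one space, Entries in the third column) and is not an official parser for dsreplication output.

```shell
#!/bin/sh
# entry_counts
# Reads a `dsreplication status` table on stdin and prints
# "<server> <entries>" for every data row.
entry_counts() {
  # Split on "optional spaces, colon, one or more spaces" so that the
  # host:port value in the Server column is left in one piece.
  awk -F' *: +' '
    $3 ~ /^[0-9]+$/ { print $2, $3 }       # data rows have numeric Entries
  '
}

# entry_gap
# Prints the difference between the largest and smallest entry count.
# Prints nothing if the input contains no data rows.
entry_gap() {
  entry_counts | awk '
    NR == 1 { min = max = $2 }
    $2 + 0 < min + 0 { min = $2 }
    $2 + 0 > max + 0 { max = $2 }
    END { if (NR > 0) print max - min }
  '
}
```

Feeding the table above to `entry_gap` prints 53, the number of entries the restarted server is still missing. A gap that stays above zero across repeated polls indicates that replication has stalled rather than merely lagged.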
      

       

       


              People

              • Assignee:
                lee.trujillo Lee Trujillo
                Reporter:
                lee.trujillo Lee Trujillo