  1. OpenDJ
  2. OPENDJ-5648

Change number indexing does not work on RS-only

    Details

    • Type: Bug
    • Status: Done
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 6.5.0
    • Fix Version/s: 6.5.0
    • Component/s: replication
    • Labels:
      None
    • Flagged:
      Impediment
    • Epic Link:
    • Story Points:
      2

      Description

      UPDATE:
      With RC2 the results look better; see the new images. This time I ran the test for 3h, and at some point the changelog stopped growing. However, its size does not look flat from the first third of the test onward. I also noticed that at the end of the test the changelog size was 28G, which exceeds the 10G limit set in the test.

      The test also checks `lastchangenumber`, and it is 0, which is not expected. I also saw this `lastchangenumber` issue in the daily stress run with RC2, in the same test but with everything on the same machine.
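
      As a rough illustration of the `lastchangenumber` check, the sketch below parses LDIF text and flags a value of 0. It assumes the attribute appears in the output as `lastChangeNumber: <integer>`. The function names and the sample LDIF are hypothetical and are not taken from pyforge.

```python
import re
from typing import Optional

# LDAP attribute names are case-insensitive, so match regardless of case.
_LAST_CN = re.compile(r"^lastchangenumber:\s*(\d+)\s*$",
                      re.IGNORECASE | re.MULTILINE)


def parse_last_change_number(ldif_text: str) -> Optional[int]:
    """Return the lastchangenumber value found in LDIF text, or None if absent."""
    match = _LAST_CN.search(ldif_text)
    return int(match.group(1)) if match else None


def change_number_indexing_works(ldif_text: str) -> bool:
    """After a run that applies modifications, a value above 0 is expected."""
    value = parse_last_change_number(ldif_text)
    return value is not None and value > 0


# Hypothetical sample resembling what the test observed.
sample = "dn:\nfirstChangeNumber: 0\nlastChangeNumber: 0\n"
print(parse_last_change_number(sample))      # 0
print(change_number_indexing_works(sample))  # False
```

      A value of 0 after hours of modify traffic is what the report treats as the failure signal, so the check only returns True for a strictly positive number.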


      Using 6.5.0-RC1 in a long-running test, we have a problem with the changelog purge. In a 12h test we expect the purge to start after 4h, but the changelog size keeps increasing. After 12h the changelog is 150G.
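
      The expectation above can be phrased as a check on directory-size samples: once the purge delay has elapsed, the changelog size should stop growing. The following is a minimal sketch under stated assumptions. The function `changelog_plateaus`, the sample format, and the 5% tolerance are hypothetical and are not part of pyforge. Only the 4h delay and the 150G end state come from this report; the intermediate data points are made up.

```python
from typing import List, Tuple

PURGE_DELAY_S = 4 * 3600  # replication-purge-delay of 4h, i.e. 14400s


def changelog_plateaus(samples: List[Tuple[int, float]],
                       purge_delay_s: int = PURGE_DELAY_S,
                       tolerance: float = 0.05) -> bool:
    """samples: (elapsed seconds, size in GB) pairs ordered by time.

    Returns True if, once the purge delay has elapsed, the size never
    exceeds the first post-delay sample by more than `tolerance`
    (an assumed threshold, not a value from the test).
    """
    after = [size for t, size in samples if t >= purge_delay_s]
    if len(after) < 2:
        raise ValueError("need at least two samples after the purge delay")
    baseline = after[0]
    return max(after) <= baseline * (1 + tolerance)


# Illustrative only: linear growth reaching 150 GB after 12h.
growing = [(h * 3600, 12.5 * h) for h in range(0, 13)]
print(changelog_plateaus(growing))  # False

# What a working purge might look like: growth stops after 4h.
flat = [(h * 3600, min(12.5 * h, 50.0)) for h in range(0, 13)]
print(changelog_plateaus(flat))  # True
```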

      The test runs on 5 machines and consists of 2 DSs, 2 RSs, and 2 modrate clients.

      Both clients are on one machine, and each DJ instance is on a separate machine.

      The replication-purge-delay is set to 4h:

      ./RS1/opendj/bin/dsconfig -h morbier.internal.forgerock.com -p 4444 -D "cn=Directory Manager" -w password -X set-replication-server-prop --provider-name "Multimaster Synchronization" --set replication-purge-delay:14400s -n
      
      ./RS2/opendj/bin/dsconfig -h raclette.internal.forgerock.com -p 4444 -D "cn=Directory Manager" -w password -X set-replication-server-prop --provider-name "Multimaster Synchronization" --set replication-purge-delay:14400s -n

       

      The two modrate clients are started with the following commands:

      ./SDK1/opendj-ldap-toolkit/bin/modrate -h brie.internal.forgerock.com -p 1389 -D "cn=Directory Manager" -w password -M 8000 -d 43200 -b uid=user_{1},dc=europe,dc=com -S -F -g "rand(0,99999)" -c 5 -t 6 -i 10 -g "randstr(10,[0-9])" "employeeType:{2}"
      
      ./SDK2/opendj-ldap-toolkit/bin/modrate -h tomme.internal.forgerock.com -p 1389 -D "cn=Directory Manager" -w password -M 8000 -d 43200 -b uid=user_{1},dc=europe,dc=com -S -F -g "rand(0,99999)" -c 5 -t 6 -i 10 -g "randstr(10,[0-9])" "employeeType:{2}"

       

      We ran the same test on snapshot 6376bb171d8, but on 3 machines (DS1 and RS1 on one machine, DS2 and RS2 on another, and one machine for the clients), and did not see this issue.


       

      The test in pyforge:

      python3 run-pybot.py -v -c stress -s replication_split_DSRS DJ
      

        Attachments

        1. debug.txt (7.88 MB)
        2. RC2_RS1_dir_size.png (100 kB)
        3. RC2_RS2_dir_size.png (100 kB)
        4. RS1_dir_size.png (81 kB)
        5. RS2_dir_size.png (82 kB)


              People

              • Assignee: Jean-Noël Rouvignac (JnRouvignac)
              • Reporter: Ondrej Fuchsik (ondrej.fuchsik)
              • Dev Assignee: Jean-Noël Rouvignac
              • Votes: 0
              • Watchers: 1
