OpenDJ / OPENDJ-5503

Change number "not found in pending list"

    Details

    • Type: Bug
    • Status: Dev backlog
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 6.0.0
    • Fix Version/s: None
    • Component/s: replication
    • Labels:
    • Support Ticket IDs:

      Description

      Reported by a customer on 6.0.0:

      • Replication topology with two DS-only instances and two DS+RS instances.
      • At restart, the error logs of the DS+RS instances contain thousands of errors like the following (the DS-only instances log the same error, but only a few times):
      [20/Sep/2018:12:13:35 +0800] category=SYNC severity=ERROR msgID=9 msg=Internal Error : Operation ModifyOperation(connID=-2, opID=70, dn=uid=user1,ou=test,o=test1.com) change number 00000165c9ce8aa9005100114682 was not found in pending list
      • The errors cover many different CSNs and many different DNs.
      • Many of these CSNs, possibly all of them, have already been purged from the changelogs.
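To get a quick overview of how many distinct CSNs, DNs, and originating replicas are involved, the error log can be summarized with a small script. The following is a minimal sketch, not part of DS: the function name `summarize_pending_errors` is hypothetical, the regular expression is inferred only from the sample log lines in this report, and the position of the server ID inside the CSN (hex digits 17 to 20) is an assumption about the CSN string layout.

```python
import re
from collections import Counter

# Pattern inferred from the sample log lines in this report (an assumption,
# not an official grammar for the DS error log).
PENDING_RE = re.compile(
    r"dn=(?P<dn>.+?)\) change number (?P<csn>[0-9a-fA-F]{28}) "
    r"was not found in pending list"
)


def summarize_pending_errors(lines):
    """Return (errors per CSN, errors per originating server ID, set of DNs)."""
    per_csn = Counter()
    per_server = Counter()
    dns = set()
    for line in lines:
        match = PENDING_RE.search(line)
        if match is None:
            continue  # not a "not found in pending list" error
        csn = match.group("csn").lower()
        per_csn[csn] += 1
        # Assumed CSN layout: hex digits at index 16..19 hold the server ID.
        per_server[int(csn[16:20], 16)] += 1
        dns.add(match.group("dn"))
    return per_csn, per_server, dns
```

Feeding it the lines of an instance's error log (for example via `open(path)`, where `path` is wherever that instance writes its errors file) shows whether the failures cluster on one originating replica or are spread across the topology.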

       

      I have reproduced this only once so far; there is no consistently reproducible test case yet:

      • Once the error has occurred, it recurs at every subsequent restart, for the same CSN.

      My steps were:

      • DS 6.0.0 with three instances: DS1 (DS-only), DS+RS2, and DS+RS3.
      • Set up "dc=example,dc=com" and a separate backend for "ou=sub,dc=example,dc=com".
      • Set a global server-id for each instance (81, 82, 83).
      • Configured replication for the suffix and the sub-suffix.
      • Added entries on all instances.
      • Set replication-purge-delay to several hours on DS+RS2 and DS+RS3.
      • Added more entries.
      • Lowered the purge delay again on DS+RS2 only (this time to 1 minute).
      • Old changes were purged.
      • Restarted DS+RS2; the "not found in pending list" error was then reported on DS+RS3.
      • After that, the error is reported on DS+RS3 at each restart of DS+RS3:
      [25/Sep/2018:14:07:39 +0800] category=SYNC severity=ERROR msgID=9 msg=Internal Error : Operation ModifyOperation(connID=-2, opID=300182, dn=uid=user.1,ou=sub,dc=example,dc=com) change number 000001660ea25907005200025988 was not found in pending list
      • In my case, the CSN in the "not found in pending list" error originates from DS+RS2 and is the "newest-csn" of its replica DB:
      dn: ds-mon-server-id=82,cn=replica dbs,ds-mon-domain-name=ou=sub\,dc=example\,dc=com,cn=changelog,cn=replication,cn=monitor
      objectClass: top
      objectClass: ds-monitor
      objectClass: ds-monitor-replica-db
      ds-mon-oldest-csn: 000001660a82bc70005200000692
      ds-mon-oldest-csn-timestamp: 20180924073712.048Z
      ds-mon-newest-csn: 000001660ea25907005200025988
      ds-mon-newest-csn-timestamp: 20180925025012.615Z
      ds-mon-server-id: 82
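
The link between the CSN in the error and the monitor entry above can be checked by decoding the CSN string. The sketch below assumes the 28-hex-digit layout that the values in this report are consistent with: 16 digits of timestamp in milliseconds since the epoch, 4 digits of server ID, and 8 digits of sequence number. The names `DecodedCsn` and `decode_csn` are illustrative and are not part of any DS tool.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class DecodedCsn(NamedTuple):
    timestamp: datetime  # UTC time encoded in the CSN
    server_id: int       # replica that originated the change
    seqnum: int          # sequence number encoded in the CSN


def decode_csn(csn: str) -> DecodedCsn:
    """Split a CSN string into its parts (assumed layout: 16 + 4 + 8 hex digits)."""
    if len(csn) != 28:
        raise ValueError(f"expected 28 hex digits, got {len(csn)}")
    millis = int(csn[0:16], 16)      # milliseconds since 1970-01-01 UTC
    server_id = int(csn[16:20], 16)
    seqnum = int(csn[20:28], 16)
    return DecodedCsn(EPOCH + timedelta(milliseconds=millis), server_id, seqnum)


if __name__ == "__main__":
    # CSN from the error reported on DS+RS3.
    print(decode_csn("000001660ea25907005200025988"))
```

Under this assumption, `000001660ea25907005200025988` decodes to server ID 82 and a timestamp of 2018-09-25 02:50:12.615 UTC, which agrees with `ds-mon-server-id` and `ds-mon-newest-csn-timestamp` in the monitor entry.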

    People

    • Assignee: Unassigned
    • Reporter: Wei-Yee Lum (wei-yee.lum)