Uploaded image for project: 'OpenDJ'
  1. OpenDJ
  2. OPENDJ-6347

Upgrading a replication topology with encrypted changelog generates error messages

    Details

      Description

      Found with rev f207cb63bbd

      We set up 2 servers with 3.5.0 version.
      We configure replication between them and initialize it.
      Then we enable confidentiality on the changelog for both of them.

      We perform some operations to check everything is working fine.
      Then we proceed with the upgrade for both servers.
      Once they are upgraded, we perform again some operations.
      We check the changelog are in sync (exporting as ldif then ldifdiff)
      and that the entries are where expected.
      But when adding or modifying an entry, it looks like it triggers a lot of error messages in the log:

      ...
      [06/Jun/2019:17:07:33 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44112 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "dc=com". The connection attempt from replication server RS(11464) at /127.0.0.1:44112 to this replication server RS(11464) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [06/Jun/2019:17:07:33 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:07:37 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44126 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "cn=admin data". The connection attempt from replication server RS(11464) at /127.0.0.1:44126 to this replication server RS(11464) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [06/Jun/2019:17:07:37 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:07:41 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44140 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "cn=schema". The connection attempt from replication server RS(11464) at /127.0.0.1:44140 to this replication server RS(11464) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [06/Jun/2019:17:07:41 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:07:47 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketTimeoutException(Read timed out)
      [06/Jun/2019:17:07:47 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44162 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:07:52 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketTimeoutException(Read timed out)
      [06/Jun/2019:17:07:52 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44180 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:07:56 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44228 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "cn=schema". The connection attempt from replication server RS(11464) at /127.0.0.1:44228 to this replication server RS(11464) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [06/Jun/2019:17:07:56 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:08:02 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44278 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "dc=com". The connection attempt from replication server RS(11464) at /127.0.0.1:44278 to this replication server RS(11464) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [06/Jun/2019:17:08:02 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:08:07 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44304 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "cn=admin data". The connection attempt from replication server RS(11464) at /127.0.0.1:44304 to this replication server RS(11464) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [06/Jun/2019:17:08:07 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:08:13 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketTimeoutException(Read timed out)
      [06/Jun/2019:17:08:13 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44312 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:08:18 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44326 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "dc=com". The connection attempt from replication server RS(11464) at /127.0.0.1:44326 to this replication server RS(11464) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [06/Jun/2019:17:08:18 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:08:19 +0200] category=CORE severity=NOTICE msgID=139 msg=The Directory Server has sent an alert notification generated by class org.opends.server.core.DirectoryServer (alert type org.opends.server.DirectoryServerShutdown, alert ID org.opends.messages.core-141): The Directory Server has started the shutdown process. The shutdown was initiated by an instance of class org.opends.server.core.DirectoryServerShutdownHook and the reason provided for the shutdown was The Directory Server shutdown hook detected that the JVM is shutting down. This generally indicates that JVM received an external request to stop (e.g., through a kill signal)
      [06/Jun/2019:17:08:19 +0200] category=PROTOCOL severity=NOTICE msgID=277 msg=Stopped listening for new connections on Administration Connector 0.0.0.0:4451
      [06/Jun/2019:17:08:19 +0200] category=PROTOCOL severity=NOTICE msgID=277 msg=Stopped listening for new connections on LDAP Connection Handler 0.0.0.0:1396
      [06/Jun/2019:17:08:19 +0200] category=PROTOCOL severity=NOTICE msgID=277 msg=Stopped listening for new connections on LDAPS Connection Handler 0.0.0.0:1643
      [06/Jun/2019:17:08:23 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8996 but has disconnected in handshake phase. Error: SocketTimeoutException(Read timed out)
      [06/Jun/2019:17:08:23 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 11464 was attempting to connect to replication server /127.0.0.1:44338 but has disconnected in handshake phase. Error: SocketException(no more data)
      [06/Jun/2019:17:08:23 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8997 but an error occurred in handshake phase. Error: InterruptedException(AbstractQueuedSynchronizer.java:1220)
      
      

      Note that we also have new error messages:

      (ERROR) [06/Jun/2019:17:08:23 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 11464 was attempting to connect to replication server openam.example.com/127.0.0.1:8997 but an error occurred in handshake phase. Error: InterruptedException(AbstractQueuedSynchronizer.java:1220)
      

      This looks like a regression.
      To reproduce:

      ./run-pybot.py -s replication_group3.Upgrade -t Encrypted_replication_topology  opendj
      

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                cforel carole forel
                Reporter:
                cforel carole forel
                QA Assignee:
                carole forel
              • Votes:
                0 Vote for this issue
                Watchers:
                1 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: