Uploaded image for project: 'OpenDJ'
  1. OpenDJ
  2. OPENDJ-6385

Replication and upgrade: Logs flooded with SocketException (no more data)

    XMLWordPrintable

    Details

      Description

      Found with rev (4fe26a00259)

      We set up 2 servers with 5.5.1 version and replicate them.
      We stop and upgrade one of them:

      ./DJ_TASKS2/opendj/upgrade -n --acceptLicense --force 	
      
      >>>> OpenDJ Upgrade Utility
      
       * OpenDJ configuration will be upgraded to version
       7.0.0.4fe26a00259a981f151bb77aa0490488ddab9be0
       * OpenDJ data will be upgraded to version
       7.0.0.4fe26a00259a981f151bb77aa0490488ddab9be0
       * See
       '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/logs/upgrade.log'
       for a detailed log of this operation
      
      >>>> Preparing to upgrade
      
      Unable to retrieve the hostname from the admin backend using the truststore as
      source of keys; 'advertised-listen-address' attribute in global configuration
      will use the local hostname as value. Cause: Unable to retrieve the keys from
      the truststore
        OpenDJ 6.5.0 changed the indexing algorithm for replication metadata. Its
        index must be rebuilt which may take a long time; without a working index
        every server start will take longer than normal. Do you want to rebuild the
        index automatically at the end of the upgrade? (yes/no) yes
      
        The upgrade is ready to proceed. Do you wish to continue? (yes/no) yes
      
      
      >>>> Performing upgrade
      
        Removing configuration entries for the monitor providers.   0%       Removing configuration entries for the monitor providers............   100%     
        Removing file
      '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/bin/control-panel'
      .   0%     '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/bin/control-panel'
      .   100%     
        Removing file
      '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/bin/ControlPanel.app'
      .   0%     '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/bin/ControlPanel.app'
      .   100%     
        Removing file
      '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/bat/control-panel.bat'
      .   0%     '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/bat/control-panel.bat'
      .   100%     
        Removing file
      '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/lib/forgerock-persistit-core.jar'
      .   0%     '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/lib/forgerock-persistit-core.jar'
      .   100%     
        Replacing low durability settings in JE backends.   0%       Replacing low durability settings in JE backends....................   100%     
        Replacing high durability settings in JE backends.   0%       Replacing high durability settings in JE backends...................   100%     
        Replacing medium durability settings in JE backends.   0%       Replacing medium durability settings in JE backends.................   100%     
        Replacing medium durability settings in JE backends.   0%       Replacing medium durability settings in JE backends.................   100%     
        Renaming 'ds-cfg-json-schema' object class to
        'ds-cfg-json-query-equality-matching-rule'.   0%       'ds-cfg-json-query-equality-matching-rule'..........................   100%     
        Migrating root DN 'cn=Directory Manager'.   0%       Migrating root DN 'cn=Directory Manager'............................   100%     
        Removing root DN users from configuration.   0%       Removing root DN users from configuration...........................   100%     
        Segregating mutable and immutable files.   0%       Segregating mutable and immutable files.............................   100%     
        Update admin-backend.ldif file location.   0%       Update admin-backend.ldif file location.............................   100%     
        Update tasks backend file location.   0%       Update tasks backend file location..................................   100%     
        Replacing all pin related configuration attributes by a single pin
        configuration attribute.   0%       configuration attribute.............................................   100%     
        Removing send and receive window size configuration in replication.   0%       Removing send and receive window size configuration in replication..   100%     
        Replacing compute change number setting in replication server.   0%       Replacing compute change number setting in replication server.......   100%     
        Replacing compute change number setting in replication server.   0%       Replacing compute change number setting in replication server.......   100%     
        Replacing "reject unauthenticated requests" policy in global
        configuration.   0%       configuration.......................................................   100%     
        Replacing "reject unauthenticated requests" policy in global
        configuration.   0%       configuration.......................................................   100%     
        Removing configuration for assured replication.   0%       Removing configuration for assured replication......................   100%     
        Removing configuration for assured replication.   0%       Removing configuration for assured replication......................   100%     
        Removing generation-id data from configuration.   0%       Removing generation-id data from configuration......................   100%     
        Removing synchronization state data from configuration.   0%       Removing synchronization state data from configuration..............   100%     
        Renaming the proxy backend configuration property 'service
        discovery mechanism' to 'shard'.   0%       discovery mechanism' to 'shard'.....................................   100%     
        Adding objectClass to JSON, CSV, and External access logger
        configurations.   0%       configurations......................................................   100%     
        Use the old JE backend caches instead of the new shared cache.   0%       Use the old JE backend caches instead of the new shared cache.......   100%     
        Removing configuration for replication monitoring.   0%       Removing configuration for replication monitoring...................   100%     
        Setting global server ID.   0%       Setting global server ID............................................   100%     
        Removing global server ID from replication domains and replication
        server.   0%       server...............................   50%       server..............................................................   100%     
        Consolidating bootstrap replication servers from replication server
        and replication domains into Multimaster Synchronization provider.   0%       and replication domains into Multimaster Synchronization provider...   100%     
        Set the proxy backend configuration property 'hash-function' to MD5.   0%       Set the proxy backend configuration property 'hash-function' to MD5.   100%     
        Update replication SSL configuration.   0%       Update replication SSL configuration................................   100%     
        Adding Mail Servers.   0%       Adding Mail Servers.................................................   100%     
        Renaming ds-cfg-connection-pool-min-size attributes to
        ds-cfg-bind-connection-pool-min-size.   0%       ds-cfg-bind-connection-pool-min-size................................   100%     
        Renaming ds-cfg-connection-pool-max-size attributes to
        ds-cfg-bind-connection-pool-max-size.   0%       ds-cfg-bind-connection-pool-max-size................................   100%     
        Renaming ds-cfg-connection-pool-idle-timeout attributes to
        ds-cfg-bind-connection-pool-idle-timeout.   0%       ds-cfg-bind-connection-pool-idle-timeout............................   100%     
        Set the database cache mode 'ds-cfg-db-cache-mode' to 'cache-ln'.   0%       Set the database cache mode 'ds-cfg-db-cache-mode' to 'cache-ln'....   100%     
        Add 'inheritFromDNParent' attribute type to the
        'inheritedCollectiveAttributeSubentry' object class.   0%       'inheritedCollectiveAttributeSubentry' object class.................   100%     
        Adding 'listen-address' and 'advertised-listen-address' attributes
        to the global configuration.   0%       to the global configuration.........................................   100%     
        Removing 'listen-address' attributes that are redundant with
        default value provided by the global configuration.   0%       default value provided by the global configuration..................   100%     
        Replacing schema file '02-config.ldif'.   0%       Replacing schema file '02-config.ldif'......   20%       Replacing schema file '02-config.ldif'..............................   100%     
        Archiving concatenated schema.   0%       Archiving concatenated schema.......................................   100%     
        Migrating replication changelog files to 6.5.0 format.   0%       Migrating replication changelog files to 6.5.0 format..   14%       Migrating replication changelog files to 6.5.0 format....   28%       Migrating replication changelog files to 6.5.0 format......   42%       Migrating replication changelog files to 6.5.0 format........   57%       Migrating replication changelog files to 6.5.0 format..........   71%       Migrating replication changelog files to 6.5.0 format............   85%       Migrating replication changelog files to 6.5.0 format...............   100%     
      
      >>>> OpenDJ configuration was successfully upgraded to version
      7.0.0.4fe26a00259a981f151bb77aa0490488ddab9be0
      
      
      >>>> OpenDJ data was successfully upgraded to version
      7.0.0.4fe26a00259a981f151bb77aa0490488ddab9be0
      
      
      >>>> Performing post upgrade tasks
      
        Rebuilding index(es)
        '[ds-sync-hist.changeSequenceNumberOrderingMatch]' for base dn(s)
        '[dc=com]'.   0%       '[dc=com]'...............   25%       '[dc=com]'..........................................................   100%     
      
      >>>> Post upgrade tasks complete
      
       * See
       '/local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj/logs/upgrade.log'
       for a detailed log of this operation
      

      Then we start the server:

      DJ_TASKS2/opendj/bin/start-ds 	
      
      [24/Jun/2019:12:40:53 +0200] category=CORE severity=NOTICE msgID=134 msg=ForgeRock Directory Services 7.0.0-SNAPSHOT (build 20190624092639, revision number 4fe26a00259a981f151bb77aa0490488ddab9be0) starting up
      [24/Jun/2019:12:40:53 +0200] category=JVM severity=NOTICE msgID=21 msg=Installation Directory:  /local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj
      [24/Jun/2019:12:40:53 +0200] category=JVM severity=NOTICE msgID=23 msg=Instance Directory:      /local/GIT/pyforge/results/20190624-123718/upgrade_group/UpgradeTasks/DJ_TASKS2/opendj
      [24/Jun/2019:12:40:53 +0200] category=JVM severity=NOTICE msgID=17 msg=JVM Information: 1.8.0_191-b12 by Oracle Corporation, 64-bit architecture, 3717201920 bytes heap size
      [24/Jun/2019:12:40:53 +0200] category=JVM severity=NOTICE msgID=18 msg=JVM Host: cforel-Dell-Precision-M3800 default/26779, running Linux 4.4.0-21-generic amd64, 16725336064 bytes physical memory size, number of processors available 8
      [24/Jun/2019:12:40:53 +0200] category=JVM severity=NOTICE msgID=19 msg=JVM Arguments: "-Dorg.opends.server.scriptName=start-ds"
      [24/Jun/2019:12:40:54 +0200] category=BACKEND severity=NOTICE msgID=513 msg=The database backend userRoot containing 209 entries has started
      [24/Jun/2019:12:40:54 +0200] category=EXTENSIONS severity=NOTICE msgID=221 msg=DIGEST-MD5 SASL mechanism using a server fully qualified domain name of: openam.example.com
      [24/Jun/2019:12:40:55 +0200] category=SYNC severity=NOTICE msgID=204 msg=Replication server RS(26779) started listening for new connections on address 0.0.0.0 port 8991
      [24/Jun/2019:12:41:01 +0200] category=SYNC severity=ERROR msgID=178 msg=Directory server 26779 was attempting to connect to replication server 26779 but has disconnected in handshake phase. Error: SocketException(no more data)
      [24/Jun/2019:12:41:01 +0200] category=SYNC severity=NOTICE msgID=62 msg=Directory server DS(26779) has connected to replication server RS(26779) for domain "cn=schema" at 127.0.1.1:8991 with generation ID 8408
      [24/Jun/2019:12:41:01 +0200] category=SYNC severity=NOTICE msgID=62 msg=Directory server DS(26779) has connected to replication server RS(26779) for domain "cn=admin data" at 127.0.1.1:8991 with generation ID 157621
      [24/Jun/2019:12:41:01 +0200] category=SYNC severity=NOTICE msgID=62 msg=Directory server DS(26779) has connected to replication server RS(26779) for domain "dc=com" at 127.0.1.1:8991 with generation ID 3900028
      [24/Jun/2019:12:41:01 +0200] category=PROTOCOL severity=NOTICE msgID=276 msg=Started listening for new connections on Administration Connector 0.0.0.0:4446
      [24/Jun/2019:12:41:01 +0200] category=PROTOCOL severity=NOTICE msgID=276 msg=Started listening for new connections on LDAP Connection Handler 0.0.0.0:1391
      [24/Jun/2019:12:41:01 +0200] category=PROTOCOL severity=NOTICE msgID=276 msg=Started listening for new connections on LDAPS Connection Handler 0.0.0.0:1638
      [24/Jun/2019:12:41:01 +0200] category=CORE severity=NOTICE msgID=135 msg=The Directory Server has started successfully
      [24/Jun/2019:12:41:01 +0200] category=CORE severity=NOTICE msgID=139 msg=The Directory Server has sent an alert notification generated by class org.opends.server.core.DirectoryServer (alert type org.opends.server.DirectoryServerStarted, alert ID org.opends.messages.core-135): The Directory Server has started successfully
      

      We can see the SocketException error.
      And looking at the logs:

      replication logs:
      
      [24/Jun/2019:12:41:33 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 26779 was attempting to connect to replication server /127.0.0.1:45490 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "dc=com". The connection attempt from replication server RS(26779) at /127.0.0.1:45490 to this replication server RS(26779) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [24/Jun/2019:12:41:33 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 26779 was attempting to connect to replication server openam.example.com/127.0.0.1:8991 but has disconnected in handshake phase. Error: SocketException(no more data)
      [24/Jun/2019:12:41:43 +0200] category=SYNC severity=ERROR msgID=178 msg=Directory server 10596 was attempting to connect to replication server 26779 but has disconnected in handshake phase. Error: SocketException(no more data)
      [24/Jun/2019:12:42:01 +0200] category=SYNC severity=ERROR msgID=343 msg=Replication server 26779 was attempting to connect to replication server /127.0.0.1:45552 but an error occurred in handshake phase. Error: Other: Timed out while trying to acquire the domain lock for domain "dc=com". The connection attempt from replication server RS(26779) at /127.0.0.1:45552 to this replication server RS(26779) will be aborted. This is probably benign and a result of a simultaneous cross connection attempt
      [24/Jun/2019:12:42:01 +0200] category=SYNC severity=ERROR msgID=179 msg=Replication server 26779 was attempting to connect to replication server openam.example.com/127.0.0.1:8991 but has disconnected in handshake phase. Error: SocketException(no more data)
      ....
      
      
      

      This is reproducible with:

      ./run-pybot.py -v -s upgrade_group.UpgradeTasks opendj
      

        Attachments

          Issue Links

            Activity

              People

              Assignee:
              fabiop Fabio Pistolesi
              Reporter:
              cforel carole forel
              QA Assignee:
              Michal Severin [X] Michal Severin [X] (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Dates

                Created:
                Updated:
                Resolved: