OpenDJ / OPENDJ-4392

NPE in JE deadlock detection logic during add/del stress test

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.5.0, 4.0.0, 3.5.2
    • Fix Version/s: 5.5.0
    • Component/s: replication
    • Labels:
      None
    • Sprint:
      OpenDJ Sprint 112

      Description

      Scenario

      Running the pyforge replication addrate test on 3 machines for 48 hours (test started October 8, 01:27):

      • one machine (brie) running addrate client against DJ1
      • one machine (raclette) for DJ1
      • one machine (morbier) for DJ2

      Problem description

      • DJ1 raised an NPE during the test, on October 9 at 15:20:45 UTC (per the access log timestamp).

      Extract from the DJ1 access log:

       

      ldap-access.audit.json:{"eventName":"DJ-LDAP","client":{"ip":"172.16.204.139","port":58602},"server":{"ip":"172.16.204.253","port":1389},"request":{"protocol":"LDAP","operation":"ADD","connId":127916051,"msgId":2,"dn":"uid=user.64161884,ou=People,o=example"},"transactionId":"8e0f0152-2de0-4dcc-9465-9148c5aec10d-1151262168","response":{"status":"FAILED","statusCode":"80","elapsedTime":16964361,"elapsedTimeUnits":"NANOSECONDS","detail":"Unchecked exception during database transaction: NullPointerException (LockManager.java:1942 LockManager.java:1997 LockManager.java:1895 LockManager.java:872 LockManager.java:475 LockManager.java:345 BasicLocker.java:123 ReadCommittedLocker.java:90 Locker.java:498 CursorImpl.java:3641 CursorImpl.java:3372 CursorImpl.java:2156 CursorImpl.java:1968 Cursor.java:4195 Cursor.java:4056 Cursor.java:3858 Cursor.java:1282 Database.java:1342 Database.java:1401 JEStorage.java:360 DefaultIndex.java:228 ...)"},"timestamp":"2017-10-09T15:20:45.960Z","_id":"8e0f0152-2de0-4dcc-9465-9148c5aec10d-1151262259"}

       

      • At the end of the test:
        • ldapsearch on the LDAP port (1389) is working
        • the admin port (4444) is not available; dsconfig returns:
      Unable to connect to the server at "raclette" on port 4444 
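
      To locate other occurrences of the same failure, the access log can be scanned for failed operations whose detail mentions NullPointerException. The following is a minimal sketch, not part of the original report: the function name find_npe_failures and the abbreviated SAMPLE_LINE are illustrative, and it assumes one JSON event per line, optionally preceded by a grep-style file name prefix as in the extract above.

```python
import json

# Abbreviated copy of the access log entry quoted in this report
# (stack trace shortened; field names are taken from the extract).
SAMPLE_LINE = (
    'ldap-access.audit.json:{"eventName":"DJ-LDAP",'
    '"request":{"protocol":"LDAP","operation":"ADD",'
    '"dn":"uid=user.64161884,ou=People,o=example"},'
    '"response":{"status":"FAILED","statusCode":"80",'
    '"detail":"Unchecked exception during database transaction: '
    'NullPointerException (LockManager.java:1942 ...)"},'
    '"timestamp":"2017-10-09T15:20:45.960Z"}'
)


def find_npe_failures(lines):
    """Yield (timestamp, dn, detail) for FAILED events mentioning an NPE."""
    for line in lines:
        start = line.find("{")  # skip any "filename:" prefix added by grep
        if start < 0:
            continue
        try:
            event = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # ignore lines that are not valid JSON events
        response = event.get("response", {})
        detail = response.get("detail", "")
        if response.get("status") == "FAILED" and "NullPointerException" in detail:
            yield event.get("timestamp"), event.get("request", {}).get("dn"), detail


for timestamp, dn, detail in find_npe_failures([SAMPLE_LINE]):
    print(timestamp, dn)
    print(detail)
```

      In practice the list passed to find_npe_failures would be replaced by an open file handle on the audit log; the path depends on the local installation.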
      

      addrate client

      • The addrate client was started on October 8 at 01:27:

       

      addrate -h raclette.internal.forgerock.com -p 1389 -D "cn=Directory Manager" -w password -M 10000 -d 172800 -S -C random -c 40 -i 30 -s 10000 addrate.template

       

      • addrate throughput dropped to zero after 1 day, 15 hours, 54 minutes of run time (October 9, 17:20)
      • Extract from the addrate output:

       

      143520.000,857.8,890.8,40.091,38.331,780.14,1535.12,2986.34,0.0,50.01
      143550.000,875.1,890.8,39.090,38.331,780.14,1535.12,2986.34,0.0,49.99
      143580.000,898.8,890.8,37.984,38.331,780.14,1535.12,2986.34,0.0,49.99
      143610.000,461.5,890.7,38.113,38.331,780.14,1535.12,2986.34,0.0,50.00
      143640.000,0.0,890.5,-,38.331,780.14,1535.12,2986.34,0.0,-
      143670.001,0.0,890.3,-,38.331,780.14,1535.12,2986.34,0.0,-
      143700.000,0.0,890.1,-,38.331,780.14,1535.12,2986.34,0.0,-
      143730.000,0.0,890.0,-,38.331,780.14,1535.12,2986.34,0.0,-
      [...]
      205770.001,0.0,621.6,-,38.331,780.14,1535.12,2986.34,0.0,-
      205800.000,0.0,621.5,-,38.331,780.14,1535.12,2986.34,0.0,-
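
      The stall point can be recovered from this output programmatically. The sketch below is illustrative only: it assumes the first column is elapsed time in seconds and the second is the recent throughput in operations per second (an interpretation of the output, not stated in the report), and it takes the year 2017 from the access log timestamp. The names first_stall and format_elapsed are hypothetical helpers.

```python
from datetime import datetime, timedelta

# Rows copied from the addrate output extract in this report.
ROWS = """\
143580.000,898.8,890.8,37.984,38.331,780.14,1535.12,2986.34,0.0,49.99
143610.000,461.5,890.7,38.113,38.331,780.14,1535.12,2986.34,0.0,50.00
143640.000,0.0,890.5,-,38.331,780.14,1535.12,2986.34,0.0,-
143670.001,0.0,890.3,-,38.331,780.14,1535.12,2986.34,0.0,-
[...]
205800.000,0.0,621.5,-,38.331,780.14,1535.12,2986.34,0.0,-
"""

# Client start time from the report (October 8, 01:27, local time).
START = datetime(2017, 10, 8, 1, 27)


def first_stall(csv_text):
    """Return the elapsed seconds of the first row with zero throughput, or None."""
    for row in csv_text.splitlines():
        fields = row.strip().split(",")
        if len(fields) < 2:
            continue  # skips "[...]" and blank lines
        try:
            elapsed, recent = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # skips rows with non-numeric leading columns
        if recent == 0.0:
            return elapsed
    return None


def format_elapsed(seconds):
    """Format a duration in seconds as days, hours and whole minutes."""
    days, rem = divmod(int(seconds), 86400)
    hours, rem = divmod(rem, 3600)
    return f"{days} d {hours} h {rem // 60} min"


stall = first_stall(ROWS)
print(stall, format_elapsed(stall))
print(START + timedelta(seconds=stall))
```

      On the rows above, the first zero-throughput interval is at 143640 seconds, which is 1 day, 15 hours, 54 minutes after the start and agrees with the duration given in this report.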
      

       

      Report available

      http://abondance.internal.forgerock.com/qaresults/OpenDJ/4.1.0/manual/replication.addrate/RC3/

      DJ1

       DJ2

              People

              • Assignee: Matthew Swift (matthew)
              • Reporter: Guillaume Andru (guillaume.andru)