[OPENAM-6734] Shutdown race condition between embedded OpenDJ and OpenAM persistent search restart Created: 01/Sep/15  Updated: 20/Nov/16  Resolved: 12/Oct/15

Status: Resolved
Project: OpenAM
Component/s: idrepo
Affects Version/s: 12.0.1
Fix Version/s: 12.0.3, 13.0.0

Type: Bug Priority: Minor
Reporter: Ian Packer [X] (Inactive) Assignee: Ian Packer [X] (Inactive)
Resolution: Fixed Votes: 1
Labels: EDISON, release-notes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
is related to OPENAM-7076 Improve error handling or refactor us... Open
Target Version/s:
Rank: 1|hzq2fz:
Support Ticket IDs:

 Description   

During a container-initiated shutdown where an embedded OpenDJ is in use, at least one potential race condition exists between the OpenDJ and OpenAM shutdown processes.

If the OpenDJ shutdown closes a persistent search socket on the server side, the client (OpenAM) can be notified of the closure before the DJLDAPv3PersistentSearch class knows it is shutting down. The error propagates to the persistent search result handler, which immediately attempts to restart the connection.

Restarting the psearch involves getting a SystemTimerPool instance. If the OpenAM shutdown manager already knows it is shutting down, it won't allow a shutdown listener to be added. The result is a rogue TimerPool instance with no shutdown hook, followed by an uncaught runtime exception:

java.lang.IllegalMonitorStateException: Failed to acquire lock registering the ShutdownListener
    at com.sun.identity.common.ShutdownManager.addShutdownListener(ShutdownManager.java:146)
    at com.sun.identity.common.ShutdownManager.addShutdownListener(ShutdownManager.java:126)
    at com.sun.identity.common.SystemTimerPool.getTimerPool(SystemTimerPool.java:78)
    at org.forgerock.openam.idrepo.ldap.psearch.DJLDAPv3PersistentSearch.restartPSearch(DJLDAPv3PersistentSearch.java:250)
    at org.forgerock.openam.idrepo.ldap.psearch.DJLDAPv3PersistentSearch.access$700(DJLDAPv3PersistentSearch.java:65)
    at org.forgerock.openam.idrepo.ldap.psearch.DJLDAPv3PersistentSearch$PSearchResultHandler.handleErrorResult(DJLDAPv3PersistentSearch.java:357)
    at org.forgerock.opendj.ldap.HeartBeatConnectionFactory$ConnectionImpl$AbstractWrappedResultHandler.handleErrorResult(HeartBeatConnectionFactory.java:291)
    at com.forgerock.opendj.util.AsynchronousFutureResult$Sync.innerSetErrorResult(AsynchronousFutureResult.java:173)
    at com.forgerock.opendj.util.AsynchronousFutureResult.handleErrorResult(AsynchronousFutureResult.java:311)
    at com.forgerock.opendj.ldap.AbstractLDAPFutureResultImpl.setResultOrError(AbstractLDAPFutureResultImpl.java:138)
    at com.forgerock.opendj.ldap.AbstractLDAPFutureResultImpl.adaptErrorResult(AbstractLDAPFutureResultImpl.java:127)
    at com.forgerock.opendj.ldap.LDAPConnection.close(LDAPConnection.java:690)
    at com.forgerock.opendj.ldap.LDAPClientFilter.handleClose(LDAPClientFilter.java:480)
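
The sequence behind this trace can be modelled in a few lines. This is only a simplified sketch, not the OpenAM code: the class names (ShutdownRaceSketch, Manager, TimerPoolHolder) are hypothetical, and it assumes the pool is created before the listener is registered, which is what the "rogue TimerPool instance" described above suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

// Simplified model of the failure; none of these classes are the OpenAM ones.
final class ShutdownRaceSketch {

    /** Hypothetical stand-in for a shutdown manager that rejects late registrations. */
    static final class Manager {
        private final List<Runnable> listeners = new ArrayList<>();
        private boolean shuttingDown;

        synchronized void addShutdownListener(Runnable listener) {
            if (shuttingDown) {
                // Same exception type and message as in the reported stack trace.
                throw new IllegalMonitorStateException(
                        "Failed to acquire lock registering the ShutdownListener");
            }
            listeners.add(listener);
        }

        synchronized void shutdown() {
            shuttingDown = true;
            for (Runnable listener : listeners) {
                listener.run();
            }
            listeners.clear();
        }
    }

    /** Hypothetical lazy holder that creates the pool before registering its cleanup. */
    static final class TimerPoolHolder {
        private final Manager manager;
        ScheduledExecutorService pool; // package-visible so the sketch can inspect it

        TimerPoolHolder(Manager manager) {
            this.manager = manager;
        }

        synchronized ScheduledExecutorService getTimerPool() {
            if (pool == null) {
                pool = Executors.newSingleThreadScheduledExecutor();
                // If this throws, the pool above already exists and nothing will stop it.
                manager.addShutdownListener(pool::shutdownNow);
            }
            return pool;
        }
    }

    /** Runs the scenario and reports whether a pool without a shutdown hook was left behind. */
    static boolean leaksPoolWhenShutdownWins() {
        Manager manager = new Manager();
        TimerPoolHolder holder = new TimerPoolHolder(manager);
        manager.shutdown();            // container shutdown reaches the manager first
        try {
            holder.getTimerPool();     // the psearch error handler then asks for a timer pool
            return false;
        } catch (IllegalMonitorStateException expected) {
            boolean orphaned = holder.pool != null && !holder.pool.isShutdown();
            holder.pool.shutdownNow(); // clean up so the sketch itself can exit
            return orphaned;
        }
    }

    public static void main(String[] args) {
        System.out.println("orphaned pool: " + leaksPoolWhenShutdownWins());
    }
}
```

In this model the exception reaches the caller only after the pool has been assigned, so the caller sees a failure while an unmanaged pool remains.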

If the system is shutting down, getTimerPool should probably fail more gracefully. However, this also means there are a lot of places where getTimerPool usage needs fixing, because a returned pool is usually expected/assumed to be 'guaranteed'.
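
One possible shape for "failing more gracefully" is sketched below. It is not the fix that was committed for this issue; GracefulTimerPoolSketch, tryGetTimerPool and restartSearch are hypothetical names, and the one-second retry delay is an arbitrary value chosen for illustration.

```java
import java.util.Optional;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: the accessor reports "unavailable" instead of throwing.
final class GracefulTimerPoolSketch {
    private boolean shuttingDown;          // guarded by "this"
    private ScheduledExecutorService pool; // guarded by "this"

    /** Returns the shared pool, or empty once shutdown has begun (no pool is created then). */
    synchronized Optional<ScheduledExecutorService> tryGetTimerPool() {
        if (shuttingDown) {
            return Optional.empty();
        }
        if (pool == null) {
            pool = Executors.newSingleThreadScheduledExecutor();
        }
        return Optional.of(pool);
    }

    synchronized void shutdown() {
        shuttingDown = true;
        if (pool != null) {
            pool.shutdownNow();
        }
    }

    /** Hypothetical restart hook: schedules a retry, or quietly gives up during shutdown. */
    boolean restartSearch(Runnable retry) {
        Optional<ScheduledExecutorService> timers = tryGetTimerPool();
        if (!timers.isPresent()) {
            return false; // shutting down: skip the restart instead of throwing
        }
        try {
            timers.get().schedule(retry, 1, TimeUnit.SECONDS);
            return true;
        } catch (RejectedExecutionException e) {
            return false; // shutdown started between the check and the schedule call
        }
    }

    public static void main(String[] args) {
        GracefulTimerPoolSketch sketch = new GracefulTimerPoolSketch();
        System.out.println("before shutdown: " + sketch.restartSearch(() -> { }));
        sketch.shutdown();
        System.out.println("after shutdown: " + sketch.restartSearch(() -> { }));
    }
}
```

The trade-off matches the concern above: every caller has to handle the empty case, so call sites that assume a pool is always available would each need reviewing.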



 Comments   
Comment by Ian Packer [X] (Inactive) [ 12/Oct/15 ]

Fixed in be34d3a7cf588cb751d61e9acc86f1ff328e0344

Generated at Tue Mar 09 10:31:00 UTC 2021 using Jira 7.13.12#713012-sha1:6e07c38070d5191bbf7353952ed38f111754533a.