[OPENIDM-4626] Routes to OpenIDM OSGi Services are lost during startup Created: 18/Nov/15  Updated: 20/Nov/16  Resolved: 20/Nov/16

Status: Closed
Project: OpenIDM
Component/s: Module - Core mapping, synchronization, reconciliation, Module - OSGi Container / Framework integration
Affects Version/s: OpenIDM 3.0.0, OpenIDM 3.1.0
Fix Version/s: OpenIDM 4.5.0

Type: Bug Priority: Major
Reporter: Chris Drake Assignee: Chris Drake
Resolution: Fixed Votes: 0
Labels: CARTIER
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by OPENIDM-3905 Can't query any managed object via RE... Closed
is duplicated by OPENIDM-4706 NPE in FilteredIterable during livesync Closed
Relates
is related to OPENIDM-3524 Resource managed/user randomly not av... Closed
is related to OPENIDM-4706 NPE in FilteredIterable during livesync Closed
is related to OPENIDM-5150 JSON configuration files always reloa... Closed
Target Version/s:
Cases: 9436
Support Ticket IDs:

 Description   

During startup OpenIDM will occasionally fail to properly register routes to various OpenIDM Services. This is typically observed when there is latency between the OpenIDM instance and the JDBC Repository which causes the JDBC Repo Service to take longer than usual to start.

When the issue occurs for the ManagedObjectService, attempting to query a managed object defined in managed.json results in the following:

curl -u openidm-admin:openidm-admin "http://localhost:8080/openidm/managed/user?_queryId=query-all-ids"
{"code":404,"reason":"Not Found","message":"Resource 'user' not found"}

Similar failures may occur for other Services whose routes have been 'lost' during startup.

The problem can be simulated explicitly via the following sequence of events:

  1. Start the OpenIDM instance
  2. Open the Apache Felix console via http://localhost:8443/system/console
  3. Edit the conf/managed.json file and add a new managed object type
  4. Within the Apache Felix console, navigate to Components and locate the org.forgerock.openidm.managed component
  5. Stop the org.forgerock.openidm.managed component
  6. Start the org.forgerock.openidm.managed component
  7. Perform a query-all-ids against managed/user and the failure should occur

Note that the above simply simulates the issue. At startup, the same sequence of Service modification followed by deactivation and reactivation is what causes the failure. However, in the case of a startup, the sequence is triggered by the parallel startup of multiple OSGi Services.



 Comments   
Comment by Laurent Bristiel [X] (Inactive) [ 18/Nov/15 ]

This problem was already raised in OPENIDM-3524 and is causing us some pain in QA for our automated tests.
It would be a huge relief if someone could fix it!

Comment by Chris Drake [ 18/Nov/15 ]

The cause of the problem is within RouteRegistryImpl which is responsible for adding the routes to the various OpenIDM Services when they are activated, and removing the routes when those same services are deactivated.

The RouteRegistryImpl class is an implementation of ServiceTrackerCustomizer and implements the addingService, modifiedService and removedService methods. The problem is caused by the implementation of the modifiedService method, which is the following:

    public void modifiedService(ServiceReference<Object> reference, RouteEntryImpl service) {
        service.removeRoute();
        Object newService = context.getService(reference);
        RouteEntry result = null;
        if (newService instanceof CollectionResourceProvider) {
            result = addRoute(reference, newService);
        } else if (newService instanceof SingletonResourceProvider) {
            result = addRoute(reference, newService);
        } else if (newService instanceof RequestHandler) {
            result = addRoute(reference, newService);
        }
        if (null == result) {
            context.ungetService(reference);
        }
    }

The above removes the routes for the existing service and creates a new RouteService whose routes are re-registered. The problem is that the new RouteService is not tracked by the ServiceTracker (as is the case when calling addingService) and therefore the OSGi Framework does not appear to re-wire things correctly. The result is that requests are routed to the OLD RouteService instance whose routes have been removed and not the NEW instance.

Based on my review of the code, I believe our implementation of modifiedService() should be a no-op, which is also what the default implementation within the OSGi Framework does. The routes associated with our OSGi Services cannot change at runtime, and it does not appear necessary for us to create a new RouteServiceImpl each time a service is modified. Changing the modifiedService() method to a no-op resolves the issue and does not appear to impact any other functionality.
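
The difference between the two behaviours can be illustrated with a small stand-alone model. This is only a sketch of one plausible reading of the failure, not the OpenIDM code: Service, Router, Route, RouteEntry and Tracker are hypothetical stand-ins that do not use the OSGi API, and the assumptions that the first matching route wins and that a deactivated service answers 404 are illustrative. It replays the sequence described in this issue (modification, then deactivation, then reactivation) with and without a no-op modifiedService:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RouteTrackingSketch {

    /** Hypothetical stand-in for an OpenIDM service; answers 404 once deactivated. */
    static final class Service {
        boolean active = true;
        int handle() { return active ? 200 : 404; }
    }

    /** A route maps a path to the service instance that was current when it was added. */
    static final class Route {
        final String path;
        final Service target;
        Route(String path, Service target) { this.path = path; this.target = target; }
    }

    /** Hypothetical router (assumption: the first registered matching route wins). */
    static final class Router {
        final List<Route> routes = new ArrayList<>();
        int handle(String path) {
            for (Route r : routes) {
                if (r.path.equals(path)) return r.target.handle();
            }
            return 404;
        }
    }

    /** Owns exactly one route; registering happens on construction. */
    static final class RouteEntry {
        final Router router;
        final Route route;
        RouteEntry(Router router, String path, Service target) {
            this.router = router;
            this.route = new Route(path, target);
            router.routes.add(route);
        }
        void removeRoute() { router.routes.remove(route); }
    }

    /** Simplified tracker: remembers only the entry created in addingService. */
    static final class Tracker {
        final Router router;
        final boolean noOpOnModify;
        final Map<Service, RouteEntry> tracked = new HashMap<>();
        Tracker(Router router, boolean noOpOnModify) {
            this.router = router;
            this.noOpOnModify = noOpOnModify;
        }
        void addingService(String path, Service s) {
            tracked.put(s, new RouteEntry(router, path, s));
        }
        void modifiedService(String path, Service s) {
            if (noOpOnModify) return;          // proposed behaviour: leave routes alone
            tracked.get(s).removeRoute();      // original behaviour: drop the tracked route...
            new RouteEntry(router, path, s);   // ...and add a replacement nobody tracks
        }
        void removedService(Service s) {
            tracked.remove(s).removeRoute();   // only the tracked entry can be cleaned up
            s.active = false;
        }
    }

    /** Replays modify, deactivate, reactivate and returns the status of a final request. */
    static int simulate(boolean noOpOnModify) {
        Router router = new Router();
        Tracker tracker = new Tracker(router, noOpOnModify);
        Service first = new Service();
        tracker.addingService("managed", first);
        tracker.modifiedService("managed", first);
        tracker.removedService(first);
        Service second = new Service();
        tracker.addingService("managed", second);
        return router.handle("managed");
    }

    public static void main(String[] args) {
        System.out.println("replace on modify: " + simulate(false)); // prints 404
        System.out.println("no-op on modify:   " + simulate(true));  // prints 200
    }
}
```

In this model the untracked replacement route survives deactivation and keeps pointing at the deactivated instance, so the request never reaches the reactivated service. With the no-op variant the tracker always holds the only route for the service and can remove it cleanly.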

Comment by Chris Drake [ 18/Nov/15 ]

Laurent Bristiel [X] I didn't see your comment until just now. I've not yet been able to reproduce this issue on the recent 4.0.0 builds. Either the problem has been fixed by a change somewhere else in the code or by the upgrade of the Apache Felix Framework, or perhaps it is simply much more difficult to reproduce.

Have you seen the test failures you've described in OPENIDM-3524 on any of the recent 4.0.0 builds?

Comment by Laurent Bristiel [X] (Inactive) [ 19/Nov/15 ]

Chris Drake no, I have not seen it recently, but we put many things in place in our tests to work around this problem (like silently retrying to start OpenIDM when it fails like that), so maybe it still happens and we don't see it. But would the fix you have in mind not be applicable to 4.0.0?

Comment by Chris Drake [ 09/Dec/15 ]

Note that this issue appears to have been resolved in 4.0.0 and cannot be reproduced there. As such the changes are only being committed to the 3.1.x maintenance branch.

Comment by Chris Drake [ 08/Feb/16 ]

Reopening this issue as the original fix appears to have:
1. Caused a minor regression
2. Not correctly resolved the issue

Comment by Chris Drake [ 11/Nov/16 ]

This is fixed in OpenIDM 4.5.0 and largely mitigated by the fix for OPENIDM-5150.

Comment by Quentin CASTEL [X] (Inactive) [ 20/Nov/16 ]

modification of the status, in order to migrate the 'Zendesk ID' field to 'Support Ticket ID' field.

Generated at Sat Feb 27 03:39:27 UTC 2021 using Jira 7.13.12#713012-sha1:6e07c38070d5191bbf7353952ed38f111754533a.