[OPENAM-15776] Push Registration fails (QR code invalid) to register Created: 13/Dec/19  Updated: 05/May/20  Resolved: 20/Apr/20

Status: Closed
Project: OpenAM
Component/s: authentication
Affects Version/s: 6.5.2.2
Fix Version/s: 6.5.2.3, 5.5.2, 7.0.0, 6.5.3

Type: Bug Priority: Major
Reporter: C-Weng C Assignee: Adam Heath
Resolution: Fixed Votes: 0
Labels: EDISON, Must-Fix
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File Video MP4 (1680x898).mp4     File pushissue.mp4    
Issue Links:
Duplicate
Regression
Relates
Target Version/s:
Sprint: AM Sustaining Sprint 70, AM Sustaining Sprint 71
Story Points: 5

 Description   

Bug description

Push registration fails to work after upgrading to 6.5.2.2 (working fine on 6.5.2.1).  With message debug enable and when the QR code is registered this is seen in CoreSystem

 CoreSystem
==========
frRest:12/13/2019 02:36:11:859 PM SGT: Thread[http-nio-8080-exec-7,5,main]: Tran
sactionId[4e09f01f-07f3-4516-96a8-e6bd2f1baff1-2025]
ERROR: A runtime exception occurred during the CREST request handling
java.lang.IllegalStateException: Exception from invocation expected to be handle
d by promise
        at org.forgerock.json.resource.AnnotatedMethod.invoke(AnnotatedMethod.ja
va:100)
        at org.forgerock.json.resource.AnnotatedMethod.invoke(AnnotatedMethod.java:65)
        at org.forgerock.json.resource.AnnotatedActionMethods.invoke(AnnotatedActionMethods.java:43)
        at org.forgerock.json.resource.AnnotatedActionMethods.invoke(AnnotatedActionMethods.java:31)
        at org.forgerock.json.resource.AnnotatedRequestHandler.handleAction(AnnotatedRequestHandler.java:81)
....
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.forgerock.json.resource.AnnotatedMethod.invoke(AnnotatedMethod.java:96)
        ... 108 more
Caused by: java.lang.NoSuchMethodError: org.codehaus.stax2.ri.EmptyIterator.getInstance()Lorg/codehaus/stax2/ri/EmptyIterator;
        at com.ctc.wstx.evt.CompactStartElement.getAttributes(CompactStartElement.java:94)
        at com.amazonaws.transform.StaxUnmarshallerContext.nextEvent(StaxUnmarshallerContext.java:214)
        at com.amazonaws.services.sns.model.transform.CreatePlatformEndpointResultStaxUnmarshaller.unmarshall(CreatePlatformEndpointResultStaxUnmarshaller.java:40)
        at com.amazonaws.services.sns.model.transform.CreatePlatformEndpointResultStaxUnmarshaller.unmarshall(CreatePlatformEndpointResultStaxUnmarshaller.java:28)
        at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:107)
        at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:40)
        at com.amazonaws.http.response.AwsResponseHandlerAdapter.handle(AwsResponseHandlerAdapter.java:70)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleResponse(AmazonHttpClient.java:1554)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1272)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513)
        at com.amazonaws.services.sns.AmazonSNSClient.doInvoke(AmazonSNSClient.java:2270)
        at com.amazonaws.services.sns.AmazonSNSClient.invoke(AmazonSNSClient.java:2246)
        at com.amazonaws.services.sns.AmazonSNSClient.executeCreatePlatformEndpoint(AmazonSNSClient.java:615)
        at com.amazonaws.services.sns.AmazonSNSClient.createPlatformEndpoint(AmazonSNSClient.java:591)
        at org.forgerock.openam.services.push.sns.utils.SnsPushResponseUpdater.setCommunicationId(SnsPushResponseUpdater.java:112)
        at org.forgerock.openam.services.push.sns.utils.SnsPushResponseUpdater.updateCommunicationId(SnsPushResponseUpdater.java:84)
        at org.forgerock.openam.services.push.sns.utils.SnsPushResponseUpdater.updateResponse(SnsPushResponseUpdater.java:66)
        at org.forgerock.openam.services.push.sns.SnsRegistrationPredicate.perform(SnsRegistrationPredicate.java:73)
        at org.forgerock.openam.services.push.dispatch.MessageDispatcher.handlePushResponseLocally(MessageDispatcher.java:206)
        at org.forgerock.openam.services.push.dispatch.MessageDispatcher.handle(MessageDispatcher.java:122)
        at org.forgerock.openam.services.push.PushMessageResource.handle(PushMessageResource.java:102)
        at org.forgerock.openam.services.push.sns.SnsMessageResource.register(SnsMessageResource.java:82)
        ... 113 more

or pay attention to

Caused by: java.lang.NoSuchMethodError: org.codehaus.stax2.ri.EmptyIterator.getInstance()Lorg/codehaus/stax2/ri/EmptyIterator;
        at com.ctc.wstx.evt.CompactStartElement.getAttributes(CompactStartElement.java:94)
        at com.amazonaws.transform.StaxUnmarshallerContext.nextEvent(StaxUnmarshallerContext.java:214)
        at com.amazonaws.services.sns.model.transform.CreatePlatformEndpointResultStaxUnmarshaller.unmarshall(CreatePlatformEndpointResultStaxUnmarshaller.java:40)

How to reproduce the issue

Details steps outlining how to recreate the issue (remove this text)

  1. Setup Push registration and test it works on 6.5.2.1
  2. Upgrade to 6.5.2.2
  3. Failure is seen (QR is invalid) when trying to register push.
  4. Best to enable message debug to see all the errors
Expected behaviour
Push registration works
Current behaviour
Push registration does not work anymore

Work around

Maybe remove openam/WEB-INF/lib/woodstox-core-asl-4.3.0.jar

--- a/openam-server-only/pom.xml
+++ b/openam-server-only/pom.xml
@@ -161,6 +161,7 @@
 WEB-INF/lib/jaxb-libs-1.0.6.jar,
 WEB-INF/lib/jaxb-xjc-1.0.6.jar,
 WEB-INF/lib/jaxp-api-1.4.2.jar,
+ WEB-INF/lib/woodstox-core-asl-4.3.0.jar,
 </packagingExcludes>
 <archive>
 <manifestEntries>

Cause

Cause by the Jackson dependencies changes upgrade and the Woodstox dependencies due to Jackson upgrade. See  to https://github.com/FasterXML/woodstox/issues/46 if there is old woodstox.



 Comments   
Comment by Adam Heath [ 13/Dec/19 ]

I am unable to reproduce this when configuring Push auth as per the steps in https://backstage.forgerock.com/knowledge/cs/article/a42796832 (see attached video) - C-Weng C is there anything else you have configured to reproduce this issue?

Video MP4 (1680x898).mp4

re: the referenced https://github.com/FasterXML/woodstox/issues/46 - I think we should have already covered this as part of the changes noted and discussed in https://bugster.forgerock.org/jira/browse/OPENAM-15000 - which pin the version of woodstox-core to 5.3.0

Video MP4 (1680x898).mp4

Comment by Jonathan Thomas [ 13/Dec/19 ]

With NoSuchMethodError one thing to check is clearing the expanded war and container caches before the upgrade .

Comment by C-Weng C [ 13/Dec/19 ]

The issue is that a clean install will also show. Anyway, i suppose it depends how the classloading woodstox get loaded first on which is resolved.

test.jsp

<%@ page import="com.ctc.wstx.evt.*" %>
<%
   DefaultEventAllocator alloc = DefaultEventAllocator.getDefaultInstance();
   Object o = alloc.withStartElement(null, null, null, null, true);
   out.println(o.getClass()+" "+o);
   if (o instanceof CompactStartElement) {
       try {
           ((CompactStartElement)o).getAttributes();
       } catch (Throwable ex) {
           ex.printStackTrace();
           ex.printStackTrace(new java.io.PrintWriter(out,true));

       }
   }
%>

Maybe you can dump this to the openam and test this to see what loaded? The issue is not about pinning to woodstox-5.3.0 but that there are duplicate woodstox in the openam.war and in my case the 4.3.0 version is loaded.

pushissue.mp4

Comment by Adam Heath [ 16/Dec/19 ]

Thanks for the additional info C-Weng C - I can confirm I loaded the test.jsp you provided in your last update and noticed no errors (just a `class com.ctc.wstx.evt.CompactStartElement Stax Event #1` message)

I think I understand a little more about the issue now though given your notes above. It appears that there were some changes introduce in woodstox-core versions after 5.1.0 that have resulted in APIs that are non-binary compatible with earlier versions (see https://github.com/FasterXML/woodstox/issues/46 which you referenced earlier)

For now, the only way I can see us resolving this would be to pin this version back to a pre 5.1.0 version (5.0.3 is the latest stable) so that all the related libraries (woodstox-core 5.0.3  and woodstox-core-asl 4.3.0) remain binary compatible. However, this would mean that we miss out on all the bugfixes which have been provided in later 5.x releases https://github.com/FasterXML/woodstox/wiki/Woodstox-Release-5.3 and this may be something that we need to monitor in the future should other items require later versions of this dependency.
 
C-Weng C Appreciate that this is non-deterministic, which is why it is not possible to consistently reproduce the issue, but it appears you are having more luck at doing so than I have  - would you be able to share details about your environment such as the exact tomcat and JDK version to see if I can replicate exactly the same setup and stand a better chance of reproducing in order to better test this fix.

Comment by Adam Heath [ 19/Dec/19 ]

Having done a good amount of investigation into this am going to add a summary of the issue here so that it is hopefully clearer for anyone else looking at this PR or coming from the JIRA.

The exceptions being observed are essentially boiling down to incompatibilities between the versions of woodstox (either woodstox-core (covering 5.x and above, or woodstox-core-asl - legacy covering 4.x and below) and the stax2-api libraries.

Since the update to jackson 2.9.10/2.10 we have had to bump and pin the versions to the following in all branches where restlet is still present (so 6.5 and earlier):

            <!-- There are incompatible versions of woodstox-core and stax2-api dependencies when using
            jackson-dataformat-xml 2.9.10 (see https://github.com/FasterXML/jackson-dataformat-xml/issues/340)
            Pinning the version here to prevent dependency convergence errors -->
            <dependency>
                <groupId>com.fasterxml.woodstox</groupId>
                <artifactId>woodstox-core</artifactId>
                <version>5.3.0</version>
            </dependency>
            <dependency>
                <groupId>org.codehaus.woodstox</groupId>
                <artifactId>stax2-api</artifactId>
                <version>4.2</version>
            </dependency>

This has then caused issues in the instances where the woodstox-core-asl library is still in use within the project (e.g. currently within restlet, and via transitive dependencies in jaxws-rt and wss4j.policy), as woodstox-core-asl is expecting to use stax2-api 3.1.4 which has now been bumped by the above to 4.2...and so causing the exception currently being experienced.

4.x of stax2-api was introduced as required by woodstox core from version 5.1.0 onwards, and woodstox core >5.1.0 was introduced in the changes involved in updating jackson to 2.9.10 and above.

It has not been possible to pin the versions to the older 5.0.3 and 3.1.4 versions respectively, as despite these being compatible with one another there will potentially be issues for restlet which is now expecting the newer stax2-api (4.2) via a transitive dependency pulled in from jackson 2.10, see:

Dependency convergence error for org.codehaus.woodstox:stax2-api:4.2 paths to dependency are:
+-org.forgerock.am:openam-restlet:6.5.3-SNAPSHOT
  +-org.restlet.jee:org.restlet.ext.jackson:2.3.12
    +-com.fasterxml.jackson.dataformat:jackson-dataformat-xml:2.10.0
      +-org.codehaus.woodstox:stax2-api:4.2
and
+-org.forgerock.am:openam-restlet:6.5.3-SNAPSHOT
  +-org.restlet.jee:org.restlet.ext.jackson:2.3.12
    +-com.fasterxml.jackson.dataformat:jackson-dataformat-xml:2.10.0
      +-com.fasterxml.woodstox:woodstox-core:5.0.3
        +-org.codehaus.woodstox:stax2-api:3.1.4
and
+-org.forgerock.am:openam-restlet:6.5.3-SNAPSHOT
  +-org.restlet.jee:org.restlet.ext.jackson:2.3.12
    +-org.codehaus.woodstox:woodstox-core-asl:4.3.0
      +-org.codehaus.woodstox:stax2-api:3.1.4
and
+-org.forgerock.am:openam-restlet:6.5.3-SNAPSHOT
  +-org.restlet.jee:org.restlet.ext.jackson:2.3.12
    +-org.codehaus.woodstox:stax2-api:3.1.4

Leaving the versions at 5.3.0 and 4.2 will require us to add an exclusion against the org.restlet.ext.jackson dependency (as seen in this attached PR) which will then prevent the conflicting versions of stax2-api from being referenced. However, there are still other libraries which pull in this dependency (mentioned above: jaxws-rt and wss4j.policy) and so at the moment, it is unclear if having the remaining inclusions of woodstox-core-asl will still cause these issues to be present. C-Weng C is this something you may be able to help confirm with your set-up? (I am still unable to reproduce this here)

The underlying issue is summed up well (eventually) in a github issue raised against jackson-data-format here: https://github.com/FasterXML/jackson-dataformat-xml/issues/340 - with the key point being highlighted as the non-binary compatible change to the EmptyIterator.getInstance() method - and quite frankly, it all looks to be a bit of a mess caused by these version updates in jackson

Comment by Adam Heath [ 09/Jan/20 ]

During testing, this issue can not be avoided whenever the woodstox-core-asl dependency is present in the resulting WEB-INF/lib - because of this the approach to fix this has now been to exclude the legacy woodstox-core-asl dependency where it is still currently being referenced in the project. This should be possible due to us already including the updated woodstox-core 5.2.x throughout alongside the existing bump to the compatible stax2-api 4.2 mentioned above - this is also similar to the approach taken in OPENAM-14041 when this issue had previously been flagged. 

Comment by Ľubomír Mlích [ 10/Feb/20 ]

No verification as I'm not able to reproduce

Comment by Adam Heath [ 20/Apr/20 ]

Re-opening to clarify the bug title and not explicitly mention AM 6.5.2.2 (As other versions are impacted) - this title appears in release notes

Generated at Sat Nov 28 11:49:45 UTC 2020 using Jira 7.13.12#713012-sha1:6e07c38070d5191bbf7353952ed38f111754533a.