Affects Version/s: 13.5.1, 13.5.2, 14.1.1, 5.5.1, 6.0.0, 220.127.116.11, 18.104.22.168, 22.214.171.124, 126.96.36.199, 188.8.131.52, 184.108.40.206, 6.5.0, 220.127.116.11
Environment:IDP SAML2 Failover must be enabled (set to true).
Note that AM stickiness does NOT really solve this: the request affinity problem arises because the SP contacts the IDP directly, and the SP end has no access to AM's stickiness cookie.
Support Ticket IDs:
Needs QA verification:Yes
Are the reproduction steps defined?:Yes, and I used the same as in the description
When multiple IDPs are deployed with SAML2 Failover and Artifact binding is used, the first IDP issues the initial SAMLArt (Artifact), which is sent to the SP. The SP then sends its Artifact resolve request to the load-balanced IDP endpoint, which may route it to a different IDP. Because the IDP that created the Artifact and the IDP answering the resolve request are then not the same, the following exception can happen
and in the Federation logs:
- Set up Federation for the SP and IDP. You can create two load-balanced IDPs if needed.
- Set SAML2 Failover to true for the IDP
- Perform a standard SP SSO (using Artifact binding)
It is important that the flow that starts on the first IDP later goes to the second IDP. As a hack, if you do not want to set up a load balancer, you can use DNS on the SP host to target another AM instance.
Note that AM stickiness does NOT really solve this: the request affinity problem arises because the SP contacts the IDP directly, and the SP end has no access to AM's stickiness cookie. Stickiness from the browser to AM is therefore not the issue, and merely asking for stickiness does not apply here (it is an external SP issue).
Alternate developer testcase (for single IDP machine test)
- In IDPArtifactResolution line 315, always drop the local cache so that the CTS is used (this may be a quicker way to test). This is like simulating an IDP restart.
Logic of the flow
This is best described on page 28 of https://www.oasis-open.org/committees/download.php/27819/sstc-saml-tech-overview-2.0-cd-02.pdf, which covers the Artifact binding.
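For orientation, the artifact that travels through this flow has a fixed layout in the SAML 2.0 bindings specification (type code 0x0004): a 2-byte type code, a 2-byte endpoint index, a 20-byte SourceID and a 20-byte MessageHandle, all Base64-encoded. The following is a minimal sketch of decoding that layout; the class name `SamlArtifactSketch` is hypothetical and is not an AM class. It assumes the SAMLart value has already been URL-decoded.

```java
import java.nio.ByteBuffer;
import java.util.Base64;

// Hypothetical helper for inspecting a SAML 2.0 type 0x0004 artifact; not AM code.
final class SamlArtifactSketch {
    final int typeCode;
    final int endpointIndex;
    final byte[] sourceId;      // 20 bytes identifying the issuer
    final byte[] messageHandle; // 20 bytes identifying the stored message

    private SamlArtifactSketch(int typeCode, int endpointIndex,
                               byte[] sourceId, byte[] messageHandle) {
        this.typeCode = typeCode;
        this.endpointIndex = endpointIndex;
        this.sourceId = sourceId;
        this.messageHandle = messageHandle;
    }

    // Expects the Base64 SAMLart value, already URL-decoded by the caller.
    static SamlArtifactSketch decode(String base64Artifact) {
        byte[] raw = Base64.getDecoder().decode(base64Artifact);
        if (raw.length != 44) {
            throw new IllegalArgumentException("expected 44 bytes, got " + raw.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(raw); // ByteBuffer is big-endian by default
        int typeCode = buf.getShort() & 0xFFFF;
        if (typeCode != 0x0004) {
            throw new IllegalArgumentException("unsupported artifact type " + typeCode);
        }
        int endpointIndex = buf.getShort() & 0xFFFF;
        byte[] sourceId = new byte[20];
        buf.get(sourceId);
        byte[] messageHandle = new byte[20];
        buf.get(messageHandle);
        return new SamlArtifactSketch(typeCode, endpointIndex, sourceId, messageHandle);
    }
}
```

Decoding an artifact captured from a failing flow can help confirm which message handle the resolving IDP was asked for.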
In AM, when the HTTP-Artifact resolve request lands on AM2, there is code that tries to determine the origin of the Artifact, so that AM2 knows which AM created it. Internally, AM2 may then do a cross-talk call to the other server, AM1. If this part works, the CTS access above may not happen (hence the immutable-object error should not be seen), as long as the origin AM still has the entry in its local cache. So if this error happens, it may also mean that cross-talk is not working or has failed. This may be perfectly fine for autonomous deployments (where all the AM servers have the same serverId) or when, say, AM1 has been restarted. Otherwise, for cross-talk to work, one may want to check that:
- AM1 <-> AM2 is routable (DNS-wise, for the registered Artifact Resolution URL)
- The AM1 and AM2 installed URLs are valid/resolvable within these AMs, so that AM2 can contact AM1 using these URLs
- The AM1/AM2 internal serverIDs are 2 digits (not changed somehow)
- There is no AM1 <-> AM2 communication issue (eg: HTTPS cert not trusted between AM1/2 for cross-talk)
Stickiness will not solve this. If it is an option, avoid Artifact binding and configure Redirect/POST binding instead.
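The lookup order described above (local cache, then cross-talk to the origin server, then CTS) can be sketched as follows. This is a simplified model, not AM source code: `ArtifactLookupSketch`, `Crosstalk` and the in-memory maps are hypothetical stand-ins, and the rule that the origin server ID is the last two characters of the message handle is an assumption made for illustration, based only on the 2-digit serverID note.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the lookup order; none of these names are AM classes.
final class ArtifactLookupSketch {

    enum Source { LOCAL_CACHE, CROSSTALK, CTS, NOT_FOUND }

    // Stand-in for the AM2 -> AM1 call; returns empty when the peer cannot answer.
    interface Crosstalk {
        Optional<String> fetch(String originServerId, String messageHandle);
    }

    static final class Result {
        final Source source;
        final String responseXml; // null when nothing was found

        Result(Source source, String responseXml) {
            this.source = source;
            this.responseXml = responseXml;
        }
    }

    private final String localServerId;
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    private final Map<String, String> cts; // stand-in for the shared token store
    private final Crosstalk crosstalk;

    ArtifactLookupSketch(String localServerId, Map<String, String> cts, Crosstalk crosstalk) {
        this.localServerId = localServerId;
        this.cts = cts;
        this.crosstalk = crosstalk;
    }

    void cacheLocally(String messageHandle, String responseXml) {
        localCache.put(messageHandle, responseXml);
    }

    // Assumption: the origin server ID is the last two characters of the handle.
    static String originServerId(String messageHandle) {
        int n = messageHandle.length();
        return n < 2 ? "" : messageHandle.substring(n - 2);
    }

    Result resolve(String messageHandle) {
        // 1. This server created the artifact and still holds the response.
        String cached = localCache.get(messageHandle);
        if (cached != null) {
            return new Result(Source.LOCAL_CACHE, cached);
        }
        // 2. Another server created it: ask that server directly.
        String origin = originServerId(messageHandle);
        if (!origin.isEmpty() && !origin.equals(localServerId)) {
            Optional<String> remote = crosstalk.fetch(origin, messageHandle);
            if (remote.isPresent()) {
                return new Result(Source.CROSSTALK, remote.get());
            }
        }
        // 3. Fall back to the shared store; this is the path where the error is seen.
        String stored = cts.get(messageHandle);
        if (stored != null) {
            return new Result(Source.CTS, stored);
        }
        return new Result(Source.NOT_FOUND, null);
    }
}
```

In this model, a failing cross-talk call (unroutable peer, untrusted certificate, wrong server ID) silently pushes the lookup into step 3, which matches the observation that the error may indicate a cross-talk problem.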
When SAML2 objects are created from XML, they are created as immutable. So if an object needs to be changed, it cannot be one that was recreated from XML.
Here the response is serialized as XML string and later that is why
The issue is that
So the response is immutable.
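The immutability behaviour can be illustrated with a toy model. This is a sketch under stated assumptions, not the AM implementation: `ImmutableResponseSketch` is a hypothetical class, the `Destination` attribute is used only as an example of a field someone might try to set, and the real exception type thrown by AM is not modelled here (an `IllegalStateException` stands in for it).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy model of a SAML2 response object; not an AM class.
final class ImmutableResponseSketch {
    private static final Pattern DESTINATION =
            Pattern.compile("Destination=\"([^\"]*)\"");

    private String destination;
    private final boolean mutable;

    private ImmutableResponseSketch(String destination, boolean mutable) {
        this.destination = destination;
        this.mutable = mutable;
    }

    // A freshly created object can still be modified.
    static ImmutableResponseSketch createNew() {
        return new ImmutableResponseSketch(null, true);
    }

    // An object rebuilt from its XML form is read-only.
    static ImmutableResponseSketch fromXml(String xml) {
        Matcher m = DESTINATION.matcher(xml);
        return new ImmutableResponseSketch(m.find() ? m.group(1) : null, false);
    }

    String getDestination() {
        return destination;
    }

    void setDestination(String destination) {
        if (!mutable) {
            throw new IllegalStateException("object is immutable");
        }
        this.destination = destination;
    }

    // Naive serialization without escaping; enough for the illustration.
    String toXml() {
        return destination == null
                ? "<Response/>"
                : "<Response Destination=\"" + destination + "\"/>";
    }
}
```

In this model, a response that is serialized with `toXml()`, stored, and later rebuilt with `fromXml()` keeps its values but rejects any setter call, which is the situation the failover path runs into.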
Possible fix: just save the response. As the response is IDP-generated (hopefully not external), we can save it to the CTS and reconstruct it directly. This will probably need
OPENAM-12770, as we will rely on it for this to work.