  1. OpenAM
  2. OPENAM-10874

Adding a new server (site) requires a Restart on all the old servers

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 13.5.0
    • Fix Version/s: 13.5.2
    • Component/s: session
    • Labels:
    • Environment:
      AWS Elastic provision w/o sticky
    • Sprint:
      AM Sustaining Sprint 45
    • Story Points:
      3
    • Needs backport:
      No
    • Support Ticket IDs:
    • Verified Version/s:
    • Needs QA verification:
      Yes
    • Functional tests:
      No
    • Are the reproduction steps defined?:
      Yes but I used my own steps. (If so, please add them in a new comment)

      Description

      NOTE: This bug applies only to 13.5.x and earlier versions, as AM 5 onward does not use session crosstalk.

      Background context
      (OPENAM-480 may not have addressed this.)
      The setup is to add multiple new OpenAM instances, expecting sessions to be highly available (HA) along with everything else. Say we create an OpenAM server with an external configuration, placed in a site named lb. This can be scripted with a configurator when installing OpenAM, as:

      LB_SITE_NAME=lb
      LB_PRIMARY_URL=http://lb/openam
      LB_SESSION_HA_SFO=true
      

      Once this is done, a second OpenAM server (02) is created. We now have two server instances, and all sessions are also going to the CTS. Let us assume that we also ran an ssoadm configuration to set these instances up for session HA, and that the CTS is working fine.

      Before creating a third server, perform a few authentications and do some cross-talk. (This is an important step. We will come back to it, but it mimics real life, where servers have been running for some time.)

      Now create a third instance, 03, using the same procedure. You can then try creating sessions and accessing them on the other servers.

      Problem
      The problem is not easily seen because it is masked by the OpenAM code: when a session (say, one created on the new instance 03) is requested on 01, the code tries to do a cross-talk (x-talk). Ideally you would expect the request to be routed to 03, but in reality the code uses the session's "sid", and locateCurrentHostServer does not find server 03. So the next available server, 02, is chosen.

      The issue is that the request is x-talked to 02 instead of 03 (where the session lives, and which has only just started). No doubt, as a failsafe, once the request is routed to 02, that server will likely end up going to the CTS.
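
      The misrouting described above can be modelled with a minimal sketch. This is not OpenAM code: the class ClusterView, the method locateHost, and the "fall back to the next known server" policy are assumptions made only to illustrate how a membership snapshot taken before 03 joined sends the request to 02.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

// Hypothetical model, not OpenAM code: each server keeps its own
// snapshot of the site's members, taken when the view is built.
final class StaleClusterViewSketch {

    /** One server's view of the site membership. */
    static final class ClusterView {
        private final TreeSet<String> knownServers;

        ClusterView(Collection<String> membersAtInit) {
            this.knownServers = new TreeSet<>(membersAtInit);
        }

        /**
         * Returns the session's home server if this view knows it.
         * Otherwise falls back to the first known server that is not
         * the local one (assumed policy, for illustration only).
         */
        String locateHost(String localServer, String homeServer) {
            if (knownServers.contains(homeServer)) {
                return homeServer;
            }
            for (String candidate : knownServers) {
                if (!candidate.equals(localServer)) {
                    return candidate;
                }
            }
            return localServer;
        }
    }

    public static void main(String[] args) {
        // Server 01 built its view while only 01 and 02 existed.
        ClusterView staleViewOn01 = new ClusterView(Arrays.asList("01", "02"));
        // A view rebuilt after 03 joined, for example after a restart.
        ClusterView freshViewOn01 = new ClusterView(Arrays.asList("01", "02", "03"));

        // A session created on 03 is requested on 01.
        System.out.println(staleViewOn01.locateHost("01", "03")); // prints 02
        System.out.println(freshViewOn01.locateHost("01", "03")); // prints 03
    }
}
```

      In this model the only way for 01 to route correctly is to rebuild its view, which matches the observation that the old servers need a restart.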

      In the user's case, the issue shows up when the CTS is not operating and the request is routed to this instance (02).

      In short, it seems that when a new instance is created and added, the Naming service reflects the new instance and the other servers know about it, but the "ClusterStateService" on the old servers does not.

      Investigation Summary
      1. SessionServiceConfig, as the name implies, only checks for changes that happen in the global session service. However, when a new server instance is added to an existing site that already has session failover set up, the change comes from the platform service rather than the session service, so the session doesn't get stored in the CTS.

      2. SessionService doesn't listen for changes on the platform service, so the SessionService.isClusterMonitorValid() method always returns true once it has been initialized.

      The bottom line is that a restart is required.
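
      The effect of the missing listener can be illustrated with a small sketch. It is not the OpenAM implementation or the actual fix: ConfigBus, MonitorHolder, and the service keys "session" and "platform" are hypothetical stand-ins; only the method name isClusterMonitorValid is borrowed from the finding above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not OpenAM code.
final class ConfigListenerSketch {

    /** Tiny stand-in for a configuration change bus keyed by service name. */
    static final class ConfigBus {
        private final Map<String, List<Runnable>> listeners = new HashMap<>();

        void addListener(String service, Runnable listener) {
            listeners.computeIfAbsent(service, k -> new ArrayList<>()).add(listener);
        }

        void fireChange(String service) {
            List<Runnable> registered =
                    listeners.getOrDefault(service, Collections.<Runnable>emptyList());
            for (Runnable listener : registered) {
                listener.run();
            }
        }
    }

    /** Caches a "monitor is valid" flag that only registered listeners can clear. */
    static final class MonitorHolder {
        private boolean valid = true;

        MonitorHolder(ConfigBus bus, String... watchedServices) {
            for (String service : watchedServices) {
                bus.addListener(service, () -> valid = false);
            }
        }

        boolean isClusterMonitorValid() {
            return valid;
        }
    }

    public static void main(String[] args) {
        ConfigBus bus = new ConfigBus();
        // Mirrors the reported behaviour: only session service changes are watched.
        MonitorHolder sessionOnly = new MonitorHolder(bus, "session");
        // Assumed alternative: platform service changes are watched as well.
        MonitorHolder both = new MonitorHolder(bus, "session", "platform");

        // Adding a server to a site arrives as a platform service change.
        bus.fireChange("platform");

        System.out.println(sessionOnly.isClusterMonitorValid()); // prints true (stale)
        System.out.println(both.isClusterMonitorValid());        // prints false
    }
}
```

      The holder that watches only "session" keeps reporting a valid monitor after the membership change, which is the stale state described in finding 2.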

      *To simplify testing*
      You can probably do the following:
      a) Install two OpenAM servers with session HA enabled, in a site.
      b) Restart them as you like.
      c) Force session creation and x-talk so that the MultiServerClusterMonitor gets initialized.
      d) Install the third OpenAM server.
      e) Debug the ClusterStateService and notice that its state never changes.
      f) In fact, if the OPENAM-480 statement "Adding a server to a site requires restart of OpenAM"
      is true, then when you manipulate the sites/servers you should at least
      see the "ClusterStateService" on every server get updated to reflect the membership.

        Attachments

        1. 13.5.0.png (14 kB)
        2. 13.5.2-M11.png (11 kB)

              People

              • Assignee:
                sachiko Sachiko Wallace
                Reporter:
                chee-weng.chea C-Weng C