Affects Version/s: 12.0.2, 12.0.3, 13.0.0
To facilitate large-scale horizontal scaling of CTS, per-session persistence should be implemented within CTS.
At present, active/passive or active/active (if stickiness can be ensured) are the recommended CTS OpenDJ topologies, where all sessions are created in a particular OpenDJ instance. Such topologies do not allow sessions to span multiple OpenDJ instances, thus limiting horizontal scaling. Furthermore, they introduce the need to vertically scale OpenDJ instances and to manage replication delay.
The proposal is to implement per-session persistence to OpenDJ whereby, for example, sessions 1, 3 and 5 are persisted to OpenDJ 1 and sessions 2, 4 and 6 to OpenDJ 2. All CRUD activities then hit the master OpenDJ instance for that individual session, with the other OpenDJ instances acting as failover only. For example, session 1 persists to OpenDJ 1; AM hits OpenDJ 1 for all activities related to that session and only interfaces with OpenDJ 2 for this session if OpenDJ 1 fails. AM hits OpenDJ 2 for all session 2 related activities, and so on.
This ensures replication is in place only for failover purposes, not for functional reasons, and allows massive horizontal scaling of the OpenDJ persistence layer with limited vertical scaling requirements.
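The routing described above could be sketched as follows. This is purely illustrative: the class and method names (`SessionRouter`, `primaryFor`, `routeFor`) are assumptions, not existing OpenAM APIs, and a real implementation would hang off the CTS connection layer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: each session's "master" OpenDJ instance is chosen by
// hashing the session token ID over an ordered server list; the remaining
// instances are contacted only if the master is unavailable.
final class SessionRouter {
    private final List<String> instances; // e.g. ["ldap://dj1", "ldap://dj2"]

    SessionRouter(List<String> instances) {
        this.instances = instances;
    }

    /** The master instance that receives all CRUD activity for this session. */
    String primaryFor(String tokenId) {
        return instances.get(Math.floorMod(tokenId.hashCode(), instances.size()));
    }

    /** Full failover order: the master first, then the rest in rotation. */
    List<String> routeFor(String tokenId) {
        int idx = Math.floorMod(tokenId.hashCode(), instances.size());
        List<String> order = new ArrayList<>();
        for (int i = 0; i < instances.size(); i++) {
            order.add(instances.get((idx + i) % instances.size()));
        }
        return order;
    }
}
```

With this shape, replication between instances exists only so that `routeFor` has somewhere to fall through to, matching the failover-only intent above.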
The following should be considered for the design and implementation of such a solution:
1. Should a modulus function be used to determine which OpenDJ instance a session is stored against, or perhaps an approach similar to network devices: round robin (or random) selection from a list of servers, then sticky thereafter based on an additional attribute in the token object?
2. The solution must cater for the expansion and contraction of the OpenDJ layer without the need for OpenAM restarts. AM should automatically target new OpenDJ instances as they appear and re-route CTS requests as instances disappear.
3. The solution should allow for priority listing to handle geographically dispersed implementations. For example, if OpenDJ 1 and 2 are local and 3 and 4 are remote in another data centre, OpenAM should hit 1 or 2 when available and fall back to 3/4 only if both 1 and 2 are unavailable.
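Points 1 and 2 interact: a plain modulus over the server count remaps almost every session whenever an instance is added or removed, whereas a consistent-hash ring moves only a small fraction of sessions, which is what makes runtime expansion and contraction without an OpenAM restart practical. A minimal sketch, assuming a simple `hashCode`-based ring (illustrative only, not an existing OpenAM component):

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical consistent-hash ring for OpenDJ instances. Adding or removing
// an instance only remaps the sessions that fell on its ring segments.
final class ConsistentRing {
    private static final int VNODES = 3; // virtual nodes smooth the distribution
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addInstance(String url) {
        for (int v = 0; v < VNODES; v++) {
            ring.put((url + "#" + v).hashCode(), url);
        }
    }

    void removeInstance(String url) {
        for (int v = 0; v < VNODES; v++) {
            ring.remove((url + "#" + v).hashCode());
        }
    }

    /** Walk clockwise from the token's hash to the next instance on the ring. */
    String instanceFor(String tokenId) {
        SortedMap<Integer, String> tail = ring.tailMap(tokenId.hashCode());
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }
}
```

A production version would use a stronger hash than `String.hashCode()` and more virtual nodes, but the remapping property is the same.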
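Point 3's priority listing could be modelled by tagging each configured instance with a priority (e.g. 0 for the local data centre, 1 for the remote one) and only trying lower-priority servers when every higher-priority one is down. The types and the availability check below are assumptions for illustration, not OpenAM or OpenDJ APIs:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical server entry: lower priority value = preferred data centre.
record PrioritizedServer(String url, int priority) {}

final class GeoSelector {
    /** Return the first available server in priority order. */
    static String select(List<PrioritizedServer> servers, Predicate<String> isAvailable) {
        List<PrioritizedServer> sorted = new ArrayList<>(servers);
        sorted.sort(Comparator.comparingInt(PrioritizedServer::priority));
        for (PrioritizedServer s : sorted) {
            if (isAvailable.test(s.url())) {
                return s.url();
            }
        }
        throw new IllegalStateException("No OpenDJ instance available");
    }
}
```

In the example from point 3, OpenDJ 1/2 would carry priority 0 and OpenDJ 3/4 priority 1, so remote instances are selected only when both local ones fail the availability check.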