OPENAM-15068

dynamic configstores load then check schema on different DS servers

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Target Version/s:
    • Sprint:
      AM 2019.9 - Crane, AM 2019.10 - Cast Iron

      Description

      Bug description

      We have found that AM intermittently fails (roughly 70% of pipeline runs) to configure dynamic config stores on our Kubernetes clusters.

      On closer inspection we found that AM configures schema on the store (LDAP ADD) and then immediately checks whether the schema was applied (LDAP SEARCH). These two operations hit different LDAP servers despite affinity mode being enabled.

      The operations happen approximately 12 ms apart, which more often than not does not give DS enough time to replicate the change between our data centers.
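
      For illustration, here is a minimal sketch of the failing pattern using the public OpenDJ LDAP SDK (the hostnames, port and entry are stand-ins, and this is not the actual AM code): an ADD that lands on one replica, followed milliseconds later by a base-object SEARCH that is routed to a different replica which has not yet replayed the change.

      ```java
      import org.forgerock.opendj.ldap.Connection;
      import org.forgerock.opendj.ldap.EntryNotFoundException;
      import org.forgerock.opendj.ldap.LDAPConnectionFactory;
      import org.forgerock.opendj.ldap.SearchScope;
      import org.forgerock.opendj.ldap.requests.Requests;

      public class ReplicationRace {
          public static void main(String[] args) throws Exception {
              // Two connections standing in for the two pooled connections AM ends up
              // using; with working affinity both operations would go to one server.
              try (Connection writer = new LDAPConnectionFactory("fr-ds-dynamic-configstore-1", 1389).getConnection();
                   Connection reader = new LDAPConnectionFactory("fr-ds-dynamic-configstore-0", 1389).getConnection()) {

                  // The ADD succeeds on configstore-1.
                  writer.add(Requests.newAddRequest(
                          "dn: ou=services,ou=am-config",
                          "objectClass: organizationalUnit",
                          "ou: services"));

                  // ~12 ms later the verification SEARCH hits configstore-0, which has
                  // not replayed the change yet, and fails with result code 32.
                  try {
                      reader.searchSingleEntry("ou=services,ou=am-config",
                              SearchScope.BASE_OBJECT, "(objectClass=*)");
                  } catch (EntryNotFoundException e) {
                      System.out.println("No Such Entry: replication has not caught up");
                  }
              }
          }
      }
      ```

      The access log events below show exactly this sequence: the ADD completes on fr-ds-dynamic-configstore-1 at 16:24:34.168 and the SEARCH fails on fr-ds-dynamic-configstore-0 at 16:24:34.182.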

      Amster Error

       Failed to import... 500 Internal Server Error: Failed to create root realm for data store DataStoreId{id='oauth2 client store'}

      AM Exception

      {
       insertId: "1ne22eqftr47e7"
       jsonPayload: {
        X-B3-SpanId: "24"
        X-B3-TraceId: "http-nio-8080-exec-8"
        logger_name: "amSMSLdap"
        message: "SMSLdapObject.create() Error in creating: LdapSmsEntryUid{dn='ou=sunEntitlementService,ou=services,ou=am-config', dataStoreId=DataStoreId{id='oauth2 client store'}}"
        project-name: "AM"
        spanId: "24"
        stackTrace: "No Such Entry: Entry ou=sunEntitlementService,ou=services,ou=am-config cannot be added because its parent entry ou=services,ou=am-config does not exist in the server
      	at org.forgerock.opendj.ldap.LdapException.newLdapException(LdapException.java:246)
      	at org.forgerock.opendj.ldap.LdapClientImpl$Exchange.onNext(LdapClientImpl.java:702)
      	at org.forgerock.opendj.ldap.LdapClientImpl$Exchange.onNext(LdapClientImpl.java:616)
      	at org.forgerock.opendj.ldap.DemultiplexerImpl$DemultiplexedStream.tryOnNextFastPath(DemultiplexerImpl.java:432)
      	at org.forgerock.opendj.ldap.DemultiplexerImpl$DemultiplexedStream.onNextAndOptionallyComplete(DemultiplexerImpl.java:392)
      	at org.forgerock.opendj.ldap.DemultiplexerImpl.onNext(DemultiplexerImpl.java:162)
      	at io.reactivex.internal.operators.flowable.FlowableDoOnEach$DoOnEachSubscriber.onNext(FlowableDoOnEach.java:92)
      	at io.reactivex.internal.operators.flowable.FlowableOnErrorNext$OnErrorNextSubscriber.onNext(FlowableOnErrorNext.java:69)
      	at io.reactivex.internal.operators.flowable.FlowableFilter$FilterSubscriber.tryOnNext(FlowableFilter.java:74)
      	at io.reactivex.internal.operators.flowable.FlowableFilter$FilterSubscriber.onNext(FlowableFilter.java:52)
      	at io.reactivex.internal.operators.flowable.FlowableDoOnEach$DoOnEachConditionalSubscriber.onNext(FlowableDoOnEach.java:231)
      	at org.forgerock.opendj.grizzly.GrizzlyLdapSocketFilter$GrizzlyReader.handleRead(GrizzlyLdapSocketFilter.java:331)
      	at org.forgerock.opendj.grizzly.GrizzlyLdapSocketFilter.handleRead(GrizzlyLdapSocketFilter.java:135)
      	at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:119)
      	at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:284)
      	at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:201)
      	at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:133)
      	at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:112)
      	at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77)
      	at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:539)
      	at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112)
      	at org.glassfish.grizzly.strategies.SameThreadIOStrategy.executeIoEvent(SameThreadIOStrategy.java:103)
      	at org.glassfish.grizzly.strategies.AbstractIOStrategy.executeIoEvent(AbstractIOStrategy.java:89)
      	at org.glassfish.grizzly.nio.SelectorRunner.iterateKeyEvents(SelectorRunner.java:415)
      	at org.glassfish.grizzly.nio.SelectorRunner.iterateKeys(SelectorRunner.java:384)
      	at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:348)
      	at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:279)
      	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:593)
      	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:573)
      	at java.lang.Thread.run(Thread.java:748)
      "     thread_name: "Thread[http-nio-8080-exec-8,5,main]"     traceId: "http-nio-8080-exec-8"     transactionId: "TransactionId[08d67d21-1e8b-42c0-82a3-51b9ad649170-4284]"    } labels: {…}     receiveTimestamp: "2019-06-11T16:24:39.419372226Z"   resource: {…}   severity: "ERROR"   timestamp: "2019-06-11T16:24:34.192Z"  } 

      LDAP ADD Operation

      This is the LDAP access log event showing the successful ADD operation on fr-ds-dynamic-configstore-1.

       {
       insertId:  "opba58g2sjjvn2"  
       jsonPayload: {
        _eventId:  "88d5c898-cd22-46e2-a644-1357ff1f9a01-121"   
        client: {…}   
        eventName:  "DJ-LDAP"   
        level:  "INFO"   
        request: {
         connId:  2    
         dn:  "ou=services,ou=am-config"    
         msgId:  23    
         operation:  "ADD"    
         protocol:  "LDAP"    
        }
        response: {
         elapsedTime:  8    
         elapsedTimeUnits:  "MILLISECONDS"    
         status:  "SUCCESSFUL"    
         statusCode:  "0"    
        }
        server: {…}   
        source:  "audit"   
        timestamp:  "2019-06-11T16:24:34.168Z"   
        topic:  "ldap-access"   
        transactionId:  "88d5c898-cd22-46e2-a644-1357ff1f9a01-119"   
       }
       labels: {  
        container.googleapis.com/namespace_name:  "default"   
        container.googleapis.com/pod_name:  "fr-ds-dynamic-configstore-1"   
        container.googleapis.com/stream:  "stdout"   
       }
       receiveTimestamp:  "2019-06-11T16:24:38.763005976Z"  
       resource: {…}  
       severity:  "INFO"  
       timestamp:  "2019-06-11T16:24:34.168782619Z"  
      }

      LDAP SEARCH Operation

      This is the LDAP access log event for the subsequent failing SEARCH on fr-ds-dynamic-configstore-0, which returns result code 32 (noSuchObject):

      {
       insertId:  "z8wwdcg2sdofi0"  
       jsonPayload: {
        _eventId:  "bf4525b2-aef0-4bbf-946d-593232962881-53"   
        client: {
         ip:  "10.20.0.50"    
         port:  57436    
        }
        eventName:  "DJ-LDAP"   
        level:  "INFO"   
        request: {
         attrs: [10]    
         connId:  2    
         dn:  "ou=sunEntitlementService,ou=services,ou=am-config"    
         filter:  "(objectClass=*)"    
         msgId:  7    
         operation:  "SEARCH"    
         protocol:  "LDAP"    
         scope:  "base"    
        }
        response: {
         detail:  "The search base entry 'ou=sunEntitlementService,ou=services,ou=am-config' does not exist"    
         elapsedTime:  11    
         elapsedTimeUnits:  "MILLISECONDS"    
         nentries:  0    
         status:  "FAILED"    
         statusCode:  "32"    
        }
        server: {…}   
        source:  "audit"   
        timestamp:  "2019-06-11T16:24:34.182Z"   
        topic:  "ldap-access"   
        transactionId:  "bf4525b2-aef0-4bbf-946d-593232962881-51"   
       }
       labels: {
        container.googleapis.com/namespace_name:  "default"   
        container.googleapis.com/pod_name:  "fr-ds-dynamic-configstore-0"   
        container.googleapis.com/stream:  "stdout"   
       }  
       receiveTimestamp:  "2019-06-11T16:24:35.964577338Z"  
       resource: {…}  
       severity:  "INFO"  
       timestamp:  "2019-06-11T16:24:34.185588339Z"  
      } 

      How to reproduce the issue

      This is tough to reproduce in a typical dev-laptop environment. I suggest setting up a forgeops deployment, configuring multiple dynamic config stores, and then using something like WANEM to add latency to the network between the DS servers.

      Code analysis

      The two operations seem to be `loadSchema()` and `ensureRootRealmStructureIsPresent()` here:

      https://stash.forgerock.org/projects/OPENAM/repos/openam/browse/openam-core-rest/src/main/java/org/forgerock/openam/core/rest/sms/SmsGlobalDataStoreConfigProvider.java#180

      The code appears to use two different connection types, one for running the SEARCH and one for running the ADD, which would explain why affinity mode does not keep the two operations on the same server.
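
      One possible mitigation, sketched below purely as an illustration (the helper class and method names are hypothetical, not the actual AM code), is to retry the verification SEARCH with a short backoff so that a replication lag of a few tens of milliseconds no longer fails the whole configuration step. Alternatively, issuing the SEARCH over the same connection that performed the ADD would avoid depending on replication at all.

      ```java
      import java.util.concurrent.TimeUnit;
      import org.forgerock.opendj.ldap.Connection;
      import org.forgerock.opendj.ldap.EntryNotFoundException;
      import org.forgerock.opendj.ldap.SearchScope;

      /** Hypothetical helper: poll for an entry until replication catches up. */
      final class ReplicationAwareCheck {

          static boolean entryExists(Connection connection, String dn,
                                     int maxAttempts, long initialDelayMs)
                  throws Exception {
              for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                  try {
                      // Base-object search, mirroring the post-ADD verification.
                      connection.searchSingleEntry(dn, SearchScope.BASE_OBJECT,
                              "(objectClass=*)");
                      return true;
                  } catch (EntryNotFoundException e) {
                      if (attempt == maxAttempts) {
                          return false; // still absent after all retries
                      }
                      // Linear backoff: 50 ms, 100 ms, 150 ms, ...
                      TimeUnit.MILLISECONDS.sleep(initialDelayMs * attempt);
                  }
              }
              return false;
          }
      }
      ```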

       


      People

      • Assignee: Unassigned
      • Reporter: Simon Harding (simon.harding)
      • Votes: 1
      • Watchers: 8
