OpenIDM / OPENIDM-12249

Backport OPENIDM-12248: Data races in state shared across threads in recon

    Details

      Description

      Intermittently, about once a month for a test which runs daily, clustered recon will exhibit strange behavior. The test runs as follows:

      1. 100,000 users are reconciled in from DS into a bare IDM (sync with ldap bidirectional)
      2. the phone number of the 100,000 users is updated in DS
      3. another reconciliation from ldap->managed is triggered

      The error manifests itself with the following summary:

      The initial, 'create' recon:
      Reconciliation completed. SOURCE_IGNORED: 0 FOUND_ALREADY_LINKED: 0 UNQUALIFIED: 0 ABSENT: 100000 TARGET_IGNORED: 0 MISSING: 0 ALL_GONE: 0 UNASSIGNED: 0 AMBIGUOUS: 0 CONFIRMED: 0 LINK_ONLY: 0 SOURCE_MISSING: 0 FOUND: 0

      The subsequent, 'update' recon:
      INFO: Reconciliation completed. SOURCE_IGNORED: 0 FOUND_ALREADY_LINKED: 0 UNQUALIFIED: 0 ABSENT: 1 TARGET_IGNORED: 0 MISSING: 0 ALL_GONE: 0 UNASSIGNED: 1 AMBIGUOUS: 0 CONFIRMED: 99999 LINK_ONLY: 0 SOURCE_MISSING: 0 FOUND: 0
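A daily test could detect this failure mode by parsing the summary line rather than relying on error logs. The following is a minimal sketch, assuming the summary keeps the `NAME: count` format shown above; `ReconSummary`, `parse`, and `isCleanUpdate` are hypothetical names for illustration and are not part of IDM.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical test helper; not part of IDM. */
final class ReconSummary {
    // Matches pairs such as "CONFIRMED: 99999". A prefix like "INFO: Reconciliation"
    // is skipped because the text after the colon is not a number.
    private static final Pattern COUNT = Pattern.compile("([A-Z_]+): (\\d+)");

    /** Extracts the per-situation counts from one summary line, in order of appearance. */
    static Map<String, Integer> parse(String summaryLine) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = COUNT.matcher(summaryLine);
        while (m.find()) {
            counts.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return counts;
    }

    /** True when exactly expectedUsers records are CONFIRMED and every other count is zero. */
    static boolean isCleanUpdate(Map<String, Integer> counts, int expectedUsers) {
        int others = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (!e.getKey().equals("CONFIRMED")) {
                others += e.getValue();
            }
        }
        return counts.getOrDefault("CONFIRMED", 0) == expectedUsers && others == 0;
    }
}
```

Applied to the 'update' summary above, `isCleanUpdate(counts, 100000)` would return false, because ABSENT and UNASSIGNED are both 1 and CONFIRMED is 99999.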

      The logs contain no errors when this occurs.

      Initial analysis:

      There is no correlation query in the sync-with-ldap-bidirectional sample, so the presence or absence of a link alone determines whether a source object is assessed as ABSENT or CONFIRMED. In the initial recon, no link exists for any ldap source object, so the situation is ABSENT and the object is created in managed. In the subsequent recon, if the link cannot be found, the situation is again ABSENT in the source phase, which corresponds to UNASSIGNED for the managed object in the target phase. The 'update' summary (99999 CONFIRMED, 1 ABSENT, 1 UNASSIGNED) therefore does not indicate that 100,001 records were reconciled; rather, a single missing link record produces both the ABSENT situation in the source phase and the UNASSIGNED situation in the target phase.
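The counting argument can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions, not IDM's reconciliation code: links are modeled as a plain source-id to target-id map, only the three situations discussed here are represented, and `LinkOnlySituations` and `tally` are hypothetical names.

```java
import java.util.EnumMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified model of link-only situation assessment; not IDM's implementation. */
final class LinkOnlySituations {
    enum Situation { ABSENT, CONFIRMED, UNASSIGNED }

    static Map<Situation, Integer> tally(List<String> sourceIds,
                                         List<String> targetIds,
                                         Map<String, String> linkSourceToTarget) {
        Map<Situation, Integer> counts = new EnumMap<>(Situation.class);
        for (Situation s : Situation.values()) {
            counts.put(s, 0);
        }
        // Source phase: with no correlation query, a source without a link is ABSENT.
        for (String sourceId : sourceIds) {
            Situation s = linkSourceToTarget.containsKey(sourceId)
                    ? Situation.CONFIRMED : Situation.ABSENT;
            counts.merge(s, 1, Integer::sum);
        }
        // Target phase: a target that no link points to is UNASSIGNED.
        Set<String> linkedTargets = new HashSet<>(linkSourceToTarget.values());
        for (String targetId : targetIds) {
            if (!linkedTargets.contains(targetId)) {
                counts.merge(Situation.UNASSIGNED, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

With three sources, three targets, and only two links, the model reports two CONFIRMED, one ABSENT, and one UNASSIGNED: four situations for three records, which mirrors the 100,001 situations reported for 100,000 users.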

      See https://ea.forgerock.com/docs/idm/integrators-guide/index.html#source-reconciliation for a helpful table of situations.

      An open question is why there are no error logs: in the 'update' recon, the ABSENT situation should result in a create attempt, which should be rejected due to a policy violation (duplicate userName).

      Initial investigations revealed several memory visibility issues, prompting the filing of this JIRA. Note that no direct causal link has been established between these memory visibility issues and the state characterizing this intermittent failure.
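For readers unfamiliar with this class of bug, the sketch below contrasts a racy and a safe way to share state across recon worker threads. It is a generic illustration only: the ticket does not identify the affected fields, so `processed`, `canceled`, and the class names are hypothetical and do not describe IDM's code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Generic illustration of shared-state hazards; field names are hypothetical. */
final class SharedReconState {
    /** Racy variant, shown for contrast only. */
    static final class Unsafe {
        private long processed;   // ++ is a non-atomic read-modify-write; updates can be lost
        private boolean canceled; // a write is not guaranteed to become visible to other threads
        void recordProcessed() { processed++; }
        void cancel() { canceled = true; }
        boolean isCanceled() { return canceled; }
        long processed() { return processed; }
    }

    /** Safe variant: atomic counter plus a volatile flag. */
    static final class Safe {
        private final AtomicLong processed = new AtomicLong();
        private volatile boolean canceled;
        void recordProcessed() { processed.incrementAndGet(); }
        void cancel() { canceled = true; }
        boolean isCanceled() { return canceled; }
        long processed() { return processed.get(); }
    }

    /** Increments the safe counter from several threads and returns the final total. */
    static long runSafe(int threads, int perThread) {
        Safe state = new Safe();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    state.recordProcessed();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return state.processed();
    }
}
```

The `Unsafe` variant may appear to work in most runs and fail only rarely, which would be consistent with a failure seen about once a month, although, as noted above, causality has not been established.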

       

    People

      Assignee: Naren Koganti (naren.koganti)
      Reporter: Mark Offutt (mark.offutt, inactive)
      QA Assignee: Michal Orlik
      Votes: 0
      Watchers: 6
