Uploaded image for project: 'OpenIDM'
  1. OpenIDM
  2. OPENIDM-12413

Multi-nodes clustered recon may fail with wrong situation

    Details

    • Target Version/s:
    • Verified Version/s:
    • Story Points:
      5
    • Sprint:
      OpenIDM Sprint 7.0-2

      Description

      When run 5 IDM nodes clustered recon create from DJ to IDM to reconcile 100K users, the recon may fail randomly. Recon status showed that

      20190116 07:10:34.181 - INFO - Recon state = {'_id': 'c2d9eb2b-020a-4736-bcd6-5e5db281b2c3-166', 'mapping': 'systemLdapAccounts_managedUser', 'state': 'FAILED', 'stage': 'COMPLETED_FAILED', 'stageDescription': 'reconciliation failed', 'progress': {'source': {'existing': {'processed': 178846, 'total': '178000'}}, 'target': {'existing': {'processed': 57355, 'total': '121673'}, 'created': 121673}, 'links': {'existing': {'processed': 57173, 'total': '?'}, 'created': 100000}}, 'situationSummary': {'SOURCE_IGNORED': 0, 'FOUND_ALREADY_LINKED': 0, 'UNQUALIFIED': 0, 'ABSENT': 121673, 'TARGET_IGNORED': 0, 'MISSING': 0, 'ALL_GONE': 0, 'UNASSIGNED': 182, 'AMBIGUOUS': 0, 'CONFIRMED': 57173, 'LINK_ONLY': 0, 'SOURCE_MISSING': 0, 'FOUND': 0}, 'statusSummary': {'SUCCESS': 179028, 'FAILURE': 0}, 'durationSummary': {'sourceQuery': {'min': 319, 'max': 1909, 'mean': 509, 'count': 178, 'sum': 90731, 'stdDev': 218}, 'targetIdPersistence': {'min': 2, 'max': 20, 'mean': 4, 'count': 178, 'sum': 748, 'stdDev': 2}, 'auditLog': {'min': 0, 'max': 40, 'mean': 0, 'count': 179026, 'sum': 34380, 'stdDev': 0}, 'onReconScript': {'min': 107, 'max': 107, 'mean': 107, 'count': 1, 'sum': 107, 'stdDev': 0}, 'targetQuery': {'min': 0, 'max': 955, 'mean': 955, 'count': 1, 'sum': 955, 'stdDev': 0}, 'postMappingScript': {'min': 0, 'max': 41, 'mean': 0, 'count': 178846, 'sum': 17943, 'stdDev': 0}, 'sourceObjectQuery': {'min': 0, 'max': 116, 'mean': 4, 'count': 847, 'sum': 3733, 'stdDev': 0}, 'sourceLinkQuery': {'min': 0, 'max': 103, 'mean': 0, 'count': 178000, 'sum': 130884, 'stdDev': 0}, 'defaultMappingScript': {'min': 0, 'max': 32, 'mean': 0, 'count': 178846, 'sum': 12422, 'stdDev': 0}, 'sourcePagePhase': {'min': 302, 'max': 6915, 'mean': 1804, 'count': 178, 'sum': 321254, 'stdDev': 228}, 'targetLinkQuery': {'min': 0, 'max': 9, 'mean': 0, 'count': 1033, 'sum': 795, 'stdDev': 0}, 'targetObjectQuery': {'min': 0, 'max': 107, 'mean': 2, 'count': 57173, 'sum': 154190, 'stdDev': 0}}, 'parameters': {'sourceQuery': {'resourceName': 'system/ldap/account', 'queryFilter': 'true'}, 'targetQuery': {'resourceName': 'managed/user', 'queryId': 'query-all-ids'}}, 'started': '2019-01-16T06:07:01.876Z', 'ended': '2019-01-16T06:10:32.037Z', 'duration': 210161, 'sourceProcessedByNode': {'node4': 37000, 'node5': 35000, 'node2': 33000, 'node3': 37846, 'node1': 36000}}
      

      Note that this test is only for recon create, IDM repo was cleaned before the recon, but the recon status indicated that IDM accessed some situation as CONFIRMED and UNASSIGNED somehow.

      one of the IDM node logs has many entries like this:

      [728] Jan 16, 2019 7:11:23.140 AM org.forgerock.openidm.sync.TargetSyncOperation assessSituation
      INFO: Situation in target reconciliation that indicates source may have changed CONFIRMED { "description": "This is the description for Faustine Dancy.", "employeeType": [  ], "telephoneNumber": "+1 905 038 7401", "cn": "Faustine Dancy", "dn": "uid=F-D-69800,ou=People,dc=example,dc=com", "uid": "F-D-69800", "mail": "F-D-69800@example.com", "objectClass": [ "person", "organizationalPerson", "inetOrgPerson", "top" ], "ldapGroups": [  ], "givenName": "Faustine", "sn": "Dancy", "kbaInfo": [  ], "aliasList": [  ], "_id": "eae62b8c-a54d-3037-a6b2-fd4fd637a424" } 7aea525e-bfb4-4a61-8409-38723d92102d { "_id": "59ec6cc5-3d4d-4ee7-b5b6-b7c959bf2e5c", "_rev": "0", "linkType": "systemLdapAccounts_managedUser", "linkQualifier": "default", "firstId": "eae62b8c-a54d-3037-a6b2-fd4fd637a424", "secondId": "7aea525e-bfb4-4a61-8409-38723d92102d" }
      [731] Jan 16, 2019 7:11:23.140 AM org.forgerock.openidm.sync.TargetSyncOperation assessSituation
      INFO: Situation in target reconciliation that indicates source may have changed CONFIRMED { "description": "This is the description for Sedigheh Longfield.", "employeeType": [  ], "telephoneNumber": "+1 861 863 9430", "cn": "Sedigheh Longfield", "dn": "uid=S-L-34012,ou=People,dc=example,dc=com", "uid": "S-L-34012", "mail": "S-L-34012@example.com", "objectClass": [ "person", "organizationalPerson", "inetOrgPerson", "top" ], "ldapGroups": [  ], "givenName": "Sedigheh", "sn": "Longfield", "kbaInfo": [  ], "aliasList": [  ], "_id": "fa90856e-fa3e-34dc-a06d-0cf7478a2930" } f0487963-3bf5-4cd6-a5d2-f3d2db3d9398 { "_id": "b8107ea9-b6fd-4ee7-a0ba-98360aa21ed2", "_rev": "1", "linkType": "systemLdapAccounts_managedUser", "linkQualifier": "default", "firstId": "fa90856e-fa3e-34dc-a06d-0cf7478a2930", "secondId": "f0487963-3bf5-4cd6-a5d2-f3d2db3d9398" }
      

      To reproduce it using Pyforge:
      Set up stress parameters in pyforge/config/config.cfg as

      [Stress]
      num_users = 100
      duration = 1000
      

      and use IDM 7.0.0 SNAPSHOT in OpenIDM section.
      and run command
      nohup python3 run-pybot.py -n -c stress -s idm.clustered_recon.ReconLDAPToManUserInCluster5Nodes OpenIDM

      and possibly see the symptom.
      It has been seen 3 times in about 15 runs.
      All the log files for IDM and DJ are put under folder /home/testuser/clustered_recon_issue on 172.16.205.43

        Attachments

          Issue Links

            Activity

              People

              • Assignee:
                dhogan Dirk Hogan
                Reporter:
                Tinghua.Xu Tinghua.Xu
              • Votes:
                0 Vote for this issue
                Watchers:
                4 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: