Uploaded image for project: 'OpenIDM'
  1. OpenIDM
  2. OPENIDM-13846

Incorrect mappings for relationships in repo.jdbc

    XMLWordPrintable

    Details

    • Support Ticket IDs:

      Description

      opendj.pqsql creates an explicit table for relationships.
       In spite of that, repo.jdbc defines a generic mapping for relationships:

      "relationships" : {
                      "mainTable" : "relationships",
                      "objectToColumn": {
                          "firstResourceCollection" : "firstresourcecollection",
                          "firstResourceId" : "firstresourceid",
                          "firstPropertyName" : "firstpropertyname",
                          "secondResourceCollection" : "secondresourcecollection",
                          "secondResourceId" : "secondresourceid",
                          "secondPropertyName" : "secondpropertyname"
                      }
                  },
      

      This is referenced by the relationship-related pre-defined queries for generic mappings:

      "find-relationships-for-resource" : "SELECT * FROM ${_dbSchema}.${_mainTable}...
      "find-relationship-edges" : "SELECT * FROM ${_dbSchema}.${_mainTable}...

      As a result, some IDM queries are translated into inefficient and slow SQL selects using json_extract_path on the full object, rather than referencing the columns directly, which has a huge performance impact.

      For example, a query like

      managed/user/<id>/roles _queryFilter=true

      results in the following SQL statement:

      SELECT obj.fullobject, obj.objectid, obj.rev, obj.firstresourcecollection, obj.firstresourceid, obj.firstpropertyname, obj.secondresourcecollection, obj.secondresourceid, obj.secondpropertyname FROM openidm.relationships obj INNER JOIN openidm.objecttypes objtype ON (objtype.id = obj.objecttypes_id AND objtype.objecttype = $1) WHERE ((((json_extract_path_text(obj.fullobject, $2) = ($3)) AND (json_extract_path_text(obj.fullobject, $4) = ($5)) AND (json_extract_path_text(obj.fullobject, $6) = ($7))) AND 1 = 1) OR (((json_extract_path_text(obj.fullobject, $8) = ($9)) AND (json_extract_path_text(obj.fullobject, $10) = ($11)) AND (json_extract_path_text(obj.fullobject, $12) = ($13))) AND 1 = 1)) ORDER BY json_extract_path_text(fullobject, $14) ASC LIMIT $15 OFFSET $16","parameters: $1 = 'relationships', $2 = 'firstResourceCollection', $3 = 'managed/user', $4 = 'firstResourceId', $5 = '8d3afcc5-60e3-4412-920c-476aba64e201', $6 = 'firstPropertyName', $7 = 'roles', $8 = 'secondResourceCollection', $9 = 'managed/user', $10 = 'secondResourceId', $11 = '8d3afcc5-60e3-4412-920c-476aba64e201', $12 = 'secondPropertyName', $13 = 'roles', $14 = '_id', $15 = '51', $16 = '0'
      

      Against a repo with several millions of users and relationships, the above takes about 80 seconds to return 1 result.
      On the other hand, querying

      managed/user/<id> _fields=roles

      returns the same result in 177ms and uses a SQL query that references columns directly, like this (using roles or authzRoles makes no difference):

       SELECT obj.fullobject, obj.objectid, obj.rev, obj.firstresourcecollection, obj.firstresourceid, obj.firstpropertyname, obj.secondresourcecollection, obj.secondresourceid, obj.secondpropertyname, obj.properties FROM openidm.relationships obj INNER JOIN openidm.objecttypes objtype ON (objtype.id = obj.objecttypes_id AND objtype.objecttype = 'relationships') WHERE (((obj.firstresourcecollection = ('managed/user')) AND obj.firstresourceid = ('579e4227-fd41-4e40-a648-6a8927465162')) AND obj.firstpropertyname = ('authzRoles'))) AND 1 = 1) OR ((obj.secondresourcecollection = ('managed/user')) AND obj.secondresourceid = ('579e4227-fd41-4e40-a648-6a8927465162')) AND obj.secondpropertyname = ('authzRoles'))) AND 1 = 1)) LIMIT '2147483647' OFFSET '0'

      This can be fixed by moving the relationship mappings to "explicitMapping" in repo.jdbc and making the necessary adjustments to the pre-defined queries. With this in place, both IDM queries translate to the more efficient direct mapping:

      "queries":{
      ...
      "explicitTables": {
       "find-relationships-for-resource" : "SELECT * FROM ${_dbSchema}.${_table} WHERE ...
       "find-relationship-edges" : "SELECT * FROM ${_dbSchema}.${_table} WHERE...
       }
      ...
      "resourceMapping": {
                  ...
      "explicitMapping":{
      "relationships" : {
                      "table" : "relationships",
                      "objectToColumn": {
                          "_id" : "objectid",
                          "_rev" : "rev",
                          "firstResourceCollection" : "firstresourcecollection",
                          "firstResourceId" : "firstresourceid",
                          "firstPropertyName" : "firstpropertyname",
                          "secondResourceCollection" : "secondresourcecollection",
                          "secondResourceId" : "secondresourceid",
                          "secondPropertyName" : "secondpropertyname"
                      }
                  },
      ...
      

      Since the Admin UI appears to make preferential use of querying properties as a collection, this may also be the reason behind slowness in other places of the UI.

        Attachments

          Activity

            People

            Assignee:
            matthias.grabiak Matthias Grabiak
            Reporter:
            tim.vogt Tim Vogt
            Votes:
            0 Vote for this issue
            Watchers:
            8 Start watching this issue

              Dates

              Created:
              Updated: