As of the completion of
OPENIDM-2256, we have the generic ability to define the subset of data we want back from a given repo-based resource. This dramatically-reduces the need for creating new queryId entries in each of our repo files, and that is hugely valuable. However, for it to address the remaining needs for querying the repo, there also needs to be the ability to sort and page through these results.
The ForgeRock REST API for queries defines _pageSize as the standard parameter to be used to indicate paging. If specified, the code for handling queries against our repos should create appropriate paging commands when building the SQL statement, so that (for example with MySQL) a LIMIT clause would be included using the value from _pageSize.
When paging, there are two options for dealing the 'remaining' results (those that may be available after the first page):
When specified, this value should be used as the starting point for the next set of values. It is based on the last value from the previous result set; the next page will be purely defined based on those values which are "greater" than this "cookie" value. For this reason, it is very important that the result set be ordered - otherwise there would be no meaning implied by fetching the next page of data that is "greater" than some other value. The meaning of "greater" in this context is purely based on the ordering used in the set; "greater" values come after "lesser" values. When generating SQL for this option, the value for the "cookie" would be included in the WHERE clause.
This is a numeric value offset mechanism. Instead of basing the next page on whichever records have values greater than the previous page, it is instead based purely on the position of the records in the complete result set. Ordering is also important in this option, however it is not enough to ensure consistency because the records in the complete result set may have changed from one page request to the next.
In both of the above options, ordering is important; there are different use cases for each, however.
The _pagedResultsOffset option will be more desirable for direct user interaction, since it could support relatively-reliable paging when using arbitrary columns to sort by. In some circumstances, changing records could be "missed" as the user pages through the set, but for this use-case that is less important.
The _pagedResultsCookie option will be necessary to ensure consistency in the face of changing data, to ensure that records are not missed (say due to value in a field that is used to define a page changing, resulting in it being pushed into a previously-processed page). For this to be possible, care will need to be taken to ensure that the fields sorted upon are either unlikely to change (such as a primary key) or that if they change, they will be pushed onto the last page as a result of that change (such as when using a last_changed timestamp).
To specify the order of the result set, ForgeRock REST identifies a standard parameter called _sortKeys. Just as with the above paging parameters, we need to use this parameter when building our SQL statements, as input to the "ORDER BY" clause.
Very similar work to all of this has been done in Groovy, in samples/sample3/tools/SearchScript.groovy. That could be a good resource to refer to for implementation ideas.
Note that different database servers (PostgreSQL, MS SQL, MySQL, Oracle, etc...) do limits and offsets differently. Refer to the existing paging and sorting behavior that is implemented in each server's repo config as part of the named queries.