Purging records from the replication changelog is unnecessarily expensive because we are using JE, which forces us to delete records one at a time. In addition, the current implementation can remove at most 5000 changes per second, which is not sufficient to keep up with constant heavy load. Furthermore, the JE-based implementation must maintain BTree inner nodes, run cleaner threads, store keys as well as values, etc., all of which consume resources.
Since a changelog is a very special-case DB that does not require much of the complexity provided by JE (random access, transaction support, etc.), we should be able to roll our own. We have the following requirements:
- update: write a change to the head of the log
- purge: remove a block of changes from the tail of the log
- random seek: performed when finding the initial position within the log during recovery or changelog browsing
- sequential read: performed once positioned during recovery or browsing, and always in the direction of the head of the log
We should optimize for update, purge, and sequential read. Random seek is performed less frequently, and we can tolerate latencies of hundreds of milliseconds for it if needed.
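To make the trade-off concrete, here is a minimal in-memory sketch of one possible shape for such a log, assuming it is split into fixed-capacity segments and that change numbers strictly increase. The class `SegmentedLog`, its methods, and the segment capacity are all hypothetical names chosen for illustration, not an existing API or the final design. It shows update as an append to the head segment, purge as dropping whole tail segments rather than individual records, and random seek as a lookup of the containing segment followed by a sequential read towards the head.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of a segmented, append-only change log. */
final class SegmentedLog {
    /** A change; the number is assumed to increase with every append. */
    static final class Change {
        final long number;
        final String payload;
        Change(long number, String payload) {
            this.number = number;
            this.payload = payload;
        }
    }

    private final int segmentCapacity;
    // Segments keyed by the number of their first change.
    // The first entry is the tail of the log, the last entry is the head.
    private final TreeMap<Long, List<Change>> segments = new TreeMap<>();
    private long lastNumber = Long.MIN_VALUE;

    SegmentedLog(int segmentCapacity) {
        if (segmentCapacity <= 0) {
            throw new IllegalArgumentException("segmentCapacity must be positive");
        }
        this.segmentCapacity = segmentCapacity;
    }

    /** update: append to the head segment, starting a new one when it is full. */
    void append(long number, String payload) {
        if (number <= lastNumber) {
            throw new IllegalArgumentException("change numbers must increase");
        }
        Map.Entry<Long, List<Change>> head = segments.lastEntry();
        List<Change> target;
        if (head == null || head.getValue().size() >= segmentCapacity) {
            target = new ArrayList<>(segmentCapacity);
            segments.put(number, target);
        } else {
            target = head.getValue();
        }
        target.add(new Change(number, payload));
        lastNumber = number;
    }

    /**
     * purge: drop whole tail segments whose changes are all older than the
     * given number. The head segment is never dropped. Returns the number of
     * changes removed.
     */
    int purgeBefore(long number) {
        int removed = 0;
        while (segments.size() > 1) {
            // Every change in the tail segment is older than the next segment's first change.
            long nextStart = segments.higherKey(segments.firstKey());
            if (nextStart > number) {
                break;
            }
            removed += segments.pollFirstEntry().getValue().size();
        }
        return removed;
    }

    /** random seek, then sequential read: all changes with number >= from, oldest first. */
    List<Change> readFrom(long from) {
        List<Change> result = new ArrayList<>();
        if (segments.isEmpty()) {
            return result;
        }
        // Seek: find the segment that could contain 'from', else start at the tail.
        Long start = segments.floorKey(from);
        if (start == null) {
            start = segments.firstKey();
        }
        // Sequential read towards the head; a linear scan skips older changes.
        for (List<Change> segment : segments.tailMap(start, true).values()) {
            for (Change change : segment) {
                if (change.number >= from) {
                    result.add(change);
                }
            }
        }
        return result;
    }
}
```

In this sketch the cost of a purge depends on the number of segments dropped, not on the number of changes they hold, at the price of purging only at segment granularity: a change older than the purge point stays until its whole segment qualifies. An on-disk variant would presumably map each segment to a file, so that purging becomes a file deletion, but that mapping is an assumption and is not shown here.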