As an admin I would like to run a single command and immediately know of any problems affecting the health of my DS service.
We could build a tool or a subcommand for this purpose. It would exit with code 0 when there is no problem, or with a non-zero exit code when there is one, writing the problems it found to stderr.
For example, it could report on:
- change number computation blocked
- servers running low on disk space
- broken indexes
The tool will likely have to contact all the servers and check their alive/healthy endpoints (via LDAP or HTTP).
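
To make the proposal concrete, here is a minimal Python sketch of the HTTP variant only. It is an illustration under assumptions, not a design: the `/alive` and `/healthy` paths, the function names, and the convention that any status other than 200 counts as a problem are all hypothetical. It also assumes that conditions such as blocked change number computation, low disk space, or broken indexes would be surfaced by the server through its healthy endpoint.

```python
import sys
import urllib.error
import urllib.request

# Assumed endpoint paths; the real paths depend on the server configuration.
ENDPOINTS = ("/alive", "/healthy")


def http_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def check_servers(base_urls, fetch=http_status):
    """Probe every endpoint on every server and return a list of problem strings."""
    problems = []
    for base in base_urls:
        for path in ENDPOINTS:
            url = base.rstrip("/") + path
            status = fetch(url)
            if status is None:
                problems.append(f"{url}: unreachable")
            elif status != 200:
                problems.append(f"{url}: HTTP {status}")
    return problems


def main(argv, fetch=http_status):
    """Write each problem to stderr; return 0 if none were found, 1 otherwise."""
    problems = check_servers(argv, fetch)
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0


# To use it as a command: sys.exit(main(sys.argv[1:]))
```

The `fetch` parameter is injectable so that the reporting logic can be exercised without live servers. A real tool would also need authentication, TLS settings, and an LDAP code path.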