[OPENDJ-5682] Add a SharedConnectionPool to reduce number of connections Created: 09/Nov/18  Updated: 05/Nov/20  Resolved: 19/Mar/19

Status: Done
Project: OpenDJ
Component/s: core apis, proxy
Affects Version/s: 6.5.0
Fix Version/s: 7.0.0

Type: Improvement Priority: Major
Reporter: Yannick Lecaillez Assignee: Joseph de-Menditte
Resolution: Fixed Votes: 0
Labels: None

Attachments: PNG File searchrate_pool_size_10_throughput_comparison.png     File searchrate_pool_size_10_throughput_comparison.svg     PNG File searchrate_pool_size_15_throughput_comparison.png     File searchrate_pool_size_15_throughput_comparison.svg     PNG File searchrate_pool_size_20_throughput_comparison.png     File searchrate_pool_size_20_throughput_comparison.svg    
Issue Links:
Duplicate
duplicates OPENDJ-2861 Implement a "shared connection" conne... Done
Regression
caused OPENDJ-6121 OpenDJ QA: FakeServer and SDK no long... Done
Relates
relates to OPENDJ-5626 Investigate performance of authrate/s... Done
relates to OPENDJ-6107 performance regression with shared co... Done
is related to OPENDJ-7623 Remove deadlock-prone SharedConnectio... QA Backlog
Sub-Tasks:
Key
Summary
Type
Status
Assignee
OPENDJ-5773 QA Task Sub-task Closed Ondrej Fuchsik  
Epic Link: Common Repo
Story Points: 16
Dev Assignee: Joseph de-Menditte
QA Assignee: Ondrej Fuchsik

 Description   

Now that the Load-Balancing infrastructure in the SDK performs the balancing on a per-request basis, we open a connection for each request and close it once the request is done.
To optimize this we already have a CachedConnectionPool, which keeps a set of connections open so that requests can be sent without creating a new connection for each request.

Still, because of the connect -> request -> close programming model, we end up sending only one request per connection at a time. Sending multiple concurrent requests to the same server therefore requires the same number of connections to that server.

This has some significant drawbacks:

  • A significant number of connections is required to take advantage of the parallel processing done by the server, and this number of connections creates processing overhead on both the client and the server side.
  • Because each connection carries only one request at a time, only small TCP packets are exchanged on the network.

To solve these problems we propose to create a SharedConnectionPool, which acts like a CachedConnectionPool with one major difference: connect() does not remove the connection from the pool. That is, subsequent invocations of connect() may return the same connection instance multiple times.
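To illustrate the semantics described above, here is a minimal sketch of such a pool. It uses a placeholder generic connection type rather than the real SDK interfaces, and hard-codes round-robin selection; the key point is that connect() hands out a pooled connection without removing it, so the same instance can be shared by many concurrent callers.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch only: C stands in for the SDK's connection type.
public final class SharedConnectionPool<C> {
    private final List<C> connections;   // fixed set, opened up-front
    private final AtomicInteger next = new AtomicInteger();

    public SharedConnectionPool(List<C> connections) {
        this.connections = List.copyOf(connections);
    }

    // Round-robin selection; the connection stays in the pool,
    // so repeated calls may return the same instance.
    public C connect() {
        int i = Math.floorMod(next.getAndIncrement(), connections.size());
        return connections.get(i);
    }
}
```

With a pool of 4 connections, 128 connect() calls simply cycle through the same 4 instances, each carrying ~32 concurrent requests, instead of opening 128 connections.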

It might be interesting to test different algorithms for how the connections get balanced and measure the impact (latency and throughput):

  • Round-robin
  • Least-request: the connection returned is the one with the fewest concurrent requests.
  • Leaky bucket: maximize connection usage by returning the same connection until it reaches a pre-configured number of concurrent requests.
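The least-request strategy from the list above can be sketched as follows. This is again a hedged illustration with a placeholder connection type, not the eventual implementation: each slot tracks its in-flight request count, and acquire() returns the least-loaded slot.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of "least-request" selection; C stands in for a connection type.
public final class LeastRequestSelector<C> {
    // Pairs a connection with its concurrent-request counter.
    public static final class Slot<C> {
        public final C connection;
        final AtomicInteger inFlight = new AtomicInteger();
        Slot(C connection) { this.connection = connection; }
    }

    private final List<Slot<C>> slots;

    public LeastRequestSelector(List<C> connections) {
        this.slots = connections.stream().map(Slot<C>::new).toList();
    }

    // Pick the slot with the fewest in-flight requests and reserve it.
    public Slot<C> acquire() {
        Slot<C> best = slots.get(0);
        for (Slot<C> s : slots) {
            if (s.inFlight.get() < best.inFlight.get()) {
                best = s;
            }
        }
        best.inFlight.incrementAndGet();
        return best;
    }

    // Callers must release once the response has been received.
    public void release(Slot<C> slot) {
        slot.inFlight.decrementAndGet();
    }
}
```

Round-robin is cheaper per call, but least-request adapts when some requests (e.g. large searches) stay in flight much longer than others.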


 Comments   
Comment by Ludovic Poitou [ 09/Nov/18 ]

You might want to also evaluate whether the new SharedConnectionPool is suitable for services that authenticate clients using Bind and then send requests on behalf of the different users.

Comment by Yannick Lecaillez [ 09/Nov/18 ]

I don't think it will be possible to forward Bind requests with such a pool: a Bind request needs exclusive access to the underlying connection.
Sending requests on behalf of a user could be done using Proxy-Authorization, as it is done today.

Comment by Matthew Swift [ 10/Nov/18 ]

This RFE duplicates OPENDJ-2861.

Comment by Yannick Lecaillez [ 12/Nov/18 ]

Here is a searchrate result against DJ 7.0-SNAPSHOT with 128 requests in parallel.

Using 1 connection per request:

$ ./bin/searchrate -p 1389 -D "cn=directory manager" -w password -B 10    -F -c 128 -t 1 -b "uid=user.{},ou=people,dc=example,dc=com" -g "rand(0,100000)" "(objectclass=*)"
Warming up for 10 seconds...
--------------------------------------------------------------------------------------------
|     Throughput    |                 Response Time                |       Additional      | 
|    (ops/second)   |                (milliseconds)                |       Statistics      | 
|   recent  average |   recent  average    99.9%   99.99%  99.999% |  err/sec Entries/Srch | 
--------------------------------------------------------------------------------------------
|  80769.6  80769.6 |    1.581    1.581    10.88    60.82   206.57 |      0.0          1.0 | 
|  79886.4  80328.1 |    1.597    1.589    10.49    18.48   200.28 |      0.0          1.0 | 
|  79325.2  79993.8 |    1.609    1.596    10.03    16.25   192.94 |      0.0          1.0 | 
|  80111.6  80023.2 |    1.593    1.595     9.76    15.40   185.60 |      0.0          1.0 | 
|  80036.8  80026.0 |    1.595    1.595     9.63    15.14   180.36 |      0.0          1.0 | 

Using 4 connections and 32 concurrent requests per connection:

$ ./bin/searchrate -p 1389 -D "cn=directory manager" -w password -B 10    -F -c 4 -t 32 -b "uid=user.{},ou=people,dc=example,dc=com" -g "rand(0,100000)" "(objectclass=*)"
Warming up for 10 seconds...
--------------------------------------------------------------------------------------------
|     Throughput    |                 Response Time                |       Additional      | 
|    (ops/second)   |                (milliseconds)                |       Statistics      | 
|   recent  average |   recent  average    99.9%   99.99%  99.999% |  err/sec Entries/Srch | 
--------------------------------------------------------------------------------------------
| 100092.8 100092.8 |    1.275    1.275    14.94    23.59    38.54 |      0.0          1.0 | 
|  95406.6  97749.7 |    1.337    1.305    14.61    20.32    37.49 |      0.0          1.0 | 
| 101862.6  99120.7 |    1.252    1.287    14.61    18.87    36.44 |      0.0          1.0 | 
| 101006.6  99592.2 |    1.263    1.281    14.68    18.87    34.87 |      0.0          1.0 | 
|  99650.6  99603.8 |    1.279    1.280    14.55    18.48    34.34 |      0.0          1.0 | 

That's roughly 24% better throughput, together with lower latency.
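The figure quoted above can be checked from the steady-state "average" columns of the two runs (80026.0 vs 99603.8 ops/second, 1.595 ms vs 1.280 ms):

```java
// Quick arithmetic check of the improvement, using the last
// "average" row of each searchrate run above.
public final class ThroughputDelta {
    public static void main(String[] args) {
        double pooledOps = 80026.0;   // 1 connection per request
        double sharedOps = 99603.8;   // 4 connections x 32 requests
        double opsGain = (sharedOps - pooledOps) / pooledOps * 100;
        System.out.printf("throughput: +%.1f%%%n", opsGain);   // ~24.5%

        double pooledMs = 1.595;
        double sharedMs = 1.280;
        double latencyGain = (pooledMs - sharedMs) / pooledMs * 100;
        System.out.printf("latency: -%.1f%%%n", latencyGain);  // ~19.7%
    }
}
```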

Comment by Jean-Noël Rouvignac [ 13/Nov/18 ]

Because the SharedConnectionPool can only be used for stateless operations (and not binds), maybe we should name it StatelessConnectionPool?

Comment by Matthew Swift [ 14/Nov/18 ]

I think that this is a reasonable RFE. A shared/stateless connection pool would be useful elsewhere as well: Rest2Ldap (IDM), AM, DSML, PTA, as well as the proxy. However, care should be taken to only use this pool implementation for non-bind requests. In other words, most proxy/gateway applications will need to use two pool implementations: the existing cached connection pool for authentication and the shared connection pool for all other operations.

Having said that, it would be cool if we could have a means of relaxing the constraint on bind requests. For example, by having a control indicating that the bind request should not change the connection state. This would allow multiple binds to be performed in parallel. Another option is to piggy-back bind-like behavior on top of compare requests, so that compare operations update the password-policy state. A control for bind requests seems like the best fit though, IMO.

Comment by Yannick Lecaillez [ 14/Nov/18 ]

I like the idea of the control: IIRC, the list of controls supported by a server can be queried, which means we won't have to add another setting.

Note that the fact that the pool cannot be shared between "standard" requests and bind requests is already the case for all the existing load-balancer implementations we have.
So maybe this is more a Shared/Stateless ConnectionLoadBalancer than a Pool?

Comment by Matthew Swift [ 14/Nov/18 ]

That's right. The control I'm proposing would allow us to use the same connection pool for all requests, thereby reducing the total number of connections and their related keep-alive pings.

Comment by Yannick Lecaillez [ 14/Nov/18 ]

All but password-modify, IIRC? Isn't there a problem with password modify and proxy-authz? If I remember correctly, it requires a connection that has been bound as the appropriate user.

It should be possible to do something about it, but we would probably need one meta-request wrapping both the Bind and the PasswordModify extended operation.

Comment by Matthew Swift [ 14/Nov/18 ]

Ludo fixed proxying of password modify requests recently. See OPENDJ-4992.

Comment by Joseph de-Menditte [ 05/Dec/18 ]

Matthew Swift What's the plan for this feature? Do we go with the simple implementation suggested by Yannick Lecaillez (keeping two pools: one for the bind requests and one for the other requests, as it is implemented today), or do we want to implement the control in order to use one pool for all the requests?

Comment by Matthew Swift [ 06/Dec/18 ]

I would go with the simple implementation to begin with, which requires two pools in the proxy. The control could be the subject of a separate RFE. In particular, the shared pool can be used immediately in AM.

Comment by Ondrej Fuchsik [ 12/Mar/19 ]

Matthew Swift I still see one PR open, but it is probably a PR not meant to be merged. Can you confirm, please? Otherwise I think this can be closed, as all tasks are done.

Comment by Jean-Noël Rouvignac [ 12/Mar/19 ]

Ondrej Fuchsik yes you can ignore it.

Comment by Ondrej Fuchsik [ 19/Mar/19 ]

All sub-tasks are finished. There is a perf issue linked to this improvement.

Comment by Ondrej Fuchsik [ 19/Mar/19 ]

All sub-tasks done, so closing the issue.

Generated at Sun Nov 29 16:30:49 UTC 2020 using Jira 7.13.12#713012-sha1:6e07c38070d5191bbf7353952ed38f111754533a.