[go: nahoru, domu]

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

connection pooling for reproxy #1434

Merged
merged 29 commits into from
Oct 24, 2017
Merged

connection pooling for reproxy #1434

merged 29 commits into from
Oct 24, 2017

Conversation

i110
Copy link
Contributor
@i110 i110 commented Sep 28, 2017

Make it possible to use keep-alive connections for reproxy or mruby's http_request method

  • fix overall design and interfaces
  • use O(1) way to lookup existing target -> fix later when actual problems emerge
  • add configuration nobs and parameters (capacity, timeout, etc)
  • address race condition issues around pool->targets modification
  • location_rewrite_url ?
  • add tests

Copy link
Member
@kazuho kazuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! So you reused h2o_socketpool_target_t! That looks like a exellent decision!

I have left design comments. Please let me know what you think.

return NULL;
}

h2o_socketpool_target_t *h2o_socketpool_get_or_add_target(h2o_socketpool_t *pool, h2o_iovec_t host, uint16_t port, int is_ssl, h2o_url_t *url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Current design seems to expose a function that searches for or adds a target (i.e. get_or_add_target) and passes the address of the target to the function that actually connects to the server.

The approach seem to have two issues:

  • once allocated, pool->targets becomes unexpandable (or shrinkable), unless the mutex is locked from before the call to get_or_add_target to after the call to connect_to_target. Otherwise, we would have a race condition.
  • a target entry would be created for a server that always sends connection: close, which is IMO a waste

So you might want to consider the following:

  • do not expose get_or_add_target
  • accept the (host, port, is_ssl) triplet as an argument to connect_to_target
  • the function checks if a matching target exists, and if a target with a pooled socket found, returns the socket
  • otherwise, the socket pool connects to a new socket (note that you do not need to ensure that a matching target exists)
  • in h2o_socketpool_return, look for a matching target, and if not found, create a new target. Add to the target the socket being returned. You might need to pass the triplet as arguments to the function.

By using this approach, all locking operations can be kept within the socketpool.c. Also, pool->targets becomes not only expandable but also shrinkable (i.e. targets without any poold sockets can be garbage-collected).

include/h2o.h Outdated
@@ -415,6 +415,9 @@ struct st_h2o_globalconf_t {
* a boolean flag if set to true, instructs the proxy to emit a via header
*/
unsigned emit_via_header : 1;

int use_dynamic_socketpool : 1;
h2o_socketpool_t *dynamic_socketpool;
Copy link
Member
@kazuho kazuho Sep 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that there is no reason to delay the instantiation of the pool, now that it can hold any number of targets. Let's have it initialized in src/main.c.

@i110 i110 changed the title [WIP] Dynamic socketpool Dynamic socketpool Oct 4, 2017
@i110
Copy link
Contributor Author
i110 commented Oct 4, 2017

@kazuho I addressed the issues you pointed, and now this PR contains certain design changes around socketpool and http1client.

  • two types of socketpool
    • static socketpool - it's initialized with h2o_url_t's, and converts those into h2o_socketpool_target_t internally. No more changes of targets are permitted after once initialized
    • dynamic socketpool - it can be dynamically added new targets after initialization
  • http1client always (even when keepalive is disabled) connect to the peer using socketpool. if the socketpool argument is NULL, it uses default socketpool. By doing this, the implementation becomes very simple
  • the location_rewrite_url argument of h2o_socketpool_connect function is removed because (as far as I checked) it always equals the origin of selected target.

@i110
Copy link
Contributor Author
i110 commented Oct 4, 2017

cc: @deweerdt @zlm2012

@i110 i110 changed the title Dynamic socketpool [WIP] Dynamic socketpool Oct 4, 2017
@i110 i110 changed the title [WIP] Dynamic socketpool Dynamic socketpool Oct 5, 2017
/**
* initializes a socket loop
* initializes a static socket pool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using words: specific and global instead of (static and dynamic)?

Copy link
Member
@kazuho kazuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the changes.

I have left my ideas. Please let me make further suggestsons, I am sorry to say this but I have not yet spent enough time going through the PR.

lib/core/proxy.c Outdated
h2o_url_init(&upstream, req->scheme, req->authority, h2o_iovec_init(H2O_STRLIT("/")));
}

h2o_http1client_connect(&self->client, self, client_ctx, sockpool, &upstream, on_connect, te_chunked);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the PROXY protocol is being used (i.e. when overrides->use_proxy_protocol is set), the client needs to establish a new connection even when there is a pooled connection to the peer, since the header (as defined in https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt) needs to be sent at the beginning of the connection.

However, it seems that h2o_http1client_connect do not provide an interface to enforce estabilishing a new connection. In other words, there is a chance that we would use a pool connection here.

OTOH, I can understand that the probability of seeing such issue is rare; it would only happen if the same destination identified by its host:port is accessed in both ways (i.e. in one path with use_proxy_protocol set and in the other path without).

So I do not think that it is absolutely necessary to fix the issue.

However, I would appreciate it if you could at least leave a comment here that such possibility exists, possibly by copy-pasting this comment to the source code.

@@ -79,6 +79,46 @@ static int on_config_send_delegated_uri(h2o_configurator_command_t *cmd, h2o_con
return 0;
}

static h2o_iovec_t create_authority_by_hostport(h2o_mem_pool_t *pool, const h2o_url_scheme_t *scheme, h2o_iovec_t host,
uint16_t port)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding functions to url.h / url.c? It could be something like h2o_url_init_with_hostport and h2o_url_init_with_sunpath.

@@ -37,6 +37,7 @@ struct pool_entry_t {
h2o_socket_export_t sockinfo;
h2o_socketpool_target_t *target;
h2o_linklist_t link;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would appreciate it if you could rename link to all_link so that the roles of the two links (the other is target_link) become clearer.

int *tried;
size_t try_count;
} lb;
};
Copy link
Member
@kazuho kazuho Oct 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be changed to something like:

h2o_multithread_receiver_t *getaddr_receiver;
size_t selected_target;
struct {
    int *tried;
    size_t try_count;
} lb;

What I am suggesting here is that you only need to retain the selected index that designates the target within pool->targets. I believe that not using unions would reduce the impact on the code-base.

Note that the proposed struct removes lb.targets. I believe that it is possible; please see 5780763.

@zlm2012
Copy link
Contributor
zlm2012 commented Oct 11, 2017

@i110

the location_rewrite_url argument of h2o_socketpool_connect function is removed because (as far as I checked) it always equals the origin of selected target.

Sure. That is the exact meaning of that argument. I just didn't come out better naming of that argument and it was meant that location_rewrite_url should be rewritten if it appeared in the Location header responded from upstream.

And sorry for the delay, just come back from vacation :-)

@i110
Copy link
Contributor Author
i110 commented Oct 13, 2017

@kazuho addressed issues you pointed.

Copy link
Member
@kazuho kazuho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have left my comments.

While I believe that it might be possible to further adjust the design (to minimize the changes), I think the code is good enough that it can be merged and that we should merge this now, once after the issues left below are fixed.

src/main.c Outdated
{
h2o_socketpool_t *sockpool = h2o_mem_alloc(sizeof(*sockpool));
h2o_socketpool_init_global(sockpool, SIZE_MAX /* FIXME */);
h2o_socketpool_set_default_socketpool(sockpool);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we define the global socket pool as a global property of h2o_globalconf_t?

For libh2o, it would be beneficial to be able to release all sockets when a global configuration gets disposed (current approach does not do that). I believe that the code would become simpler, since we can eliminate the set_default_sockpool (and the get) function.

entry = H2O_STRUCT_FROM_MEMBER(struct pool_entry_t, link, pool->_shared.sockets.next);
h2o_linklist_unlink(&entry->link);

/* TODO is this needed in this critical section? */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. I think that you should move the call to lookup_target outside of the critical section.

@i110
Copy link
Contributor Author
i110 commented Oct 20, 2017

@kazuho done!

@kazuho
Copy link
Member
kazuho commented Oct 23, 2017

@i110 With my four commits (listed above) I think the PR is ready for merge. Please let me know what you think.

@i110
Copy link
Contributor Author
i110 commented Oct 23, 2017

@kazuho the code gets cleaner, thanks!

@kazuho kazuho merged commit e3119be into master Oct 24, 2017
kazuho added a commit that referenced this pull request Oct 24, 2017
@kazuho kazuho changed the title Dynamic socketpool connection pooling for reproxy Jan 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants