
Use Envoy instead of LVS to route internal federation traffic for WDQS
Closed, Invalid (Public)

Description

As discussed in T361950#9934152, we don't want to go through LVS for internal traffic. Solution proposed by @Vgutierrez is to use local envoy to do the routing / load balancing.

AC:

  • federation traffic in WDQS does not go through LVS

Event Timeline

Discussed this at pairing with @RKemper today, we were curious how to implement in envoy.

I'm far from being an envoy expert, but it looks like we could use envoy as a load balancer by defining a static envoy config similar to cluster.yaml.erb and listener.yaml.erb. So envoy definitely supports this, but it's not a feature in use anywhere at WMF as far as I can tell. Tagging serviceops to comment on whether or not configuring it this way is recommended.
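To make the idea concrete, here is a minimal sketch of what such a static envoy config might look like. All names here are assumptions for illustration: the listener port, cluster name, and backend hostnames (wdqs-scholarly100x.eqiad.wmnet) are hypothetical, not taken from the actual puppet templates.

```yaml
# Hypothetical static envoy config: a local listener that load balances
# federation traffic across the wdqs-scholarly blazegraph backends.
# Port numbers and hostnames are illustrative only.
static_resources:
  listeners:
  - name: wdqs_scholarly_federation
    address:
      socket_address: { address: 127.0.0.1, port_value: 6001 }
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: wdqs_scholarly
          route_config:
            virtual_hosts:
            - name: wdqs_scholarly
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route: { cluster: wdqs_scholarly }
          http_filters:
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: wdqs_scholarly
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: wdqs_scholarly
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address: { address: wdqs-scholarly1001.eqiad.wmnet, port_value: 80 }
        - endpoint:
            address:
              socket_address: { address: wdqs-scholarly1002.eqiad.wmnet, port_value: 80 }
```

The blazegraph http client would then point federation requests at 127.0.0.1:6001 and let envoy pick a healthy backend; note this sketch lists endpoints statically and does not address the pybal/etcd depooling question raised below.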

ServiceOps, most of the context you need is in the parent ticket. Thanks for looking!

In the post below I'm trying to aggregate info from the previous phab ticket and from in-meeting discussions, to give service-ops a jumping-off point.

(Some of these details might be wrong, but I think the main points are correct)

Request Routing

Previous way: user submits a query at query.wikidata.org -> ATS -> LVS -> nginx -> blazegraph

New way: (if a federated query) user submits a query at query-(main|scholarly).wikidata.org -> ATS -> LVS -> nginx -> blazegraph -> http client inside blazegraph reaches out to endpoint specified in federation request

(Non-federated queries are equivalent to the previous non-graph-split way of doing things so are irrelevant for our purposes here)

Throttling

Previous way: throttling done via token-bucket algorithm on 2-tuple of (IP addr, user agent)

New way: Still the same token-bucket algorithm, but we need to identify internal federation queries and disable throttling for them (throttling will still occur at the external level, i.e. at the first wdqs-(main|scholarly) endpoint contacted by the user). The current proposal is to use envoy for internal federation queries (i.e. those emanating from the blazegraph http client), but that raises some questions, most pressingly: is there a mechanism by which we can query etcd state, so that hosts depooled from pybal are similarly excluded from the envoy pool?

Questions
  • Is it feasible to have envoy query etcd state in order to decide pool members? (i.e. we would want to maintain the invariant that if a host is depooled in pybal it won't be present in envoy's pool). My present understanding is that sophroid (https://gitlab.wikimedia.org/repos/sre/sophroid) is intended to support this but isn't production-ready
  • If it's not feasible, are there any recommendations about how to best achieve our goals?

Note that the throttling question is orthogonal to the LVS vs Envoy question. As long as we are able to identify which requests are coming from the outside (internet) vs internal, we can put in place throttling exceptions.
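To illustrate the throttling exception described above, here is a hypothetical sketch of the scheme: a token bucket keyed on the (IP address, user agent) 2-tuple, with a bypass for requests identified as internal federation traffic. How internal requests are identified (e.g. a header stamped by envoy, or a source-IP check) is an assumption here, abstracted into an `is_internal` flag; the class and parameter names are illustrative, not the actual WDQS implementation.

```python
# Sketch of token-bucket throttling on (IP, user agent), with an
# exemption for internal federation requests. Illustrative only.
import time
from collections import defaultdict


class TokenBucket:
    def __init__(self, capacity: float, refill_rate: float):
        self.capacity = capacity        # max tokens the bucket can hold
        self.refill_rate = refill_rate  # tokens added per second
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        # Refill proportionally to elapsed time, then try to spend a token.
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Throttler:
    def __init__(self, capacity: float = 10, refill_rate: float = 1.0):
        # One bucket per (IP, user agent) pair, created on first use.
        self.buckets = defaultdict(lambda: TokenBucket(capacity, refill_rate))

    def allow(self, ip: str, user_agent: str, is_internal: bool) -> bool:
        # Internal federation requests bypass throttling entirely;
        # they were already throttled at the external entry point.
        if is_internal:
            return True
        return self.buckets[(ip, user_agent)].allow()
```

The key point is the early return for internal traffic: external requests pay the token cost once at the entry endpoint, and the downstream federated hop is exempt.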

It is not clear to me what solutions we have in place to route / load balance internal traffic. If LVS isn't the right solution for load balancing internal traffic, we need another one. I don't think that Data-Platform-SRE is the right team to put a new load balancing solution in place.

Whatever solution we use for internal load balancing should provide at least:

  • easy way to pool / depool individual nodes, manually or automatically based on health checks
  • redundancy: a failure of one of the load balancing nodes should not result in a complete loss of the service behind it
  • integrated operation for internal and external traffic routing: in this specific case, the Blazegraph pools are going to serve both internal and external traffic, so we need to ensure that depooling a node depools it for both. That could be achieved either by having a shared control plane or by having a single LB routing both internal and external traffic.
Gehel triaged this task as High priority. Jul 9 2024, 8:02 AM
Gehel moved this task from Incoming to Scratch on the Data-Platform-SRE board.

Note that we should not have traffic loops, at least not in the sense that I understand loops. There are multiple blazegraph pools with different datasets. Each pool will be able to federate with another pool, but should never federate with itself. As an example:

Queries to wdqs-main, federated with wdqs-scholarly:
query.wikidata.org -> ATS -> LVS -> nginx -> blazegraph (main) -> LVS -> nginx -> blazegraph (scholarly)

And a graph to hopefully make all this more clear:

wdqs-graph-split-traffic.png (695×768 px, 54 KB)

Gehel claimed this task.
Gehel moved this task from Backlog to Done on the Data-Platform-SRE (2024.07.08 - 2024.07.28) board.
Gehel added a subscriber: Joe.

After discussion with @Vgutierrez and @Joe: given that we have separate LVS pools for the different blazegraph pools, we don't create traffic loops, so we should be fine using LVS.

I'm closing this as invalid, feel free to re-open if I misunderstood something.