
Cleanup revision table schema
Open, Low, Public

Description

  1. ALTERs to run: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/924468/9/maintenance/archives/patch-revision-cleanup.sql (an illustrative sketch of this kind of statement follows below)
  2. Where to run those changes: all
  3. When to run those changes: When DBA has time
  4. If the schema change is backwards compatible: Yes.
  5. If the schema change has been tested already on some of the test/beta wikis: Yes, it is stable on the beta cluster
  6. If the data should be made available on the labs replicas and/or dumps: No changes needed
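The exact statements are in the linked patch-revision-cleanup.sql; purely as an illustration of the general shape of such a cleanup (the column change below is a hypothetical placeholder, not the actual patch contents), it is an ALTER against the revision table along these lines:

-- Illustrative sketch only; the real DDL is in the linked patch.
-- The column change shown here is a hypothetical placeholder.
ALTER TABLE revision
  MODIFY rev_minor_edit TINYINT UNSIGNED NOT NULL DEFAULT 0;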

Progress

  • s1 (takes around 26h)
    • eqiad
    • codfw
  • s2
    • eqiad
    • codfw
  • s3
    • eqiad
    • codfw
  • s4 (takes around 14h)
    • eqiad
    • codfw
  • s5
    • eqiad
    • codfw
  • s6
    • eqiad
    • codfw
  • s7
    • eqiad
    • codfw
  • s8 (takes around 48h)
    • eqiad
    • codfw (running)
  • labtestwiki

Event Timeline

Mentioned in SAL (#wikimedia-operations) [2024-08-10T08:55:06Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1234 (T367856)', diff saved to https://phabricator.wikimedia.org/P67263 and previous config saved to /var/cache/conftool/dbconfig/20240810-085505-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-10T08:55:28Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1235 (T367856)', diff saved to https://phabricator.wikimedia.org/P67264 and previous config saved to /var/cache/conftool/dbconfig/20240810-085527-marostegui.json

@Liz, I know you use a couple of database reports to do some important daily tasks on enwiki. What are the links to those reports, please?

If I understand it correctly, there are basically 2 sanitized replicas (analytics and web), and only 1 of them (analytics) currently has replag. Perhaps someone can generate a one-time report for you using the one without replag (web), so that you can start catching up on your daily tasks.
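For anyone who wants to check the lag themselves, the replicas expose it through the heartbeat_p view, if I recall the name correctly; something like the following, run against the analytics service and the web service separately, should show the difference (s1 is the enwiki section):

-- Replication lag in seconds for the s1 (enwiki) section on a wiki replica.
-- heartbeat_p.heartbeat is the view I believe is exposed for this; please verify.
SELECT lag
FROM heartbeat_p.heartbeat
WHERE shard = 's1';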

@Novem_Linguae

Well, I have three sources for reports that I use as an administrator. One is bot reports, like those from AnomieBOT III. Then I run some queries on Quarry. A few database reports I make use of are Wikipedia:Database reports/Orphaned talk pages and Wikipedia:Database reports/Empty categories. You can find the list at Wikipedia:Database reports.

But I haven't been able to receive up-to-date reports using any of these methods since a week ago, August 3rd. That was, apparently, 180 hours ago. So, I assume there will be quite a backlog of errors to fix when this replag ends.

Hopefully, this will be soon but I don't see any updates about it.

At the top of this page it says "s1 (takes around 26h)". Is that referring to this activity that has taken 183 hours so far? If so, I presume someone will change that note quite significantly once this is over, so that we know for next time.

I've been looking at that note for 7 days now. Hopefully, there won't be a "next time". Or is that wishful thinking?

At the top of this page it says "s1 (takes around 26h)". Is that referring to this activity that has taken 183 hours so far? If so, I presume someone will change that note quite significantly once this is over, so that we know for next time.

The 26h figure is quite accurate for the production replicas, but it does not hold for cloud, for two reasons.

  1. The change has to go down the chain from prod replica to sanitarium master to cloud replica, which roughly triples the overall time it takes.
  2. Applying the change to the sanitarium master and to the cloud replica takes even longer because they are not being depooled and are under full read load. They also cover multiple shards, so while they do not get any writes for s1, they are under full write load for the other shards. (A sketch of how a depooled, direct alter differs is below.)
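To make the difference concrete: a depooled production replica is typically altered directly, and in the common DBA pattern the session disables binary logging so the statement does not replicate further, whereas sanitarium and the cloud replicas receive the statement through replication and apply it while serving their normal load. A rough sketch of the direct variant (a general pattern, not necessarily the exact procedure used here; the column change is a placeholder):

-- Run directly on a depooled replica; with binlogging off for the session,
-- the ALTER does not propagate to anything replicating from this host.
-- General DBA pattern, not necessarily the exact procedure used for this task.
SET SESSION sql_log_bin = 0;
ALTER TABLE revision
  MODIFY rev_minor_edit TINYINT UNSIGNED NOT NULL DEFAULT 0;  -- placeholder change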

I've been looking at that note for 7 days now. Hopefully, there won't be a "next time". Or is that wishful thinking?

Once this change is applied, it will not happen again, although in general there will of course be other DB maintenance in the future.

Mentioned in SAL (#wikimedia-operations) [2024-08-11T07:13:18Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1235 (T367856)', diff saved to https://phabricator.wikimedia.org/P67266 and previous config saved to /var/cache/conftool/dbconfig/20240811-071318-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-11T07:58:40Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1235 (T367856)', diff saved to https://phabricator.wikimedia.org/P67269 and previous config saved to /var/cache/conftool/dbconfig/20240811-075839-marostegui.json

If one is pessimistic, then the enwiki.analytics.db.svc.wikimedia.cloud replag will continue to increase for all eternity. This would mean that the replag would increase to 9,000 hours on August 13, 2025, if this task is still not completed by then. Otherwise, if one is optimistic, then the replag will eventually go back to zero.

I will be the first to say most of the internal dealings with Quarry are beyond my understanding. Could it be possible that some really big jobs on 'enwiki' that started before the maintenance are still running and are stuck in a loop, causing this? This could be really far off, so I apologize if it is a dumb question. Good luck trying to fix it.

If one is pessimistic, then the enwiki.analytics.db.svc.wikimedia.cloud replag will continue to increase for all eternity. This would mean that the replag would increase to 9,000 hours on August 13, 2025, if this task is still not completed by then. Otherwise, if one is optimistic, then the replag will eventually go back to zero.

There is no reason to be pessimistic.

I will be the first to say most of the internal dealings with Quarry are beyond my understanding. Could it be possible that some really big jobs on 'enwiki' that started before the maintenance are still running and are stuck in a loop, causing this? This could be really far off, so I apologize if it is a dumb question. Good luck trying to fix it.

There's nothing to fix really. It's normal maintenance and the lag will quickly recover once it completes.

But, the million dollar question is, when will it ever be completed? Obviously not in 26 hours, since we are currently at 209 hours. I know that whenever the lag ends, the system will catch up; it's just that it's been 8+ days now. I don't remember it ever taking more than 3 or 4 days of replag. But then I've only recently been paying attention.

But, seriously, any information you can share is appreciated. Part of the frustration is not knowing what is going on.

I checked the wiki replica slave status. For clouddb1013 it's:

Slave_SQL_Running_State: copy to tmp table

and for clouddb1017:

Slave_SQL_Running_State: copy to tmp table

So for both it is working just fine. It's taking time; nothing is broken. The reason it's taking longer is that the cloud replicas hold more than one section, so they get a lot of write I/O (from replication of the other sections) and are generally slower to copy a 262GB table. Depooling won't help much, but I'm sure it'll finish soon (very likely in less than a day).
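For reference, the Slave_SQL_Running_State quoted above comes from the replication status output; on these multi-instance hosts you connect to the per-section instance (e.g. the s1 socket shown further down) and run something like:

-- "copy to tmp table" in the SQL thread state means the replicated ALTER
-- is still rebuilding the table; once it finishes, normal replication resumes.
SHOW SLAVE STATUS\G
-- ...
-- Slave_SQL_Running_State: copy to tmp table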

Another data point: the temp table is at 47GB:

root@clouddb1017:/srv/sqldata.s1/enwiki# ls -Ssh | head
total 1.1T
262G revision.ibd
 97G pagelinks.ibd
 90G templatelinks.ibd
 67G slots.ibd
 66G content.ibd
 47G #sql-alter-1bef-a.ibd

There is no way around this. Sorry.

@Ladsgroup, am I right in thinking that once the copy to the tmp table is done, the tmp table becomes the normal table, and it'll start catching up? And if so, can you re-run that ls command to see how far along it is? The "likely less than a day" was apparently wrong, so it'd be good if it was possible to get some kind of estimate. Is it hours/days/weeks/months?

@Ladsgroup, am I right in thinking that once the copy to the tmp table is done, the tmp table becomes the normal table, and it'll start catching up?

Yes, that's basically how most ALTERs in MySQL work.
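Spelled out, a table-copy ALTER builds a hidden #sql-... table with the new definition, copies every row into it, and then swaps it in place of the original. Conceptually (this is a sketch of the mechanism, not what the server literally executes) it is equivalent to:

-- Conceptual equivalent of a table-copy ALTER: build, copy, swap.
CREATE TABLE revision_new LIKE revision;          -- then the schema change is applied to the new table
INSERT INTO revision_new SELECT * FROM revision;  -- the long "copy to tmp table" phase
RENAME TABLE revision TO revision_old,
             revision_new TO revision;            -- near-instant swap once the copy is done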

And if so, can you re-run that ls command to see how far along it is? The "likely less than a day" was apparently wrong, so it'd be good if it was possible to get some kind of estimate. Is it hours/days/weeks/months?

root@clouddb1017:/srv/sqldata.s1/enwiki# ls -Ssh | head
total 1.1T
262G revision.ibd
 97G pagelinks.ibd
 90G templatelinks.ibd
 85G #sql-alter-1bef-a.ibd
 67G slots.ibd

This is making progress, but extremely slowly. I suggest depooling clouddb1017 now; that would redirect all the traffic to the web replica, which is caught up, and it would also let clouddb1017 move forward faster.

I just depooled it. Everything is using the web database (clouddb1013) for now, which means queries will have a lower timeout for a while, but that is better than getting 9-day-old data.

@Ladsgroup: thanks, I was planning to do the same if the lag kept on increasing.

The temp table is now at 262G so hopefully the lag on clouddb1017 should start decreasing soon:

root@clouddb1017:/srv/sqldata.s1/enwiki# ls -Ssh | head
total 1.1T
262G revision.ibd

I was wrong: the temp table is only at 87G:

root@clouddb1017:/srv/sqldata.s1/enwiki# ls -Ssh | head
total 1.1T
262G revision.ibd
 97G pagelinks.ibd
 90G templatelinks.ibd
 87G #sql-alter-1bef-a.ibd

I have increased the timeout on clouddb1013 to 10800; I will check if clouddb1013 can handle the load.

fnegri@clouddb1013:~$ sudo vi /etc/default/wmf-pt-kill
fnegri@clouddb1013:~$ sudo systemctl restart wmf-pt-kill@s1
fnegri@clouddb1013:~$ sudo ps ax |grep kill
   8699 ?        Ss     7:18 perl /usr/bin/wmf-pt-kill --daemon --print --kill --victims all --interval 10 --busy-time 300 --match-command Query|Execute --match-user ^[spu][0-9] --log /var/log/wmf-pt-kill/wmf-pt-kill-s3.log -S /var/run/mysqld/mysqld.s3.sock F=/dev/null
1037863 ?        Ss     0:00 perl /usr/bin/wmf-pt-kill --daemon --print --kill --victims all --interval 10 --busy-time 10800 --match-command Query|Execute --match-user ^[spu][0-9] --log /var/log/wmf-pt-kill/wmf-pt-kill-s1.log -S /var/run/mysqld/mysqld.s1.sock F=/dev/null

Could it be possible that some really big jobs on 'enwiki' that started before the maintenance are still running and are stuck in a loop, causing this?

@RedDirector No, because we have a script that kills any queries that take more than X to complete. X is set to 5 minutes for "web" replica hosts and to 3 hours for "analytics" replica hosts.

I don't remember it ever taking more than 3 or 4 days of replag.

@Liz It's definitely unusual for replag to grow this much. I've spent many hours trying to figure out if anything is wrong (see T367778), but so far I haven't found a clear answer.

Thanks for all the work that has been done on the maintenance.

Yes, looks like the long wait is over. Thank you for helping the servers/system run smoothly.

I suggest depooling clouddb1017 now; that would redirect all the traffic to the web replica, which is caught up, and it would also let clouddb1017 move forward faster.

clouddb1017 is indeed moving faster, I can see increased activity in IOPS graphs after the depool, and the size of the temp table went from 87G to 106G in about 5 hours (3.8G/hour). At this rate it should take 41 more hours to complete the transaction.
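For the record, that estimate is just the arithmetic on the numbers above, assuming the copy rate stays constant:

-- (106 - 87) GB in 5 h ≈ 3.8 GB/h; (262 - 106) GB remaining at 3.8 GB/h ≈ 41 h.
SELECT (262 - 106) / ((106 - 87) / 5) AS estimated_hours_remaining;  -- ≈ 41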

I have increased the timeout on clouddb1013 to 10800; I will check if clouddb1013 can handle the load.

clouddb1013 is holding the load so far without showing any replication lag. Some "web" queries might be slower because of the additional "analytics" queries being sent to the same host.

First of all, apologies for the delay this is causing. However, this is a consequence of production maintenance, and there's not much else we can do. We are altering one of the biggest production tables, and this is more or less expected to happen.

I would also like to mention that several things are playing a role here. The first, as mentioned, is the size of this table. Secondly, this change comes through replication, which means it needs to wait for the upstream host to finish.
It takes around 26-30h in production, with an InnoDB buffer pool size of around 370GB, which makes everything a bit faster. It then needs to go through sanitarium, which has an InnoDB buffer pool size of only 48GB, so more time is expected there. The actual change started to run on clouddb1017 6 days ago, which means it took almost 3 days to reach it (so the production+sanitarium alters had already produced 3 days of lag).

Then it needs to go through the clouddb* hosts, which have around a 250GB pool size, still smaller than production. These hosts also receive a lot of traffic, which interferes with the cache (in production we depool the host, so there is no "fighting" over the warmness of the table). That certainly also plays a role.
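For anyone curious where those buffer pool figures come from, they are the servers' configured InnoDB buffer pool size, which can be read on any of the hosts with:

-- Configured InnoDB buffer pool size, in bytes (roughly 370GB on the production
-- s1 replicas, 48GB on sanitarium, 250GB on clouddb*, per the figures above).
SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool_size';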

Finally, we also need to keep in mind that this wiki replicas infrastructure was set up almost 6-7 years ago, and the traffic and load have grown a lot (which is great news). We will be seeing more of these performance issues in the future if we keep the current infra and hosts. There is not much we can do there.
When the service was launched, it was described as a best-effort service: we do our best to keep it in sync with production (in terms of lag), but it is never guaranteed that it will always be in sync. Lag can and will happen as we keep growing and doing big maintenance in production (which comes first).

Thankfully, for the last 6-7 years this service has been working almost perfectly and the lag was never more than a few seconds. Lately, however, we are seeing it more and more, which makes me believe we are starting to hit the performance thresholds, which is not surprising at this point.

As I said, apologies for the lag.
The change is still going through but I believe it will finish soon:

root@clouddb1017:/srv/sqldata.s1/enwiki# ls -lh | grep "#"
-rw-rw---- 1 mysql mysql 3.7K Aug  6 15:41 #sql-alter-1bef-a.frm
-rw-rw---- 1 mysql mysql 186G Aug 13 07:08 #sql-alter-1bef-a.ibd

And this is a finished table:

root@db1154:/srv/sqldata.s1/enwiki# ls -lh revision.ibd
-rw-rw---- 1 mysql mysql 195G Aug 13 07:09 revision.ibd

So a few more GB to go and the alter will be done. The lag is currently around 9 days; once the change is done, the host will start catching up, and keeping it depooled will help it do so faster.

The ALTER finished on clouddb1017 and the host is catching up at a rate that goes 1:4 (seconds:seconds) or 1:5, depending on the transaction.
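As a rough sanity check on the remaining lag, assuming "1:4" means roughly 4 seconds of backlog replayed per wall-clock second (a net reduction of about 3 seconds of lag per second):

-- ~9 days of lag at a net ~3 s cleared per second ≈ 3 days to fully catch up
-- (closer to ~2.25 days at the 1:5 rate).
SELECT 9 / (4 - 1) AS approx_days_to_catch_up;  -- = 3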

clouddb1017 is back in sync with no lag.

I repooled clouddb1017.

We should reduce the timeout on clouddb1013 if it's not done.

@Ladsgroup I forgot about it :( Thanks for the reminder, I've reset it to the default value.

Mentioned in SAL (#wikimedia-operations) [2024-08-15T09:25:03Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2152 (T367856)', diff saved to https://phabricator.wikimedia.org/P67334 and previous config saved to /var/cache/conftool/dbconfig/20240815-092502-marostegui.json

The s8 revision table is even bigger than enwiki's, so this will take even longer. First I am going to alter codfw, so there is nothing to worry about in terms of clouddb* hosts.

Mentioned in SAL (#wikimedia-operations) [2024-08-17T09:53:21Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2154 (T367856)', diff saved to https://phabricator.wikimedia.org/P67351 and previous config saved to /var/cache/conftool/dbconfig/20240817-095320-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-19T12:31:19Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2154 (T367856)', diff saved to https://phabricator.wikimedia.org/P67382 and previous config saved to /var/cache/conftool/dbconfig/20240819-123119-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-19T13:16:40Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2154 (T367856)', diff saved to https://phabricator.wikimedia.org/P67385 and previous config saved to /var/cache/conftool/dbconfig/20240819-131640-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-19T13:17:02Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2162 (T367856)', diff saved to https://phabricator.wikimedia.org/P67386 and previous config saved to /var/cache/conftool/dbconfig/20240819-131702-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-20T05:22:23Z] <marostegui> Deploy schema change on s1 eqiad old master db1184 dbmaint T367856

Running this schema change on the old enwiki master (db1184)

Mentioned in SAL (#wikimedia-operations) [2024-08-21T14:01:04Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2162 (T367856)', diff saved to https://phabricator.wikimedia.org/P67428 and previous config saved to /var/cache/conftool/dbconfig/20240821-140104-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-21T14:46:26Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2162 (T367856)', diff saved to https://phabricator.wikimedia.org/P67438 and previous config saved to /var/cache/conftool/dbconfig/20240821-144625-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-21T14:46:48Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2163 (T367856)', diff saved to https://phabricator.wikimedia.org/P67439 and previous config saved to /var/cache/conftool/dbconfig/20240821-144648-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-23T14:31:42Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2163 (T367856)', diff saved to https://phabricator.wikimedia.org/P67730 and previous config saved to /var/cache/conftool/dbconfig/20240823-143140-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-23T15:17:04Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2163 (T367856)', diff saved to https://phabricator.wikimedia.org/P67734 and previous config saved to /var/cache/conftool/dbconfig/20240823-151704-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-23T15:17:31Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2164 (T367856)', diff saved to https://phabricator.wikimedia.org/P67735 and previous config saved to /var/cache/conftool/dbconfig/20240823-151730-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-25T14:46:23Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2164 (T367856)', diff saved to https://phabricator.wikimedia.org/P67750 and previous config saved to /var/cache/conftool/dbconfig/20240825-144623-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-25T15:31:45Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2164 (T367856)', diff saved to https://phabricator.wikimedia.org/P67753 and previous config saved to /var/cache/conftool/dbconfig/20240825-153144-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-25T15:32:07Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2165 (T367856)', diff saved to https://phabricator.wikimedia.org/P67754 and previous config saved to /var/cache/conftool/dbconfig/20240825-153206-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-27T15:48:24Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2165 (T367856)', diff saved to https://phabricator.wikimedia.org/P67940 and previous config saved to /var/cache/conftool/dbconfig/20240827-154823-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-27T16:33:45Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2165 (T367856)', diff saved to https://phabricator.wikimedia.org/P67946 and previous config saved to /var/cache/conftool/dbconfig/20240827-163345-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-27T16:34:07Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2166 (T367856)', diff saved to https://phabricator.wikimedia.org/P67947 and previous config saved to /var/cache/conftool/dbconfig/20240827-163407-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-29T16:47:18Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2166 (T367856)', diff saved to https://phabricator.wikimedia.org/P68182 and previous config saved to /var/cache/conftool/dbconfig/20240829-164717-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-29T17:32:41Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2166 (T367856)', diff saved to https://phabricator.wikimedia.org/P68189 and previous config saved to /var/cache/conftool/dbconfig/20240829-173240-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-29T17:33:03Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2167 (T367856)', diff saved to https://phabricator.wikimedia.org/P68190 and previous config saved to /var/cache/conftool/dbconfig/20240829-173303-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-31T14:11:51Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2167 (T367856)', diff saved to https://phabricator.wikimedia.org/P68483 and previous config saved to /var/cache/conftool/dbconfig/20240831-141150-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-31T14:57:12Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2167 (T367856)', diff saved to https://phabricator.wikimedia.org/P68490 and previous config saved to /var/cache/conftool/dbconfig/20240831-145712-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-08-31T14:57:34Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db2181 (T367856)', diff saved to https://phabricator.wikimedia.org/P68491 and previous config saved to /var/cache/conftool/dbconfig/20240831-145733-marostegui.json

Mentioned in SAL (#wikimedia-operations) [2024-09-02T16:28:36Z] <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2181 (T367856)', diff saved to https://phabricator.wikimedia.org/P68535 and previous config saved to /var/cache/conftool/dbconfig/20240902-162836-marostegui.json