
Reduce relying on database locks
Open, Medium, Public

Description

Around 19% of the time spent in write queries is spent acquiring or waiting for locks on the primary database. This is a scalability bottleneck: we can buy more replicas, but we can't buy more primary databases for a section.

grafik.png (501×1 px, 151 KB)

Maybe these traces can be adjusted? I could think of multiple ways:

  • Reduce the timeout; it's 15 seconds in some cases.
  • Build and use a dedicated lock manager? We have one for files, but I don't know how it's used post-Redis.
  • Use PoolCounter instead.
  • Use the x2/main stash in some cases.
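
To make the first option concrete, here is a minimal sketch of how the timeout bounds the time a contended caller spends waiting. It is a toy in-process model, not MediaWiki's or the database's locking API: `NamedLockManager`, `contended_wait` and the `page:123` key are hypothetical names made up for illustration.

```python
import threading
import time


class NamedLockManager:
    """Toy in-process stand-in for a named-lock service (hypothetical)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = {}

    def _get(self, name):
        # Create the per-name lock lazily, guarded against races.
        with self._guard:
            return self._locks.setdefault(name, threading.Lock())

    def acquire(self, name, timeout):
        """Try to take lock `name`; return (acquired, seconds_waited)."""
        start = time.monotonic()
        acquired = self._get(name).acquire(timeout=timeout)
        return acquired, time.monotonic() - start

    def release(self, name):
        self._get(name).release()


def contended_wait(timeout, hold_seconds=0.5):
    """Measure how long a second caller waits while another holds the lock."""
    manager = NamedLockManager()
    # First caller takes the (free) lock immediately.
    manager.acquire("page:123", timeout=1.0)
    # Simulate the first caller finishing its work after `hold_seconds`.
    releaser = threading.Timer(hold_seconds, manager.release, args=("page:123",))
    releaser.start()
    # Second caller waits at most `timeout` seconds for the same lock.
    acquired, waited = manager.acquire("page:123", timeout=timeout)
    releaser.join()
    if acquired:
        manager.release("page:123")
    return acquired, waited


if __name__ == "__main__":
    for timeout in (0.1, 2.0):
        acquired, waited = contended_wait(timeout)
        print(f"timeout={timeout}s acquired={acquired} waited={waited:.2f}s")
```

With a short timeout the second caller gives up quickly instead of occupying a connection, so the trade-off is a bounded wait against more failed or retried operations.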

Event Timeline

To spare others the effort of deciphering the charts: it looks like we're talking about the locking that happens in these 4 stack traces:

Wikimedia\Rdbms\Database::lock
Wikimedia\Rdbms\DBConnRef::__call
Wikimedia\Rdbms\DBConnRef::lock
MediaWiki\Storage\PageEditStash::getAndWaitForStashValue
MediaWiki\Storage\PageEditStash::checkCache
MediaWiki\Storage\DerivedPageDataUpdater::prepareContent
MediaWiki\Storage\PageUpdater::prepareUpdate
MediaWiki\EditPage\EditPage::internalAttemptSave
MediaWiki\EditPage\EditPage::attemptSave
MediaWiki\EditPage\EditPage::edit
EditAction::show
SubmitAction::show
MediaWiki\Actions\ActionEntryPoint::performAction
MediaWiki\Actions\ActionEntryPoint::performRequest
MediaWiki\Actions\ActionEntryPoint::execute
MediaWiki\MediaWikiEntryPoint::run
/srv/mediawiki/php-1.43.0-wmf.6/index.php
/srv/mediawiki/w/index.php

Wikimedia\Rdbms\Database::lock
Wikimedia\Rdbms\Database::getScopedLockAndFlush
Wikimedia\Rdbms\DBConnRef::__call
Wikimedia\Rdbms\DBConnRef::getScopedLockAndFlush
CategoryMembershipChangeJob::run
MediaWiki\Extension\EventBus\JobExecutor::execute
/srv/mediawiki/rpc/RunSingleJob.php

Wikimedia\Rdbms\Database::lock
Wikimedia\Rdbms\Database::getScopedLockAndFlush
Wikimedia\Rdbms\DBConnRef::__call
Wikimedia\Rdbms\DBConnRef::getScopedLockAndFlush
MediaWiki\Deferred\LinksUpdate\LinksUpdate::acquirePageLock
MediaWiki\Deferred\LinksUpdate\LinksUpdate::doUpdate
MediaWiki\Deferred\DeferredUpdates::attemptUpdate
MediaWiki\Deferred\RefreshSecondaryDataUpdate::doUpdate
MediaWiki\Deferred\DeferredUpdates::attemptUpdate
MediaWiki\Storage\DerivedPageDataUpdater::doSecondaryDataUpdates
WikiPage::doSecondaryDataUpdates
RefreshLinksJob::runForTitle
RefreshLinksJob::run
MediaWiki\Extension\EventBus\JobExecutor::execute
/srv/mediawiki/rpc/RunSingleJob.php

Wikimedia\Rdbms\Database::lock
Wikimedia\Rdbms\Database::getScopedLockAndFlush
Wikimedia\Rdbms\DBConnRef::__call
Wikimedia\Rdbms\DBConnRef::getScopedLockAndFlush
MediaWiki\Deferred\LinksUpdate\LinksUpdate::acquirePageLock
RefreshLinksJob::runForTitle
RefreshLinksJob::run
MediaWiki\Extension\EventBus\JobExecutor::execute
/srv/mediawiki/rpc/RunSingleJob.php

So the relevant components are DerivedPageDataUpdater, CategoryMembershipChangeJob, RefreshLinksJob and LinksUpdate.
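
As a rough way to read traces like these, the sketch below skips the `Wikimedia\Rdbms\` frames and reports the first class outside that layer, meaning the code that actually requested the lock. The traces are abbreviated to their top frames, and the helper names (`lock_caller`, `class_name`) are made up for illustration. Note that in the first trace the direct caller is PageEditStash, which DerivedPageDataUpdater reaches via `prepareContent`.

```python
RDBMS_PREFIX = "Wikimedia\\Rdbms\\"


def lock_caller(trace):
    """Return the first frame that is not part of the Rdbms layer."""
    for frame in trace:
        if not frame.startswith(RDBMS_PREFIX):
            return frame
    return None


def class_name(frame):
    """Strip namespace and method: 'A\\B\\C::m' becomes 'C'."""
    return frame.split("::")[0].split("\\")[-1]


# Top frames of the four traces quoted above (abbreviated).
traces = [
    [
        "Wikimedia\\Rdbms\\Database::lock",
        "Wikimedia\\Rdbms\\DBConnRef::__call",
        "Wikimedia\\Rdbms\\DBConnRef::lock",
        "MediaWiki\\Storage\\PageEditStash::getAndWaitForStashValue",
    ],
    [
        "Wikimedia\\Rdbms\\Database::lock",
        "Wikimedia\\Rdbms\\Database::getScopedLockAndFlush",
        "Wikimedia\\Rdbms\\DBConnRef::__call",
        "Wikimedia\\Rdbms\\DBConnRef::getScopedLockAndFlush",
        "CategoryMembershipChangeJob::run",
    ],
    [
        "Wikimedia\\Rdbms\\Database::lock",
        "Wikimedia\\Rdbms\\Database::getScopedLockAndFlush",
        "Wikimedia\\Rdbms\\DBConnRef::__call",
        "Wikimedia\\Rdbms\\DBConnRef::getScopedLockAndFlush",
        "MediaWiki\\Deferred\\LinksUpdate\\LinksUpdate::acquirePageLock",
        "MediaWiki\\Deferred\\LinksUpdate\\LinksUpdate::doUpdate",
    ],
    [
        "Wikimedia\\Rdbms\\Database::lock",
        "Wikimedia\\Rdbms\\Database::getScopedLockAndFlush",
        "Wikimedia\\Rdbms\\DBConnRef::__call",
        "Wikimedia\\Rdbms\\DBConnRef::getScopedLockAndFlush",
        "MediaWiki\\Deferred\\LinksUpdate\\LinksUpdate::acquirePageLock",
        "RefreshLinksJob::runForTitle",
    ],
]

callers = [class_name(lock_caller(trace)) for trace in traces]
print(callers)
```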

I'm not really familiar with any of them, but maybe this helps us find someone who is.

aaron triaged this task as Medium priority. Jun 13 2024, 3:44 PM
aaron updated the task description.