[go: nahoru, domu]

How to repro bot failures

If you‘re looking for repro a CI/CQ compile or test bot failures locally, then this is the right doc. This doc tends to getting you to the exact same environment as bot for repro. Most likely you don’t need to follow the exact steps here. But if you have a hard time for repro, then it's good to check each steps. Also keep in mind that even you have the exact same environment, some failures are not reproduceable locally.

When trying to repro and fix a bot failure, the easier approach is tryjobs, or debug with swarming. This doc is more useful when it requires more local debugging and testing.

This doc assumes you're familiar with Chrome development on at least 1 platform.

Identify the platform

Usually this is easy by looking at the builder name. e.g. linux-rel means it's on linux platform.

Understand the platform system requirements

Usually you should be able to follow Chromium Get Code and understand the platform.

Prepare the compile and test devices

For compile devices, if you see a “compilator steps”(usually for CQ jobs),

compilator step

then click the “compilator build” and then bot on the right.

bot link

Otherwise just click bot on right.

For test devices, for most builders, click the shard and then click on bot.

test shard

The bot page would look like this

bot page

You would want to prepare the exact same environment when possible. e.g. many tests are using e2-standard-8 GCE bot, you will want to create a e2-standard-8 GCE VM. If you see python 3.8, then install python 3.8.

Checkout code

Follow the platform specific workflow to checkout the code.

Revision

On bot, you can see revision like this.

revision

First sync to that revision. For tryjobs, you also need to cherry-pick the changelist with correct patchset after sync to correct revision.

Note: Don't just download a patch from gerrit to repro tryjob failures. On bot, it will first checkout the branch HEAD, then patch the patchset. But on gerrit, what you see is your local revision when you upload with the patchset. On gerrit you see code at an older base revision.

Change gclient config

On bot, there is a step named ‘gclient config’.

gclient

Apply the same gclient config and then rerun ‘gclient sync’.

gn args

On bot, there is a step named ‘lookup GN args’.

gn args

Apply the same gn args locally. Note: most of the time, some path based flags can be omitted. Examples:

  • coverage_instrumentation_input_file

At the time of writing, some gn args can't be applied locally, like use_remoteexec.

Note: ChromeOS builders(or any platform that involves ‘import’ gn rules) may show complicated gn args. For these builders, you need to search the builder name in //tools/mb or //internal/tools/mb for internal builders to find out the gn args.

Compile

On bot, there is a step named ‘compile’. Go to ‘execution_details’ and compile all the targets listed in there.

Run test

Most of the time, on bot the compile device is not the same as the test device. But locally for desktop platforms, they are the same device and in the same //out folder. There is no easy way to mimic bot behavior. A common issue on bot is you see some missing files but can‘t repro locally. This is because on bot we calculate what files are needed for testing using build maps and copy them from compile machine to test machine. But locally you don’t have this step. Usually the fix is to add some ‘data’ or ‘data_deps’ in BUILD.gn.

Open the bot test result page

test page

You should see all the information you need including system environment and commandlines.

Note: Set system environment. Especially important for gtest GTEST_SHARD_INDEX GTEST_TOTAL_SHARDS.

Note: On bot, the CURRENT_DIR is in the //out dir. Say you compile test target in //out/Default, then enter //out/Default and then run tests.

Note: For gtest, if you're not using the same machine type, you might want to override some flags like --test-launcher-jobs.