

core(gather-runner): don't save trace on pass with pageLoadError #9198

Merged
merged 4 commits on Jun 19, 2019

Conversation

@brendankenny (Member) commented Jun 13, 2019

part of #8865 (comment)

#9176 turned a load that ends up on a Chrome interstitial (e.g. INSECURE_DOCUMENT_REQUEST) into a runtime error, to let the user know that the page they were testing had a major problem and was never actually loaded for the test. Currently, however, all the artifacts in that errored pass are returned as errors except the devtoolsLog and trace, which end up in a weird state: they're actually tracing the load of the interstitial, not any useful page.

If the load ends up just going to the interstitial, the metrics all end up as NO_NAVSTART errors, but if there was somehow a navigation (e.g. a URL redirects to a page with a bad cert), you'll end up with some nice metrics for the interstitial page.

Example report:

[Screenshot: report with valid metrics for the error interstitial]

https://googlechrome.github.io/lighthouse/viewer/?gist=91d3c00c1186fdac9cc64f348dd14274

Regardless of whether an audit error is visible, in a pass with a pageLoadError we shouldn't save the trace or devtoolsLog, just like we don't save the rest of the artifacts in that pass.

After this PR:

[Screenshot: report with errored metrics for the error interstitial]

(also adds a test for the if-pageLoadError-stop-reloading behavior added in #8866/#9121)
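(For illustration, a minimal sketch of the behavior this PR is after; the function and property names below are placeholders, not the actual GatherRunner internals:)

// Sketch only: illustrative names, not the real GatherRunner API.
function saveLoadDataIfGood(passName, loadData, pageLoadError, baseArtifacts) {
  if (pageLoadError) {
    // The load failed, so the trace/devtoolsLog describe an interstitial or
    // other bad state; discard them along with the pass's other artifacts.
    return;
  }
  baseArtifacts.traces[passName] = loadData.trace;
  baseArtifacts.devtoolsLogs[passName] = loadData.devtoolsLog;
}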

@patrickhulce (Collaborator)

I understand doing this from the perspective of the audit-visible artifacts, and I agree we prefer this report outcome, but as far as I can tell this eliminates the only current benefit of not having a fatal error: when there's a runtimeError, it's helpful to be able to inspect the devtools log and trace for screenshots and network debugging data.

WDYT about trying to make the trace and devtools log still pass through to artifact saver/node result object?

@brendankenny (Member, Author)

Another benefit of all these changes is if there's a fatal error in the second or third pass (like http redirects to a bad cert url), there are still results from whatever was gathered in the earlier passes.

WDYT about trying to make the trace and devtools log still pass through to artifact saver/node result object?

Do you mean like just for debugging? Or would there be other uses you foresee?

Maybe we could mark them somehow as "here this is, just know it's a trace of a bad state and you'll have to manually inspect it if you want something out of it"? But if you mean it should also be in the node lhr that's somewhat different...

@brendankenny (Member, Author)

This also makes me wonder if a pageLoadError always ending things is granular enough... An interstitial means the trace isn't even of your content, so it's useless, but with PAGE_HUNG the trace is of your page, so some metrics might be extractable.

@patrickhulce (Collaborator)

Do you mean like just for debugging? Or would there be other uses you foresee?

Yeah, debugging. If we're eliminating literally every useful bit of information we managed to collect from the run in our return object, then I don't really see a reason to use runtimeError at all and we should just go back to fatally throwing again.

Maybe we could mark them somehow

That's what the existence of .runtimeError signals already, no?

But if you mean it should also be in the node lhr that's somewhat different...

I'm not sure what you mean here. We never return trace and devtools logs in the LHR and I'm certainly not saying we should start, but we need the artifacts in the node result object (of which the .lhr is a property) in order to write them to disk in CLI and let node consumers save anything for debugging.

I guess I don't understand why it's important to make sure they don't have access to them. Given that most of these runtime errors are "hey, there's something effed up with your page", I think there's value in giving the more sophisticated developer the data necessary to debug what's going wrong.

@brendankenny (Member, Author)

Yeah, debugging. If we're eliminating literally every useful bit of information we managed to collect from the run in our return object, then I don't really see a reason to use runtimeError at all and we should just go back to fatally throwing again.

Well, we do have (possibly) earlier passes that were OK and can return artifacts. This might become more important if we start opening up gathering options to plugins, which could add more passes, etc.

Maybe we could mark them somehow

That's what the existence of .runtimeError signals already, no?

yes, but there could be a good trace and a bad trace from two different passes, for instance, and it wouldn't be clear which one was the bad one from just having a runtimeError.

I guess I don't understand why it's important to make sure they don't have access to them

this came out of trying to define exactly what GatherRunner should be returning, so this is a good conversation to have :)

My thinking was basically:

  • no pageLoadError -> here's all the artifacts I collected about the page
  • pageLoadError -> here's all the known good artifacts I was able to collect about the page

we already give up any hope of collecting artifacts from a bad pass (and more recently, from any later passes), so it made sense to me to discard the trace/devtoolsLog from a known-bad pass as well.

It sounds like (correct me if I'm wrong) you're coming at it from the other way...we have this data, why not make it available (in a non-default way beyond asking folks to set a breakpoint) since there may be something valuable in there. I can definitely see the usefulness of that as well.

@patrickhulce (Collaborator)

good trace and a bad trace from two different passes


😉 To seriously address your plugin point though, I can see the value there. Perhaps we need a property in the pageLoadError that describes which pass failed? Better yet, remove the .online check and annotate passes for "expected" failures? (A sketch of that idea follows at the end of this comment.)

we already give up any hope of collecting artifacts from a bad pass (and more recently we give up any hope of collecting artifacts from any later passes)

The distinction to me is that pretty much all of our gatherers are trying to learn something about the page after it's been loaded while trace and devtools logs are observations of whatever happened while we were trying to load it. When the page fails to load, there's no sense in trying to learn about the fake/failed page but observations of whatever led to the failure are quite valuable.
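(A hypothetical sketch of the annotate-passes idea above; loadFailureMode is an invented property name, not an existing config option:)

// Hypothetical pass config shape: each pass declares whether a failed load is
// fatal or expected, replacing the implicit .online check.
const passes = [
  {passName: 'defaultPass', loadFailureMode: 'fatal'},
  {passName: 'offlinePass', loadFailureMode: 'ignore'}, // failure expected here
];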

@brendankenny (Member, Author)

I think I agree with all of the above, and

Better yet, remove the .online check and annotate passes for "expected" failures?

is a really good idea.

What do we want Artifacts to look like, though? I still think we shouldn't be mixing known-good and known-bad traces in a single object (however theoretical this situation still is :).

If the only use case we have for keeping bad traces around is for debugging, we shouldn't make the Artifacts interface more complicated (e.g. checking a bad bit or an LHR field saying which pass(es) failed before accessing a trace) for the usual case to accommodate the (very) unusual case.

We could do something stupidly simple like renaming the trace/devtoolsLog property from traces[passName] to something like traces[`bad-${passName}`]. Medium-awful solution :) but:

  • doesn't require any changes to existing runner/audits code but traces of errors won't be audited anymore
  • doesn't require a new top-level artifact
  • bad traces will still end up in the module/cli LHR and saved/loaded as normal with no changes to existing code

(or we could just drop them and require someone debugging to set a breakpoint but I could live with the above :)

@patrickhulce (Collaborator)

traces[`pageLoadError-${passName}`] actually sounds great to me; the CLI asset-saver logic/experience is identical, and it clearly separates them from normal artifacts. Huzzah! :)
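(A minimal sketch of the agreed-on keying, again with placeholder names rather than the real GatherRunner internals:)

// Load data from a failed pass is kept for debugging, but keyed under a
// `pageLoadError-` prefix so it can't be mistaken for a good pass.
function saveLoadData(passName, loadData, pageLoadError, baseArtifacts) {
  const key = pageLoadError ? `pageLoadError-${passName}` : passName;
  baseArtifacts.traces[key] = loadData.trace;
  baseArtifacts.devtoolsLogs[key] = loadData.devtoolsLog;
}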

@patrickhulce (Collaborator) left a comment

LGTM!

would prefer some of these changes, but I'll live either way :)

@@ -216,8 +216,7 @@ module.exports = [
   },
   audits: {
     'http-status-code': {
-      score: 0,
-      displayValue: '403',
+      score: null,
@patrickhulce (Collaborator)

seems like this audit is useless at this point; candidate for deletion?

@brendankenny (Member, Author)

seems like this audit is useless at this point; candidate for deletion?

I'd say yes (are there cases when non-200 status codes shouldn't be a runtimeError?)

   * @param {LH.Gatherer.LoadData} loadData
   * @param {string} passName
   */
  static _saveLoadData(passContext, loadData, passName) {
@patrickhulce (Collaborator)

save makes me think of like a disk write for some reason, WDYT about

addLoadDataToBaseArtifacts
stashLoadDataOnBaseArtifacts
saveLoadDataToBaseArtifacts
augmentBaseArtifactsWithLoadData

@brendankenny (Member, Author)

_addLoadDataToBaseArtifacts 👍

// Loads the page successfully in the first pass, fails with NO_FCP in the second.
async gotoURL(url) {
  if (url.includes('blank')) return null;
  if (firstLoad) {
@patrickhulce (Collaborator)

how do you feel about jest.fn()?

IMO, it would be a bit easier to read our mock setups if we constructed them consistently with the creators. Right now we kinda have to figure out each situation on a case-by-case basis

jest.fn()
  .mockResolvedValueOnce(requestedUrl)
  .mockRejectedValueOnce(new LHError(LHError.errors.NO_FCP));

could go whole hog if we wanted

const gotoBlankFn = jest.fn().mockResolvedValue(null);
const gotoPageURLFn = jest.fn()
  .mockResolvedValueOnce(requestedUrl)
  .mockRejectedValueOnce(new LHError(LHError.errors.NO_FCP));
const gotoURL = url => url.includes('blank') ? gotoBlankFn() : gotoPageURLFn();
const driver = Object.assign({}, fakeDriver, {gotoURL});

@brendankenny (Member, Author)

how do you feel about jest.fn()?

I like it sometimes and have no problem with it, but in this case no matter what I try the readability comes out slightly worse (the url.includes('blank') really gums up the works). I think it reads ok for this one.
