Prefetching Documents #6723

Open
jeffkaufman opened this issue May 27, 2021 · 15 comments

@jeffkaufman

Historically, <link rel=prefetch href=url> would tell the browser to consider fetching url, and then if the browser needed url for anything it would be available. As browsers have changed how they fetch resources, however, this is no longer so simple. A browser with a partitioned cache typically has a different cache key for https://a.example when it's a top-level page vs an iframe. When it sees a prefetch, it needs to pick which key to use.

This means <link rel=prefetch href="https://a.example"> can support <iframe src="https://a.example"> or window.location='https://a.example' but not both.
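
A minimal sketch of why one prefetch cannot serve both uses, assuming a simplified cache keyed only by (top-level site, URL). The `PartitionedCache` class and its key layout are illustrative assumptions, not any browser's actual implementation.

```python
from typing import Dict, Optional, Tuple

# Hypothetical cache key: (top-level site, resource URL).
CacheKey = Tuple[str, str]


class PartitionedCache:
    """Toy HTTP cache whose entries are isolated per top-level site."""

    def __init__(self) -> None:
        self._entries: Dict[CacheKey, str] = {}

    def store(self, top_level_site: str, url: str, body: str) -> None:
        self._entries[(top_level_site, url)] = body

    def lookup(self, top_level_site: str, url: str) -> Optional[str]:
        return self._entries.get((top_level_site, url))


PAGE = "https://site.example"  # the page containing <link rel=prefetch>
TARGET = "https://a.example"   # the prefetched document

cache = PartitionedCache()
# The browser must commit to one key at prefetch time. Here it picks the
# key that an iframe embedded in PAGE would use.
cache.store(PAGE, TARGET, "<html>prefetched</html>")

# <iframe src=TARGET> inside PAGE: the top-level site is still PAGE.
iframe_hit = cache.lookup(PAGE, TARGET) is not None
# window.location = TARGET: TARGET becomes its own top-level site.
navigation_hit = cache.lookup(TARGET, TARGET) is not None

print(iframe_hit, navigation_hit)
```

In this toy model the iframe lookup hits and the navigation lookup misses; storing under the navigation key instead would flip the two results.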

Here are two demonstration pages with <link rel=prefetch>:

Both prefetch, wait five seconds, then attempt to use the prefetched result. The first uses the prefetch to populate an iframe while the second uses it for navigation.

I tested on Chrome, Firefox, and Safari. (Safari normally doesn't respect link prefetch. This is with Develop > Experimental Features > LinkPrefetch on.) Here's whether the prefetch is used (no additional network request on the wire):

| Browser | iframe | top navigation |
|---------|--------|----------------|
| Chrome | double fetch | double fetch |
| Firefox | prefetch used | double fetch |
| Safari | double fetch | prefetch used |

If we include as=document, Chrome does use the prefetch for top-level navigation (but still not for the iframe), while Safari and Firefox ignore the attribute and behave as before:

| Browser | iframe as=document | top navigation as=document |
|---------|--------------------|----------------------------|
| Chrome | double fetch | prefetch used |
| Firefox | prefetch used | double fetch |
| Safari | double fetch | prefetch used |

One way to fix this would be to allow sites to make their intention clear. The fetch spec has an iframe destination, so we could use that: as=iframe. Then we could spec as=document as being for top-level navigation only.
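
As a rough illustration of this proposal, the sketch below maps the `as` value to the cache key a browser might choose. The function names, the two-part key, and the "scheme plus host" approximation of a site are all assumptions made for the example.

```python
from typing import Tuple
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Approximate a site as scheme plus host (a simplification)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.hostname}"


def prefetch_cache_key(as_value: str, embedder_url: str, href: str) -> Tuple[str, str]:
    """Pick a hypothetical (top-level site, URL) key from the `as` value."""
    if as_value == "iframe":
        # Intended for a subframe: keyed under the embedding page's site.
        return (site_of(embedder_url), href)
    if as_value == "document":
        # Intended for top-level navigation: the target is its own top-level site.
        return (site_of(href), href)
    raise ValueError(f"unsupported as value: {as_value!r}")


iframe_key = prefetch_cache_key("iframe", "https://site.example/page", "https://a.example/")
document_key = prefetch_cache_key("document", "https://site.example/page", "https://a.example/")
```

With an explicit value the browser no longer has to guess which of the two keys the author meant.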

@domenic
Member
domenic commented May 27, 2021

I think this issue would be best opened on the spec that defines prefetch, which is https://w3c.github.io/resource-hints/ ? Although we do hope to one day move that spec into HTML for better maintenance. /cc @domfarolino @yoavweiss

@jeffkaufman
Author

Happy to move it there if you like, though it was @yoavweiss who suggested I post it here ;)

@domfarolino
Member

Thanks for filing this and continuing the discussion, Jeff. Yeah, things are a bit nastily spread out at the moment, so I am pretty agnostic as to where this discussion takes place. Since we eventually want to consolidate all of this in HTML, here seems as good as anywhere I suppose. I'd love to hear @annevk's take on this, but it seems like a good idea to me.

Things can get a little tricky because the network partition key is a little under-defined. Right now your text distinguishes between the top level and a first-level nested iframe. The network partition key has a top-level key corresponding to the top-level site, but it also has a currently implementation-defined second key to represent partitioning at potentially arbitrary nesting levels. So it would be interesting to figure out exactly what behavior we want:

  1. <link rel=prefetch as=iframe> stores the resource in the cache partition nested under the top-level one
  2. <link rel=prefetch as=iframe> stores the resource in the cache partition nested under the current partition
    • This would allow us to support the case where we have a page structure like A(B(C)), and B prefetches as=iframe a resource for D.com and stores it in B's cache partition so that iframes under A(B) can use it, but no other iframes in this structure could
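
The two options above can be sketched as follows, assuming a hypothetical model in which a partition is the tuple of ancestor sites from the top level down to the context that contains both the prefetch link and the iframe. Real partition keys are implementation-defined, so this only illustrates the shape of the difference.

```python
from typing import Callable, Sequence, Tuple

Partition = Tuple[str, ...]


def option_1_partition(ancestors: Sequence[str]) -> Partition:
    """Option 1: store under the partition nested under the top-level site."""
    return (ancestors[0],)


def option_2_partition(ancestors: Sequence[str]) -> Partition:
    """Option 2: store under the partition nested under the current context."""
    return tuple(ancestors)


def iframe_lookup_partition(ancestors: Sequence[str]) -> Partition:
    """Partition consulted when the same context later creates the iframe."""
    return tuple(ancestors)


def prefetch_is_reused(
    store_rule: Callable[[Sequence[str]], Partition], ancestors: Sequence[str]
) -> bool:
    return store_rule(ancestors) == iframe_lookup_partition(ancestors)


top_level = ["A"]       # A prefetches and embeds D
nested = ["A", "B"]     # B, embedded in A, prefetches and embeds D

results = {
    ("option 1", "top level"): prefetch_is_reused(option_1_partition, top_level),
    ("option 1", "nested"): prefetch_is_reused(option_1_partition, nested),
    ("option 2", "top level"): prefetch_is_reused(option_2_partition, top_level),
    ("option 2", "nested"): prefetch_is_reused(option_2_partition, nested),
}
```

Under this model both options reuse the prefetch when the link sits in the top-level document, and only option 2 does when the link sits in a nested frame.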

@jeffkaufman
Author

Normal usage is <link rel=prefetch as=iframe href=X> and then <iframe src=X> both within the same context, right? It looks to me like both your (1) and (2) support this when at the top level context, but only (2) supports it regardless of context.

@domfarolino
Member

That's right, both (1) and (2) support a top-level A.com prefetching B.com, to be used for an iframe that is nested under A.com.

And just to be clear, I am not recommending that we should consider supporting prefetching in one context at some level in the tree structure, and reuse by another context at another level in the tree structure. It's just that if the fetching-and-reusing context are nested at all, we begin relying on the implementation-defined parts of the network partition key, and therefore behavior would differ depending on whether or not a browser implements double-key caching, triple-key caching, and so on, compared to browsers that don't.
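
One way that difference could surface is sketched below. It assumes a hypothetical key made of the first `site_levels` ancestor sites, so `site_levels=1` stands in for double-key caching (top-level site plus URL) and `site_levels=2` for triple-key caching. None of this is taken from a real implementation.

```python
from typing import Sequence, Tuple


def partition_key(ancestors: Sequence[str], site_levels: int) -> Tuple[str, ...]:
    """Keep only as many ancestor sites as the browser's key has room for."""
    return tuple(ancestors[:site_levels])


prefetching_context = ["A", "B"]  # frame B under top-level A issues the prefetch
reusing_context = ["A", "C"]      # sibling frame C under A wants the same resource

# Double-keyed browser: both contexts collapse to the top-level site.
double_key_reuse = partition_key(prefetching_context, 1) == partition_key(reusing_context, 1)
# Triple-keyed browser: the second site distinguishes the two frames.
triple_key_reuse = partition_key(prefetching_context, 2) == partition_key(reusing_context, 2)
```

Here the double-keyed model reuses the prefetch and the triple-keyed one does not, so the same markup would behave differently across browsers.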

@annevk
Member
annevk commented May 31, 2021

I would say that Safari's behavior is a bug and likely the result of the prefetch cache not being double-keyed. @youennf can confirm. That might be why it's disabled.

Chrome partitions by more than the top-level site and I would expect that's why you see what you see in Chrome.

I don't think it's safe to give control over the partition key. The whole reason we have that is to keep things isolated.

@othermaciej
Collaborator
othermaciej commented Jan 20, 2022

It seems like a good idea to express whether a prefetch loaded document is intended to be loaded in a subframe of the current document or as a top-level navigation. Maybe we could just say documents should always be loaded as if top-level when prefetched and have the spec require that. I worry that this could be used for cross-site tracking though, if that top-level load gets cookies. A unique ID could be placed in the <link rel=prefetch as=document> link, and then if the document loads at top-level with cookies, it can link user identity across sites. On the other hand, if the load is done without cookies, then the load may be wasted if the user is actually logged in on the other site.
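
To make the tracking concern concrete, here is a hedged sketch of what a receiving server could do with a credentialed top-level prefetch. The `visitor` query parameter, the `uid` cookie name, and the `link_identities` helper are invented for illustration.

```python
from http.cookies import SimpleCookie
from typing import Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def link_identities(
    request_url: str, cookie_header: Optional[str]
) -> Optional[Tuple[str, str]]:
    """Join the embedding site's visitor ID with this site's own cookie ID.

    Returns None when either half is missing, for example when the
    prefetch is sent without cookies.
    """
    query = parse_qs(urlsplit(request_url).query)
    embedder_ids = query.get("visitor")

    cookies = SimpleCookie()
    cookies.load(cookie_header or "")

    if not embedder_ids or "uid" not in cookies:
        return None
    return (embedder_ids[0], cookies["uid"].value)


# The embedding page put a unique ID into the prefetched URL.
prefetch_url = "https://tracker.example/landing?visitor=abc123"

with_cookies = link_identities(prefetch_url, "uid=tracker-42")
without_cookies = link_identities(prefetch_url, None)
```

With cookies attached the two identifiers can be joined; without them the join fails, at the cost of a possibly wasted fetch for a logged-in user.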

Separately: this as= proposal doesn't really solve the case of subresources that should be prefetched for a future site load. Should non-document resource types always load in the current cache partition? That makes prefetch useless for purposes of prewarming the next document load for subresources and kind of redundant with preload for the same-document case.

@jeffkaufman
Author

We could let people be explicit about what they want via an origin option. So: <link rel=prefetch as=script origin="https://site.example" href="https://cdn.example/some/script.js"> would be a hint to the browser that some page on https://site.example is going to want https://cdn.example/some/script.js.

This would mean we would not need as=document vs as=iframe: they would both just use document with different values for the origin attribute.

In the case where the origin attribute was not provided, we could just maintain current behavior.

@jeffkaufman
Author

In cases where browsers care about nested origins, we could allow origin to be a list, so <link rel=prefetch as=script origin="https://site.example https://embed.example" href="https://cdn.example/some/script.js"> would be a hint that some page on https://site.example is going to embed some page on https://embed.example which will want https://cdn.example/some/script.js.
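
A minimal sketch of how a browser might turn the proposed origin attribute into a partition hint, assuming a space-separated list ordered from the top-level site inward. The attribute is only a proposal in this thread, and `partition_from_origin_attr` is a hypothetical helper.

```python
from typing import Optional, Tuple


def partition_from_origin_attr(origin_attr: Optional[str]) -> Optional[Tuple[str, ...]]:
    """Parse the proposed `origin` attribute into a tuple of origins.

    Returns None when the attribute is absent or empty, meaning the
    browser keeps whatever it does today.
    """
    if origin_attr is None:
        return None
    origins = tuple(origin_attr.split())
    return origins or None


single = partition_from_origin_attr("https://site.example")
nested = partition_from_origin_attr("https://site.example https://embed.example")
absent = partition_from_origin_attr(None)
```

A browser that only keys by top-level site could use the first entry and ignore the rest.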

@annevk
Member
annevk commented Jan 21, 2022

So essentially that would give an attacker the ability to fetch a resource with an arbitrary authority. That is something that will only be possible through new windows once browsers have broadly shipped state partitioning. The resulting resource is just cached, but on the surface it seems like that gives attackers at least one additional side channel over the ones already available to them.

@jeremyroman
Collaborator

Ultimately writing into an arbitrary partition right away does seem like something modern browsers would prefer to avoid.

For main resources I've been thinking about prefetching without credentials and holding that separate from that site's real partition until the user actually navigates there. This at least means you can't write into arbitrary partitions without some user action that implicates that partition. (WICG/nav-speculation is where that work is happening)
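
The idea in the previous paragraph could look roughly like this, assuming a staging area that is separate from every site partition and is only promoted when the user navigates. The `SpeculativePrefetcher` class is a toy model, not the design being worked on in WICG/nav-speculation.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Approximate a site as scheme plus host (a simplification)."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.hostname}"


class SpeculativePrefetcher:
    def __init__(self) -> None:
        # Responses fetched without credentials, isolated from all partitions.
        self.staged: Dict[str, str] = {}
        # The "real" cache, keyed by (top-level site, URL).
        self.partitions: Dict[Tuple[str, str], str] = {}

    def prefetch(self, url: str, uncredentialed_body: str) -> None:
        """Record a response that was fetched with no cookies attached."""
        self.staged[url] = uncredentialed_body

    def navigate(self, url: str) -> Optional[str]:
        """User-initiated navigation: only now may the target partition be written."""
        body = self.staged.pop(url, None)
        if body is not None:
            self.partitions[(site_of(url), url)] = body
        return body


prefetcher = SpeculativePrefetcher()
prefetcher.prefetch("https://a.example/", "<html>prefetched</html>")
partitions_before = dict(prefetcher.partitions)
served = prefetcher.navigate("https://a.example/")
```

Nothing reaches the target site's partition until `navigate` runs, which is the user action that implicates that partition.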

The case of wanting to prefetch a subresource for a future navigation seems even trickier, at least if the declaration is coming from the previous page as opposed to, e.g., the main resource's Link rel=preload response headers. Even with an isolated cache (and the potentially redundant fetches that implies) it allows sending data in more ways than a plain URL navigation would. It certainly seems like a hard case to allow while preserving isolation between partitions.

@youennf
youennf commented Jan 24, 2022

I would tend to restrict prefetch to top level navigation only and rely on preloads for all other resources for which there is a known keying context.

We probably need some experimentation with prefetching without credentials. @kinu, any insight?
Given the potential issues there, it might make sense to expose to the server that the request is a prefetch.
And maybe allow the response to influence UA processing of the response when navigation actually happens.

@yoavweiss
Collaborator

^^ @nyaxt

@noamr
Contributor
noamr commented Feb 12, 2022

> Separately: this as= proposal doesn't really solve the case of subresources that should be prefetched for a future site load. Should non-document resource types always load in the current cache partition? That makes prefetch useless for purposes of prewarming the next document load for subresources and kind of redundant with preload for the same-document case.

I think this is OK.
For prewarming the next cross-origin document's subresources, it's better to iterate on no-state prefetch or the prerender specs currently being written rather than try to have prefetch handle all these cases. The as attribute is still useful for prewarming next navigations to the same origin.

For the same document, preload is indeed similar to prefetch and redundant in some cases, where the difference is in priority - preload now vs. prefetch when idle.

I like the as=iframe approach (for cross-origin), and keeping prefetches applied to the origin of the link's document's context only (option (2) in @domfarolino's comment above). I don't see a benefit in involving the top-level context here at all or having these prefetches have any cross-context influence.

@nyaxt
Member
nyaxt commented Feb 15, 2022

> ^^ @nyaxt

Thanks. @kinu left the project, and @nhiroki and I are taking over.

> I would tend to restrict prefetch to top level navigation only and rely on preloads for all other resources for which there is a known keying context.

I personally like this direction.

> We probably need some experimentation with prefetching without credentials. @kinu, any insight? Given the potential issues there, it might make sense to expose to the server that the request is a prefetch. And maybe allow the response to influence UA processing of the response when navigation actually happens.

I'd like to explore Sec-Prefetch for this (WICG/nav-speculation#133). I'm curious to hear your thoughts on this.
