Optimize the loading of images using stored URL metrics#884
Conversation
…or URL metrics with no LCP element
|
This is now working!
Elements that have been muted get annotated with Items to further work on:
For a future PR:
|
| } | ||
| if ( | ||
| children.includes( | ||
| document.querySelector( '.skip-link.screen-reader-text' ) |
There was a problem hiding this comment.
This should be cached.
There was a problem hiding this comment.
I'm going to leave this as-is for this PR. In the next PR I intend to eliminate client-side breadcrumb construction since it is too easy for JavaScript DOM mutations to cause client-side generated breadcrumbs from being able to apply on the server.
There was a problem hiding this comment.
Once you switch to generate breadcrumbs on the server, maybe a shorter index or hash passed to the front end?
There was a problem hiding this comment.
Yes, exactly. Currently I'm thinking that when detection is needed that the server would add breadcrumbs via a data attribute, for example:
<img data-ilo-breadcrumbs="html,0 body,1 main,2 figure,10 img,0" ...>When detection is not needed, such data attributes would not be present.
There was a problem hiding this comment.
Can you add a TODO comment? :)
There was a problem hiding this comment.
There is already one:
modules/images/image-loading-optimization/class-ilo-html-tag-processor.php
Outdated
Show resolved
Hide resolved
dmsnell
left a comment
There was a problem hiding this comment.
@westonruter there's quite a bit in this PR! I can't say that I fully understanding it, but as far as the HTML Processor goes I'm glad you're experimenting with the lower-level tooling for your own needs. plugins are the right place to push ahead with riskier code.
it hasn't been my intention to expose which nth-child a node is compared to the parent, but I was going to build a CSS selector that queries that when prompted. maybe that's a good thing to build in from the start.
my outlook has been positive on the performance side for the eventual HTML Processor, but it's still opaque until it supports everything it needs to 🤷♂️
modules/images/image-loading-optimization/class-ilo-html-tag-processor.php
Outdated
Show resolved
Hide resolved
modules/images/image-loading-optimization/class-ilo-html-tag-processor.php
Show resolved
Hide resolved
| // TODO: There are quite a few more cases of optional closing tags: https://html.spec.whatwg.org/multipage/syntax.html#optional-tags | ||
| // Nevertheless, given WordPress's legacy of XHTML compatibility, the lack of closing tags may not be common enough to warrant worrying about any of them. | ||
| if ( in_array( $tag_name, self::P_CLOSING_TAGS, true ) ) { | ||
| $i = array_search( 'P', $this->open_stack_tags, true ); |
There was a problem hiding this comment.
this is probably fine enough, but be aware that there are certain markers/boundaries in the stack that effectively isolate regions of HTML. for example, if we're currently inside a TEMPLATE element and encounter one of these P-closing elements, they don't close the P from the outside of the template.
<p>
This template is isolated.
<template><section><p>Hidden</p></section></template>
The outer P remains.
</p>| $i = array_search( $tag_name, $this->open_stack_tags, true ); | ||
| if ( false !== $i ) { | ||
| array_splice( $this->open_stack_tags, $i ); | ||
| $did_splice = true; |
There was a problem hiding this comment.
if encountering an element which isn't an HTML Void element, then regardless of the tag, if the self-closing flag is present, then it closes the tag. all self-closing flags on HTML elements are invalid and ignored, but all on HTML foreign elements are authoritative and obeyed.
<p>There are no empty <div/> tags. This is inside the DIV</p>.
<wp-group>Is a custom element, and so <wp-group /> is self-closing while others aren't.</wp-group>if (
WP_HTML_Processor::is_void( $tag_name ) ||
(
! WP_HTML_Processor::is_html_element( $tag_name ) &&
$p->has_self_closing_flag
)
) {
// this tag immediately closes as soon as we jump to the next tag.
}this requires having a list of all HTML elements, which the HTML Processor currently doesn't do because it doesn't support any, but we'll have to add it. this can lead to common parsing failures because the invalid self-closing flag has become more popular post-React where it's valid and normative in JSX.
There was a problem hiding this comment.
Very interesting. I wasn't aware of the WP_HTML_Processor::has_self_closing_flag() method.
Nevertheless, maybe having a list of all HTML elements isn't needed here because: (1) if math or svg are ancestors, we can assume that all tags with self-closing flags will close the tag, and (2) custom elements always have hyphens in them, so if present we can also honor the self-closing tag.
Nevertheless, I just checked your example and it doesn't seem the second example with a custom element is actually true. I adapted your example to use a span instead of div (since a div closes an open p):
<p>There are no empty <span/> tags. This is inside the SPAN</p>.
<wp-group>Is a custom element, and so <wp-group /> is self-closing while others aren't.</wp-group>And Chrome renders this as:
So perhaps such self-closing foreign elements are limited to MathML and SVG contexts?
There was a problem hiding this comment.
by golly, you're right. I have been misinterpreting "foreign element" for a long time now! thankfully I haven't gotten to the place in the HTML Processor where that matters.
thank you very much for pointing this out.
|
|
||
| $link_tag .= sprintf( ' %s="%s"', $name, esc_attr( $value ) ); | ||
| } | ||
| $link_tag .= ">\n"; |
There was a problem hiding this comment.
this is also a reasonable place for the Tag Processor, though obviously not essential. it simply removes the need to remember to escape, and joins together the split tag/html/php syntax.
$link_tag = new WP_HTML_Tag_Processor( '<link data-ilo-added-tag rel="preload" fetchpriority="high" as="image">' );
foreach ( array_filter( $img_attributes ) as $name => $value ) {
switch ( $name ) {
case 'srcset':
case 'sizes':
$name = "image{$name}";
break;
case 'src':
$name = 'href';
break;
}
$link_tag->set_attribute( $name, $value );
}
$link_html = $link_tag->get_updated_html();I'm hoping to have basic templating in place for 6.5, which should turn this into something potentially nicer still.
$attributes = [];
foreach ( array_filter( $img_attributes ) as $name => value ) {
switch ( $name ) {
case 'srcset':
case 'sizes':
$attributes[ "image{$name}" ] = $value;
continue;
case 'src':
$attributes['href'] = $value;
continue;
}
$attributes[ $name ] = $value;
}
$link_tag = WP_HTML::render(
'<link data-ilo-added-tag rel="preload" fetchpriority="high" as="image" ...attrs>',
array( 'attrs' => $attributes )
);Co-authored-by: Adam Silverstein <adamjs@google.com>
felixarntz
left a comment
There was a problem hiding this comment.
@westonruter This looks great so far. I left a few comments, mostly for minor optimization. My biggest concern/question is in regards to the custom tag processor class, I'm not sure why that needs to be included here.
| const VOID_TAGS = array( | ||
| 'AREA', | ||
| 'BASE', | ||
| 'BASEFONT', // Obsolete. | ||
| 'BGSOUND', // Obsolete. | ||
| 'BR', | ||
| 'COL', | ||
| 'EMBED', | ||
| 'FRAME', // Deprecated. | ||
| 'HR', | ||
| 'IMG', | ||
| 'INPUT', | ||
| 'KEYGEN', // Obsolete. | ||
| 'LINK', | ||
| 'META', | ||
| 'PARAM', // Deprecated. | ||
| 'SOURCE', | ||
| 'TRACK', | ||
| 'WBR', | ||
| ); | ||
|
|
||
| /** | ||
| * The set of HTML tags whose presence will implicitly close a <p> element. | ||
| * For example '<p>foo<h1>bar</h1>' should parse the same as '<p>foo</p><h1>bar</h1>'. | ||
| * | ||
| * @link https://html.spec.whatwg.org/multipage/grouping-content.html#the-p-element | ||
| * | ||
| * @var string[] | ||
| */ | ||
| const P_CLOSING_TAGS = array( | ||
| 'ADDRESS', | ||
| 'ARTICLE', | ||
| 'ASIDE', |
There was a problem hiding this comment.
Why do we need these lists here? What specifically in regards to void and p closing tags is needed here, and why? Is there a bug in the WP_HTML_Tag_Processor, or crucial limitation?
It would be great if you could clarify the need for all this in relation to this module. Right now I'm wondering whether this module should need all those functionalities - like why are those special tags relevant for image optimization?
There was a problem hiding this comment.
These are needed to be able to iterate over tags in the document and be able to correctly keep track of the open tag stack. The WP_HTML_Tag_Processor is intentionally very limited in what processing it supports. Namely, it does not have any sense of how tags are related to each other (e.g. that html is a parent of body). It is a streaming parser that iterates over open tags and close tags. In order to keep track of when one tag is a child of another tag, we have to know the semantics of these tags, whether they are self-closing (void) or how they relate to tags that have optional closers (e.g. here p).
The superset of functionality from WP_HTML_Tag_Processor is in WP_HTML_Processor which does have capabilities regarding document structure. However, it can't be used yet because it doesn't support all HTML tags yet. Eventually this class shouldn't be needed as WP_HTML_Processor. I've added a comment to reflect this in 0cca4dd.
There was a problem hiding this comment.
And yeah, these lists of tags are all related to image optimization in that we need to know them in order to accurately construct breadcrumbs for any given tag.
There was a problem hiding this comment.
Thanks, that makes sense. I wonder how this version of the tag processor is for performance.
For the plugin we don't need to worry about this too much, but for a potential core proposal we would need to spend more time on this - particularly as I'm not sure we would want two separate versions of the tag processor in core. And we will only be able to use this one throughout if it doesn't have notable negative performance implications.
There was a problem hiding this comment.
It should have about the same performance as WP_HTML_Tag_Processor as it is just iterating over tags and is doing much less than WP_HTML_Processor, which is less predictable for performance.
| } | ||
| if ( | ||
| children.includes( | ||
| document.querySelector( '.skip-link.screen-reader-text' ) |
There was a problem hiding this comment.
Can you add a TODO comment? :)
@felixarntz The custom tag processor is needed in order to be able to obtain breadcrumbs for any given element. Without it, we wouldn't be able to apply optimizations for elements identified during detection. |
…cessor is fully implemented
| function ilo_construct_breadcrumbs_string( array $breadcrumbs ): string { | ||
| return implode( | ||
| ' ', | ||
| array_map( | ||
| static function ( $breadcrumb ) { | ||
| return sprintf( '%s,%s', $breadcrumb['tag'], $breadcrumb['index'] ); |
There was a problem hiding this comment.
In another PR I'm working on, I'm going to make this into an XPath constructor.
| function ilo_construct_breadcrumbs_string( array $breadcrumbs ): string { | |
| return implode( | |
| ' ', | |
| array_map( | |
| static function ( $breadcrumb ) { | |
| return sprintf( '%s,%s', $breadcrumb['tag'], $breadcrumb['index'] ); | |
| function ilo_construct_breadcrumbs_xpath( array $breadcrumbs ): string { | |
| return implode( | |
| '', | |
| array_map( | |
| static function ( $breadcrumb ) { | |
| return sprintf( '/%s[%d]', $breadcrumb['tag'], $breadcrumb['index'] ); |
| * @param array<array{tag: string, index: int}> $breadcrumbs Breadcrumbs. | ||
| * @return string Breadcrumb string. | ||
| */ | ||
| function ilo_construct_breadcrumbs_string( array $breadcrumbs ): string { |
There was a problem hiding this comment.
Function moved to ILO_HTML_Tag_Processor method in #892
Co-authored-by: Felix Arntz <felixarntz@users.noreply.github.com>
felixarntz
left a comment
There was a problem hiding this comment.
@westonruter Thanks for the updates, LGTM!
|
It would be nice if pull requests that aren't targeting |
|
@adamsilverstein or @swissspidy could either of you drop in a LGTM? |


Summary
Fixes #872 as part of #869.
This implements the optimization of image loading based on the client-side detection (#876) that has been stored (#878).
This is fully functional and able to be tested in a WP install.
Relevant technical choices
When a response is eligible for being optimized, an output buffer filter is added to apply optimizations. This involves looking up the
ilo_url_metricspost containing the stored URL metrics, and when present groupong the URL metrics into the breakpoints (480px, 600px, 782px). If all breakpoints have URL metrics collected and all share the same image as LCP element, thenfetchpriority=highis set on this image while being removed from all other images. If all breakpoints don't have URL metrics collected yet, then thefetchpriorityattribute added by server-side heuristics is not removed. Certain data attributes are also added to elements so we can track what operations are being performed:data-ilo-added-fetchpriorityfetchpriorityattribute is added to an element.data-ilo-removed-fetchpriorityfetchpriorityattribute is removed from an element.data-ilo-fetchpriority-already-addedfetchpriority=highon the LCP image, then thedata-ilo-removed-fetchpriorityattribute is added.data-ilo-added-tagpreloadlinks which are injected at the end of thehead.When the
fetchpriorityattribute is added, adata-ilo-added-fetchpriorityattribute is also added so we can keep track of what operations it is doing.Whenever there are URL metrics available,
preloadlinks are also added for the LCP image in each breakpoint. This is particularly important when different breakpoints have different LCP elements (even none at all), as in such cases thefetchpriorityattribute must be removed from each breakpoint-specific LCP image element. This is also useful for when breakpoints don't have URL metrics collected yet, as otherwisefetchpriority=highcould be erroneously added to the wrong image for the current viewport.In order to apply the
fetchpriorityoptimizations to the page and to collect the attributes fromIMGelements for use in preload links, theWP_HTML_Tag_Processorclass is leveraged, although it is not used directly. AnILO_HTML_Tag_Processorclass is used which incorporatesWP_HTML_Tag_Processor(by composition instead of extension). This processor is similar toWP_HTML_Tag_Processorexcept it is more constrained. It provides a generator method calledopen_tags(); while iterating over this generator, the breadcrumbs for the current open tag can be obtained via theget_breadcrumbs()method. Otherwise, the following methods from the underlyingWP_HTML_Tag_Processorare exposed:get_attribute()set_attribute()remove_attribute()get_updated_html()Importantly, the
next_tag()andseek()methods are not exposed since in order to correctly compute breadcrumbs the processor must proceed forward through the entire document visiting all open and closing tags.Tip: The following command deletes all
ilo_url_metricsposts:npm run wp-env run cli -- post delete --force $( npm run --silent wp-env run cli -- post list --post_type=ilo_url_metrics --format=ids 2> /dev/null )Other changes
mobile,small, andmediumbreakpoints that are defined in Gutenberg's_breakpoints.css.GETrequest (e.g. aPOSTrequest)..skip-link.screen-reader-textelement, which throws off generation of breadcrumbs on the client to match breadcrumbs generated on the server. In reality, the generation of breadcrumbs client-side is fragile and in a subsequent PR I intend to rely on server-side breadcrumb generation exclusively.preflab_prefixes have been replaced withilo_.tagNamekey has been replaced withtagin breadcrumbs.Checklist
[Focus]orInfrastructurelabel.[Type]label.no milestonelabel.