Rely on server-side breadcrumbs for detection and optimization#892
Conversation
|
@westonruter Would you like this PR to be reviewed and merged into the other PR's branch, or did you simply start work on this before and prefer #884 to be merged first? |
|
@felixarntz I'd like #884 to be merged first so that the PRs don't get excessively large. |
…ver-side-breadcrumbs * add/image-loading-optimization-optimizing: Add comment explaining why for loop is used Clarify LCP element language in comment
|
Note: I'm working on the tests for this in a sub-PR: #898. In the meantime, this PR can be reviewed (and merged). |
| * It would be nicer if this were like `/html[1]/body[2]` but in XPath the position() here refers to the | ||
| * index of the preceding node set. So it has to rather be written `/*[1][self::html]/*[2][self::body]`. |
There was a problem hiding this comment.
This being said, the breadcrumbs could be updated to keep track of the index of the tag of a given type rather than the index of the tag in general. This would make breadcrumbs more in line with the natural XPath syntax, so instead /*[1][self::html]/*[2][self::body] we could do just /html[1]/body[1].
Nevertheless, what is implemented now is what the :nth-child() pseudo-selector implements in CSS, as opposed to :nth-of-type().
But doing this would take some extra bookkeeping, and I don't see a clear benefit.
I also just came across a Chromium doc that is looking at the same problem space: Identifying element consistently across reloads.
felixarntz
left a comment
There was a problem hiding this comment.
This looks great to me! It's more reliable now that the source of truth is on the server - and while I don't like XPath, at least its usage here is simple enough (mostly just to compare against it) that I feel it doesn't make the code unapproachable. Also +1 to combining the logic for when to print the detection script to avoid duplicate logic.
Just a few minor points below.
| $breadcrumbs[] = array( | ||
| 'tag' => $breadcrumb_tag_name, | ||
| 'index' => $this->open_stack_indices[ $i ], | ||
| ); | ||
| yield array( $breadcrumb_tag_name, $this->open_stack_indices[ $i ] ); |
There was a problem hiding this comment.
Any particular reason for the data type change here? I find an associative array with named keys easier to understand than an indexed array that isn't really for a list of something.
There was a problem hiding this comment.
Mainly because it makes it more compact and slightly easier to iterate over per below:
foreach ( $this->get_breadcrumbs() as list( $tag_name, $index ) ) {
Originally too I had envisioned that a breadcrumbs could have other keys, like a class name or maybe some attributes to make it more like a CSS selector. But this didn't end up being the case, as we only ever need the tag name and index. So making it a tuple seems to make sense to me.
There was a problem hiding this comment.
I realize it's a few fewer lines of code, but it's harder to understand. Keys allow clarifying what something is, while now that's less apparent, particularly when looking at where the function is used and only finding list.
I'd prefer to go for clarity over compactness, but not a blocker as long as this function remains purely internal.
| if ( ! $post ) { | ||
| return $buffer; | ||
| } | ||
| $url_metrics = $post ? ilo_parse_stored_url_metrics( $post ) : array(); |
There was a problem hiding this comment.
Why remove the early return here? Isn't that more efficient if no URL metrics post is found?
There was a problem hiding this comment.
Good question. Because now if no URL metrics are found, we need to proceed and add breadcrumbs to the elements so that we can gather the URL metrics. You can see below too that the detection script is now injected in this function as well.
| ), | ||
| ), | ||
| ), | ||
| 'pattern' => '^(/\*\[\d+\]\[self::.+?\])+$', // See ILO_HTML_Tag_Processor::get_xpath() for format. |
There was a problem hiding this comment.
Just a suggestion: Make this a const on ILO_HTML_Tag_Processor since that's where the definition really comes from.
Summary
See #869.
Eliminate client-side computation of element breadcrumbs in favor of generating them server-side while processing the output buffer. Breadcrumbs are serialized as XPath and added to image elements as a
data-ilo-xpathattribute. This data attribute is used by the detection script instead of generating breadcrumbs client-side. Lastly, the detection script is now injected as part of the output buffer processing instead of a separatewp_footercode path.Relevant technical choices
This improves the reliability of connecting elements from client-side detection with server-side optimization by exclusively computing breadcrumbs XPath on the server. This eliminates the issue encountered in #884 in which JavaScript injecting a new element. Namely it address the following todo from #884 (comment):
As part of this, XPath strings are used throughout the module rather than comparing breadcrumbs arrays.
Additionally, instead of injecting the detection script at
wp_footerand then having some of the same conditions during output buffer processing for optimization, the detection script is now injected during optimization.Checklist
[Focus]orInfrastructurelabel.[Type]label.no milestonelabel.