Make UAs support named character references in all XML docs#2056
sideshowbarker wants to merge 2 commits into main
|
So this is basically #500. I'm supportive of this change, but it requires interest from implementers. |
384f772 to 24d620a
As far as I understand #500, it’s proposing adding a new public and/or system identifier in addition to XHTML1 and MathML identifiers the spec already lists (and which this PR removes). The change in this PR wouldn’t require authors to specify any special identifier at all (other than just That is, browsers would just always consistently recognize named character references in any document that ends up in the DOM with a document element in the HTML namespace—regardless of how it got there. |
|
Sorry, yeah, I understood your change. I meant that this change would remove the need for the request in that issue. |
|
I think if this change is OK, it should be OK to go all the way and just enable the entities always in XML. Why should they not work in SVG, Atom, etc? |
|
Oh yeah, I had not noticed this was limited somehow. I'm not even sure that limitation works in practice. |
But the HTML spec just defines requirements for HTML documents, right? It doesn’t (yet) anywhere attempt to define requirements for SVG or Atom or any other XML vocabularies. So as far as this change in this PR, are you thinking that we could/should drop the part that says “If the document element of a Document is in the HTML namespace”? And instead have it just say to do that regardless of what the namespace is? If so, IMHO it would be out of scope for the HTML spec to try to state that requirement for all of XML. And I think others (e.g., the SVG WG) would likely object to the HTML spec stating it. |
|
The current bit about DOCTYPEs already affects all of XML. |
Yeah I realize that now. Hadn’t thought it through before I responded. Anyway I’m not personally opposed to making the spec say the entities should work in UAs for any XML. So I’ll take a shot at refining the patch here to actually say that. |
The section the requirement is currently in is the “The XHTML syntax” section, which is specifically about XHTML and not about XML in general. So I am looking right now for where else in the spec to move it to. In the meantime, guidance is welcome. |
|
Well, it says "Parsing XHTML documents" but then it actually defines "XML parser" within that section. Arguably we should rename those sections to more accurately reflect what is going on. Now that nobody really cares about XHTML anymore, that might be easier to do. |
OK yeah I’ve never been fond of continuing to forever call these documents “XHTML documents”. Among other reasons, I think when most authors see the term “XHTML document” they think it is still talking about XHTML1, not about anything we define in the current HTML spec. I think it would be better to instead consistently use something precise like “HTML documents served with an XML MIME type” that makes it clear and unambiguous what we actually mean, and to forever retire the term “XHTML” as far as spec usage goes. Anyway I would be glad to retitle the entire “The XHTML syntax” section and to rework (or move) the contents of it, but it seems the change in this PR doesn’t need to wait on that. So for now in 2303f8d I moved the requirement about the entities to the Page load processing model for XML files. Lemme know if that works. |
|
I think the XML parser section is a better fit. Otherwise this would not work for XMLHttpRequest for instance. |
2303f8d to b5b9bc6
OK b5b9bc6 restores it to there. (And #2062, which can land first without this, actually turns the section into the XML parser section, by replacing XHTML in the section titles with just XML). |
|
I know @dominiccooney was working on XML in Blink. Maybe @hsivonen is the correct person to ask for Gecko? Any implementer interest? Seems like a nice simplification. |
URL given by this link</a> (this URL is a DTD containing the <a
href="https://www.w3.org/TR/xml/#sec-entity-decl">entity declarations</a> for the names listed in
the <span>named character references</span> section), and should not attempt to retrieve any other
external entity's content. <ref spec=XML></p>
Seems we should make this a MUST if we do this?
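To make the "DTD containing the entity declarations" idea in the quoted spec text concrete, here is a rough Python sketch of the same mechanism done by hand: it generates entity declarations and supplies them as an internal DTD subset, so an ordinary XML parser resolves the names without fetching anything. Everything here is an illustration and not what any browser does: `entity_subset` and `parse_with_named_refs` are hypothetical helper names, the standard library's expat-based `xml.etree.ElementTree` stands in for a UA's XML parser, and `html.entities.name2codepoint` only covers the HTML 4 names, not the full named character references list.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities XML predefines; they are left alone rather than redeclared.
PREDEFINED = {"lt", "gt", "amp", "quot", "apos"}


def entity_subset() -> str:
    """Build a DOCTYPE whose internal subset declares the HTML 4 entity names."""
    decls = [
        f'<!ENTITY {name} "&#{codepoint};">'
        for name, codepoint in sorted(name2codepoint.items())
        if name not in PREDEFINED
    ]
    return "<!DOCTYPE html [" + "".join(decls) + "]>"


def parse_with_named_refs(body: str) -> ET.Element:
    """Parse markup that has no XML declaration or DOCTYPE of its own
    (an assumption of this sketch), with the entity declarations prepended."""
    return ET.fromstring(entity_subset() + body)


p = parse_with_named_refs(
    '<p xmlns="http://www.w3.org/1999/xhtml">caf&eacute;&nbsp;&copy;</p>'
)
print(repr(p.text))
```

This stays within XML conformance because the entities really are declared before they are referenced; the open question in this thread is what should happen when no such declarations are present in the document at all.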
|
I think the text in the PR isn't specific enough to explain what exactly is expected to happen. @sideshowbarker's comment here indicates that instead of this being an entity resolver hack within the constraints of XML conformance, this would involve patching an XML parser not to be conforming to XML. I'm reluctant to proceed for four reasons.
|
|
I suggest we close this. @sideshowbarker? |
Yup |
Authors choosing to serve HTML documents with an XML MIME type shouldn’t forever be forced to put an obsolete XHTML1 doctype in their documents just to be able to use named character references.
This relates to #2048 but is separated out in the interest of ensuring this affects-browsers change doesn’t get overlooked by implementors in the midst of the doesn’t-affect-browsers changes in #2048. But I can fold it into #2048 if we think it’s more valuable to have all these changes in one PR.
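For anyone who wants to see the underlying constraint without a browser, here is a minimal Python sketch. It assumes the standard library's expat-based `xml.etree.ElementTree` is a reasonable stand-in for a non-validating XML parser (browser XML parsers differ in detail), and `parses` is a hypothetical helper written only for this illustration. It shows that a named reference such as `&nbsp;` is rejected when nothing declares it, while the equivalent numeric reference is always accepted.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def parses(markup: str) -> bool:
    """Return True if the markup is accepted by the XML parser."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# No DOCTYPE at all: nothing declares the "nbsp" entity.
no_doctype = f'<p xmlns="{XHTML_NS}">a&nbsp;b</p>'

# Numeric character references never need a declaration.
numeric = f'<p xmlns="{XHTML_NS}">a&#160;b</p>'

print(parses(no_doctype))  # rejected: reference to an undeclared entity
print(parses(numeric))     # accepted
```

The sketch only demonstrates why authors currently need some doctype that brings the entity declarations in; it says nothing about how a UA would implement the change proposed here.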
💥 Error: Wattsi server error 💥
PR Preview failed to build. (Last tried on Jan 15, 2021, 7:57 AM UTC).