Conversation
Fixes several issues with space handling:
1. "Control symbols" (as they're called in the TeXbook), such as `\\`, should
not have spaces eaten after them (only "control words" such as `\foo`).
2. In math mode, spaces should be consumed at the parser level, not the
gullet level. This enables `\\ [x]` to parse differently from `\\[x]`
3. Eat spaces between arguments, so `\frac x y` still works.
(This used to work only because math mode ate all spaces.
The analog in text mode wouldn't have worked.)
|
It looks like there's an issue with the |
* Add atom test. * Also use consumeSpaces helper more.
|
@kevinbarabash Thanks for spotting that. Indeed, it was a bug, and I added a test for it. Spaces needed to be checked for (and ignored) also in the sup/subscript handling of |
| const commentRegex = new RegExp(commentRegexString); | ||
| // tokenRegex has no ^ marker, as required by matchAt. | ||
| // These regexs are for matching results from tokenRegex, | ||
| // so they do have ^ markers. |
| */ | ||
| consume() { | ||
| this.nextToken = this.gullet.get(this.mode === "math"); | ||
| this.nextToken = this.gullet.get(false); |
There was a problem hiding this comment.
Since this is the only place where gullet.get is called and its parameter is always the same, maybe we should get rid of the parameter.
There was a problem hiding this comment.
Yes, I wanted a second opinion on this. As I wrote above: "we never use MacroExpander.get with a true argument, which could simplify the code of both get and unget." We no longer need any of the space saving/restoring mechanics, so I'll get rid of that.
"It also means that switchMode no longer does anything useful." (in the parser) This one I'm less sure of. Maybe switchMode would be useful in the future, e.g. if we can ever tweak catcodes in other ways? (e.g. verbatim or url modes?)
There was a problem hiding this comment.
If you think switchMode will come in handy in the future let's keep it, but please add a comment as to why we're keeping it.
There was a problem hiding this comment.
I think we still need switchMode because we store the current mode in the parse nodes and that value gets used in the builders. Ignore my comment about adding a comment.
|
|
||
| it("should allow cells in brackets", function() { | ||
| expect("\\begin{aligned}[a]&[b]\\\\ [c]&[d]\\end{aligned}") | ||
| .toParse(); |
There was a problem hiding this comment.
And if there's no space between \\\\and [c] would it try to parse [c] as a measurement?
There was a problem hiding this comment.
Yes. This matches LaTeX behavior, based on testing. (\@ifnextchar[ must get a space instead of a [.) I'll add an error test for the no-space case.
test/katex-spec.js
Outdated
| }); | ||
|
|
||
| it("should consume spaces after control-word function", function() { | ||
| compareParseTree("\\text{\\KaTeX x}", "\\text{\\KaTeX\\relax x}"); |
There was a problem hiding this comment.
So these two things parse the same, but I thought control words weren't allowed inside \text{}. Why is it important that these two parse the same?
There was a problem hiding this comment.
I wanted to make sure that \text{\KaTeX x} does not render a space. Ideally I'd say that \text{\KaTeX x} renders the same as \text{{\KaTeX}x} but that generates another group... perhaps I should tweak the test to actually look for features in the parse tree. (Control words like \KaTeX definitely work in text mode.) Ah, I can just test for \text{\KaTeX } vs. \text{\KaTeX}.
There was a problem hiding this comment.
I got an error when I tried \text{\KaTeX} on the demo page so maybe it's just some control words don't work.
| }); | ||
|
|
||
| // The following is not currently possible to get working, given that | ||
| // functions and macros are dealt with separately. |
There was a problem hiding this comment.
What's the current behavior? Can you open an issue for this?
There was a problem hiding this comment.
Opened #924. Also added some more comments in the code about this.
|
Thanks for the review! All comments should be addressed in the two new commits. |
k4b7
left a comment
There was a problem hiding this comment.
LGTM. Thanks for the additional tests and comments.
Continuation of KaTeX#912
|
Oops, I merged this before simplifying |
* Simplify get() now that we don't need it to ignorespaces Continuation of #912 * Remove commented-out code * Drop get() alias, rename unget() to pushToken(), use it
This PR fixes several issues with space handling, in particular fixing #910:
\\, shouldnot have spaces eaten after them (only "control words" such as
\foo).gullet level. This enables
\\ [x]to parse differently from\\[x]\frac x ystill works.(This used to work only because math mode ate all spaces.
The analog in text mode wouldn't have worked.)
Notably,
Parser.consumeno longer eats spaces when in math mode; rather, I get rid of them inparseExpression(which seems more natural -- though there is a worry that some code somewhere assumed that spaces would already be consumed). This implies that we never useMacroExpander.getwith atrueargument, which could simplify the code of bothgetandunget. It also means thatswitchModeno longer does anything useful. Is it worth removing any of this code? I'm unsure.