@ndossche
Member

Fixes GH-10634

We don't rely on re2c's bounds checking mechanism because re2c:yyfill:check = 0; is set. Instead, YYFILL just returns 0 when the scanner reads past the end of the input. The comment regexes used to use the "any character" wildcard.
That means that when a comment regex ran past the end of the input, we had no way of noticing: the 0 bytes were simply treated as part of the token. Since a 0 byte is already treated as end-of-file, we can just exclude it in those regexes.
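
To illustrate the overrun, here is a minimal C sketch. It is not the php-src scanner code: `Cursor`, `byte_at`, and `scan_block_comment` are hypothetical names, `byte_at` only mimics a fill routine that yields 0 past the end of the input, and `max_pos` is a cap that exists purely so the wildcard variant terminates in the demo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input cursor; not the real scanner state. */
typedef struct {
    const char *buf;
    size_t len;
    size_t pos;
} Cursor;

/* Mimics a fill routine that yields 0 for any read past the end. */
static unsigned char byte_at(const Cursor *c, size_t i) {
    return i < c->len ? (unsigned char)c->buf[i] : 0;
}

/* Scans a block comment body up to and including its closing delimiter.
 * reject_nul == 0 models the "any character" wildcard,
 * reject_nul != 0 models a class that excludes the 0 byte.
 * Returns 1 on a match, 0 otherwise. */
static int scan_block_comment(Cursor *c, int reject_nul, size_t max_pos) {
    assert(c != NULL);
    while (c->pos < max_pos) {
        unsigned char b = byte_at(c, c->pos);
        if (reject_nul && b == 0) {
            return 0; /* end of input reached: stop without consuming more */
        }
        if (b == '*' && byte_at(c, c->pos + 1) == '/') {
            c->pos += 2; /* consume the closing delimiter */
            return 1;
        }
        c->pos++; /* the wildcard variant also "consumes" the fake 0 bytes */
    }
    return 0;
}
```

With an unterminated comment, the wildcard variant keeps advancing `pos` beyond `len` until the demo cap stops it, while the variant that rejects 0 stops exactly at the end of the input.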

For the regexes that involve newlines, I had to add not only \x00 to the denylist but also \n and \r. Otherwise the negated class would greedily match the line terminators as well and let a single-line comment run over multiple lines.
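
The greedy-match problem can be sketched in C as follows. This is an illustration under assumptions, not the actual regex or scanner code: `greedy_run` is a hypothetical helper that measures how many leading bytes a negated character class would consume, with `deny_newlines` switching between a class that excludes only \x00 and one that also excludes \n and \r.

```c
#include <stddef.h>

/* Hypothetical helper: length of the longest prefix of s whose bytes all
 * belong to the negated class. The 0 byte is always excluded; \n and \r
 * are excluded only when deny_newlines is nonzero. */
static size_t greedy_run(const char *s, size_t len, int deny_newlines) {
    size_t n = 0;
    while (n < len) {
        unsigned char b = (unsigned char)s[n];
        if (b == 0) {
            break; /* end-of-input sentinel is never part of the token */
        }
        if (deny_newlines && (b == '\n' || b == '\r')) {
            break; /* a single-line comment ends at the line terminator */
        }
        n++;
    }
    return n;
}
```

For the 11-byte input "# note\nnext", the class that excludes only the 0 byte consumes the whole input, including the second line, whereas the class that also excludes \n and \r stops after the 6 bytes of the comment.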

Member

@iluuu1994 left a comment

Thank you!

@iluuu1994
Member

Perfect, thank you! Feel free to merge. Maybe you could also add a short comment on why the \x00 is there for the next person that encounters it :)

@ndossche ndossche merged commit ac99645 into php:master Mar 17, 2023

Development

Successfully merging this pull request may close these issues.

Lexing memory corruption