Skip to content

Conversation

@isidentical
Copy link
Member

@isidentical isidentical commented Oct 31, 2020

{
Token *t = _PyPegen_expect_token(p, NAME);
if (t == NULL) {
if (t == NULL || p->error_indicator) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can _PyPegen_expect_token() return non-NULL if error is set?

Either _PyPegen_name_token() is called when error is set (it should not, it should be checked earlier), or the problem is in _PyPegen_fill_token().

Add assert(!PyErr_Occurred()) at the top of _PyPegen_expect_token() and if it does not crash, search bug in _PyPegen_fill_token().

Copy link
Member Author

@isidentical isidentical Oct 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can _PyPegen_expect_token() return non-NULL if error is set?

As far as I understand, _PyPegen_expect_token returns NULL on 2 different cases. If there is an error, it returns NULL and if the given type is not matched (e.g: NAME given but NUMBER is the next token) it also returns NULL. But on the latter, it returns NULL without setting an error. So when parser is trying alternatives and backtracking, it assumes it got a different type of token and continues.

One thing that I thought was adding

    if (p->error_indicator) {
        return NULL;
    }

at the begging of _PyPegen_expect_token but I failed to find another case where the error didn't propagate except the one that author suggested so this might be enough.

self._check_error(code, "invalid syntax")

def test_unexpected_line_continuation(self):
self._check_error('A.\u018a\\ ', "unexpected character after line continuation character")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would test two cases: \\\n and \\ followed by non-\n character. They are two slightly different errors.

Instead \u018a you can use any non-ascii character.

@lysnikolaou
Copy link
Member

I think that this is too specialized a fix for this specific situation. I've investigated a bit more in depth and I think I found a problem there is with left-recursive rules, not handling p->error_indicator correctly. I've got a patch ready. @isidentical Would it be okay to close this PR?

@isidentical
Copy link
Member Author

Sure!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants