bpo-32874: IDLE: add tests for pyparse #5755

Merged

terryjreedy merged 2 commits into python:master from csabella:pyparse

Feb 22, 2018

Contributor

csabella commented Feb 19, 2018 •

edited by bedevere-bot

https://bugs.python.org/issue32874


          bpo-32874: IDLE: add tests for pyparse

3829e67

csabella requested a review from terryjreedy as a code owner

February 19, 2018 16:02

the-knights-who-say-ni added the CLA signed label

bedevere-bot added the awaiting review label

Member

terryjreedy commented Feb 19, 2018

I am reviewing now and expect to push at least a few changes. Coverage is currently an excellent 97%. The misses are mostly conditions within Parse methods. If a check is really needed, and we understand why, then adding a test case that goes the other way should be easy. If I see how, I will add such.

149 def find_good_parse_start:
173 for tries in range(5):  # Never jumps? to 183.  This note on for statement is new and unclear.
193 if m and not is_char_in_string(m.start()): # Never true, next line never executed.

229 def _study1(self): 
272 if level == 0:  # (if not level? always true
282 if level:   # always true
308 if w == 0:  # always true, 313 continue never executed
310 if level == 0:  # always true
339 continue # supposedly not executed, but must be when prior assert passes

533 def compute_backslash_indent(self): 
557 if level:  # always true
562 elif ch == '#':  # never true, hence break never executed

Contributor Author

csabella commented Feb 20, 2018

For the coverage, I couldn't figure out how to get that last call to is_char_in_string to return false if the previous returns had been true. From the comments, the idea of the extra loop is to allow the colorizer to catch up; meaning, on the first call it might be true, but a few milliseconds later it might be false, even calling on the same string and index. Since is_char_in_string is a function call passed to find_good_parse_start, the value returned would have to be non-deterministic. I thought of having the function flip itself after the first 5 tries (to get through the for loop) and, with a side effect, that could probably be done. I just didn't know if that was the right approach.
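The "flip itself after the first 5 tries" idea could be sketched as a stub with a call counter. This is only an illustration: `make_flipping_stub` is a hypothetical helper, not part of the PR, and how many times find_good_parse_start actually calls the stub before the final check is an assumption that would need to be verified against the code.

```python
def make_flipping_stub(true_calls):
    """Return a stand-in for is_char_in_string (hypothetical helper).

    The stub answers True for the first `true_calls` calls and False
    afterwards, imitating a colorizer that has "caught up".
    """
    calls = 0

    def is_char_in_string(index):
        nonlocal calls
        calls += 1          # side effect: remember how often we were asked
        return calls <= true_calls

    return is_char_in_string


stub = make_flipping_stub(5)
answers = [stub(33) for _ in range(7)]  # same index every time
```

The answer depends only on the number of calls, not on the index, which is what makes the stub look non-deterministic to the caller.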

Member

terryjreedy commented Feb 20, 2018

I won't push anything tonight. I spent most of my time understanding the parse attribute initialization 'bug' and opening a follow-up issue. In part, I wanted to know if it could lead to buggy behavior for certain types of input.

Member

terryjreedy commented Feb 20, 2018

OK, reviewing again. I will try to not get sidetracked by possible code changes.


          minor changes

f39d51f

terryjreedy approved these changes

Member

terryjreedy left a comment

I am mostly ready to merge this and move on to editing pyparse. But I would like an answer to the #nospace puzzle if you have one, even if I merge first.

Lib/idlelib/idle_test/test_pyparse.py

    
                      eq(start(is_char_in_string=lambda index: index >= 44), 33)

                      # If everything before the 'def' is in a string, then returns None.

                      # The non-continuation def line returns 44 (see below).

                      eq(start(is_char_in_string=lambda index: index < 44), None)

Member

terryjreedy Feb 21, 2018

Since 44 < 44 is False, this strikes me as a bug, though I do not yet see why None rather than 44 is returned.

Contributor Author

csabella Feb 21, 2018

In the tries for loop, the first thing it does is rfind for ':\n', so in this example, that's the end of the def statement (pos 109). It then finds the start of the line as rfind('\n'), which does not return the start of the line that actually has the def on it (it returns pos 70 instead of 44). _synchre doesn't find a keyword, so it gets ready to loop again. On the next loop, it again does rfind for ':\n', which is the end of the class statement. But the class statement is in a string (pos 33), so that's no good. This means it enters if pos is None:. Since that invokes _synchre from the beginning of the string, it again finds class and again rejects it.

One goal of this routine seems to be to find a better start point as quickly as possible. It sacrifices 'being perfect' for just trying a few (5) times. So, I don't know if this is a big deal that it doesn't find it. The downstream effect is that _study2 (and to a lesser extent _study1) might have to process more of the string instead of the 'best guess' of the tail of the string.
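A small sketch of the two rfind steps described above, using a made-up sample string rather than the one in the test. It only shows why the "start of colon line" can land on a continuation line instead of the line holding the def; it does not reproduce find_good_parse_start.

```python
# Made-up sample: the def statement spans two physical lines.
code = (
    'class C:\n'
    '    def f(self,\n'
    '          a):\n'
    '        pass\n'
)

# Step 1: find the last ':\n', which ends the def statement.
colon = code.rfind(':\n')

# Step 2: back up to the start of the physical line holding that colon.
line_start = code.rfind('\n', 0, colon) + 1

# Where the def keyword's line really starts.
def_start = code.index('    def')

# The physical line found in step 2 is the continuation line, not the def line.
found_line = code[line_start:code.index('\n', line_start)]
```

Here `found_line` is the `a):` continuation line, so a keyword search starting at `line_start` has no `def` to find.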

Member

terryjreedy Feb 22, 2018

Thanks. I missed the point that def is only found in the 'while 1' loop after backing up to class and searching forward. See msg312525 of bpo-32880 for more thoughts.

Lib/idlelib/idle_test/test_pyparse.py

    
                      (NONE, BACKSLASH, FIRST, NEXT, BRACKET) = range(5)

                      TestInfo = namedtuple('TestInfo', ['string', 'goodlines',

                                                         'continuation'])

Member

terryjreedy Feb 21, 2018

Just curious, why do you prefer namedtuples (which are fine) over tuples unpacked in the for loop header?

Contributor Author

csabella Feb 21, 2018

I originally did it with unpacking, but felt that the _study2 tests were unreadable because there were so many elements in the tuple. Adding the namedtuple name to the creation on each line seemed to do the trick on the readability by at least making each test line distinguishable. Once I converted that one and it didn't slow down the tests, I did the others for consistency. Switching back and forth between styles when reading the code seemed that it might require just a little more brain power and keeping it consistent let the brain know, 'ok, i've seen this before'. Aren't you glad you asked? lol :-)
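For comparison, a minimal sketch of the two styles being discussed, reusing two rows from the _study1 tests. The names `plain_tests` and `named_tests` are illustrative only.

```python
from collections import namedtuple

(NONE, BACKSLASH, FIRST, NEXT, BRACKET) = range(5)

# Style 1: plain tuples, unpacked in the for loop header.
plain_tests = (
    ('())\n', [0, 1], NONE),
    (')(\n', [0, 1], BRACKET),
)
plain_seen = []
for string, goodlines, continuation in plain_tests:
    plain_seen.append((string, continuation))

# Style 2: namedtuples, so each row is labeled where it is created.
TestInfo = namedtuple('TestInfo', ['string', 'goodlines', 'continuation'])
named_tests = (
    TestInfo('())\n', [0, 1], NONE),
    TestInfo(')(\n', [0, 1], BRACKET),
)
named_seen = []
for test in named_tests:
    named_seen.append((test.string, test.continuation))
```

Both loops visit the same data; the difference is only in how readable each row and each field access is.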

Member

terryjreedy Feb 22, 2018

good enough

Lib/idlelib/idle_test/test_pyparse.py

    
                      setstr = p.set_str

                      study = p._study1

                      (NONE, BACKSLASH, FIRST, NEXT, BRACKET) = range(5)

Member

terryjreedy Feb 21, 2018

I think I like these better than the C_NAMES in the file.

Contributor Author

csabella Feb 21, 2018

:-) Since this is from the early 2000s, maybe using C_ for constant instead of just caps was more of the accepted style? Should it be added to the TODO list?

Lib/idlelib/idle_test/test_pyparse.py

    
                          TestInfo('())\n', [0, 1], NONE),                    # Extra closer.

                          TestInfo(')(\n', [0, 1], BRACKET),                  # Extra closer.

                          # For the mismatched example, it doesn't look like contination.

                          TestInfo('{)(]\n', [0, 1], NONE),                   # Mismatched.

Member

terryjreedy Feb 21, 2018

I did not yet read the _study1 while loop, but these answers are intelligible, given the string, even though understanding the function definition will require looking at the use context.

Contributor Author

csabella Feb 21, 2018

From what I can tell _study1 is all about continuation type. Since the mismatch example isn't code that's likely to happen 'in the wild', I don't think it's likely to cause issues. But, if someone was playing around with the editor, they might be surprised that a non-matching bracket might close a '('.

For example, in an IDLE editor,

{'a': [1, 2, 3}

indents 1 space upon Enter. I know there's not a linter to mark it as a syntax error, but I think users might expect for the next line to align with the { because they thought they finished the dictionary.
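One way to picture this behavior is a bracket tracker that never compares bracket types. The function below is a hypothetical simplification, not pyparse's code: it ignores strings and comments and only shows how a '}' could end up closing a '['.

```python
def last_open_bracket(line):
    """Return the index of the innermost unclosed opener, or None.

    Hypothetical simplification: any closer pops the most recent
    opener, whatever its type.
    """
    stack = []
    for pos, ch in enumerate(line):
        if ch in '([{':
            stack.append(pos)
        elif ch in ')]}' and stack:
            stack.pop()  # bracket type is never checked
    return stack[-1] if stack else None


mismatched = last_open_bracket("{'a': [1, 2, 3}")
balanced = last_open_bracket("{'a': [1, 2, 3]}")
```

Under this simplification the '{' at index 0 is still considered open in the mismatched line, which would be consistent with the next line being aligned one column in.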

Member

terryjreedy Feb 22, 2018

See msg 312526. Except for possible bugs, mis-indents are a mark of syntax error ;-)

Lib/idlelib/idle_test/test_pyparse.py

    
                      tests = (

                          TestInfo('', 0, 0, '', None, ((0, 0),)),

                          TestInfo("'''This is a multiline continutation docstring.\n\n",

                                   0, 49, "'", None, ((0, 0), (0, 1), (49, 0))),

Member

terryjreedy Feb 21, 2018

The _study2 docstring says lastch is 'last non-whitespace char' before comment while comment within _study2 says 'last interesting char'. So I expected '.' But it appears in the code that chars within a string, after the initial quote, are uninteresting. I changed 'non-whitespace' to 'interesting', and may clarify that in next issue.

Lib/idlelib/idle_test/test_pyparse.py

    
                                   0, 11, '', None, ((0, 0), (0, 1), (11, 0))),

                          # A comment without a space is a special case

                          TestInfo('    #Comment\\\n',

                                   0, 0, '', None, ((0, 0),)),

Member

terryjreedy Feb 21, 2018

This strikes me as a bug. I made both samples start with either no spaces or one space and get the same results. For the 2nd, I verified that ch is never '#', so that "if ch == '#':" is never triggered, but I cannot see why or how it gets skipped. I am inclined to disable this case.

Contributor Author

csabella Feb 21, 2018 •

edited

It was for coverage. This regex doesn't work with a space after the #.

_junkre = re.compile(r"""
    [ \t]*
    (?: \# \S .* )?
    \n
""", re.VERBOSE).match
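A quick check of that regex against the two test strings shows the difference. The pattern is copied from the comment above; the variable names are illustrative.

```python
import re

_junkre = re.compile(r"""
    [ \t]*
    (?: \# \S .* )?
    \n
""", re.VERBOSE).match

with_space = ' # Comment\\\n'     # '#' followed by a space
without_space = ' #Comment\\\n'   # '#' followed directly by text

# \S requires a non-whitespace character right after '#', so the
# optional comment group cannot match '# Comment' and the final \n
# then fails against the '#'.
matches_with_space = _junkre(with_space) is not None
matches_without_space = _junkre(without_space) is not None
```

Only the line without a space after '#' is matched as junk, which is why the two TestInfo rows expect different results.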

Lib/idlelib/idle_test/test_pyparse.py

    
                          TestInfo('def function1(self, a):\n    pass\n', no),

                          TestInfo('# A comment:\n', no),

                          TestInfo('"""A docstring:\n', no),

                          TestInfo('"""A docstring:\n', no),

Member

terryjreedy Feb 21, 2018

These are mostly redundant and the only thing the function does beyond calling _study2 is switch on ':'

Contributor Author

csabella Feb 21, 2018

Yes, I was just showing that a : inside a comment or docstring doesn't count, in case the code was changed to do the check some way besides :.

Lib/idlelib/idle_test/test_pyparse.py

    
                      tests = (

                          TestInfo('', ((0, 0),)),

                          TestInfo('a\n', ((0, 0),)),

                          TestInfo('()()\n', ((0, 0), (0, 1), (2, 0), (2, 1), (4, 0))),

Member

terryjreedy Feb 21, 2018

I understand that the code produces 4 rather than 3 but I wonder if the caller really cares.
These really test _study2 more, but with cases where we only care about the bracketing.

Contributor Author

csabella Feb 21, 2018

I actually intended this test to show the level (second element) increase and decrease on closers. I didn't look at how it was used, but just tried to make tests for what it was returning.
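The level going up on openers and down on closers can be sketched with a much simplified tracker. `simple_bracketing` is a hypothetical stand-in, not _study2: it skips string and comment handling and only happens to agree with the three expected tuples shown above.

```python
def simple_bracketing(code):
    """Record (index, level) pairs at each bracket (simplified sketch)."""
    level = 0
    bracketing = [(0, 0)]
    for pos, ch in enumerate(code):
        if ch in '([{':
            level += 1
            bracketing.append((pos, level))      # level after opening
        elif ch in ')]}':
            if level:
                level -= 1
            bracketing.append((pos + 1, level))  # index just past the closer
    return tuple(bracketing)


empty = simple_bracketing('')
no_brackets = simple_bracketing('a\n')
two_pairs = simple_bracketing('()()\n')
```

In this sketch a closer is recorded at the index after it, which is where the final (4, 0) entry for '()()\n' comes from.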

Lib/idlelib/idle_test/test_pyparse.py

    
                              setstr(test.string)

                              test.assert_(closer())

                  def test_get_last_open_bracket_pos(self):

Member

terryjreedy Feb 21, 2018

I hope to delete this with the function later, but until then, will leave it.

bedevere-bot added awaiting merge and removed awaiting review labels

csabella commented

Lib/idlelib/idle_test/test_pyparse.py

    
                  def test_tran(self):

                      self.assertEqual('\t a([{b}])b"c\'d\n'.translate(self.parser._tran),

                                        'xxx(((x)))x"x\'x\n')

Contributor Author

csabella Feb 21, 2018

I was wondering if explicit tests should be added for that. I don't know if compressing a string like this is common so that it can be parsed faster, but I learned a lot from this technique.

Member

terryjreedy Feb 22, 2018

Abstractly, mappings implement functions (in the set-theoretic sense). Copying the code that produces the mapping is not terribly useful, but in this case, I discovered a single 'actual = expected' test.
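The compression technique can be sketched with a dict subclass passed to str.translate. `CompressTable` is a hypothetical stand-in for the parser's `_tran` mapping, and the exact set of preserved characters is an assumption; only the input and expected output are taken from the test above.

```python
class CompressTable(dict):
    """Translation table that maps every unlisted character to 'x'."""

    def __missing__(self, key):
        return ord('x')


table = CompressTable()
for ch in '([{':
    table[ord(ch)] = ord('(')   # all openers collapse to '('
for ch in ')]}':
    table[ord(ch)] = ord(')')   # all closers collapse to ')'
for ch in '"\'\\\n#':
    table[ord(ch)] = ord(ch)    # characters assumed to be kept as-is

compressed = '\t a([{b}])b"c\'d\n'.translate(table)
```

After translation only a handful of distinct characters remain, so later scanning has far fewer cases to distinguish.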

csabella commented

Lib/idlelib/pyparse.py

    
                          if i < 0:

                              break

                          i = str.rfind('\n', 0, i) + 1  # start of colon line

                          i = str.rfind('\n', 0, i) + 1  # start of colon line (-1+1=0)

Contributor Author

csabella Feb 21, 2018 •

edited

I thought this (-1+1=0) was a clever trick when I debugged it. :-)
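The trick relies on str.rfind returning -1 when nothing is found, so adding 1 yields index 0, the start of the text. A minimal illustration with made-up strings:

```python
# No newline before the colon: rfind returns -1, so -1 + 1 == 0.
first_line = 'if x:\n'
i = first_line.rfind(':\n')
start_first = first_line.rfind('\n', 0, i) + 1

# A newline exists before the colon line: the result is the index
# just past that newline.
later_line = 'a = 1\nif x:\n'
j = later_line.rfind(':\n')
start_later = later_line.rfind('\n', 0, j) + 1
```

Both cases produce the start of the colon line without a separate check for "not found".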

Member

terryjreedy commented Feb 22, 2018

The no-space puzzle I mentioned above is this one (I forgot to push the 'Add comment' button).

        TestInfo(' # Comment\\\n',
                 0, 12, '', None, ((0, 0), (1, 1), (12, 0))),
        # A comment without a space is a special case
        TestInfo(' #Comment\\\n',
                 0, 0, '', None, ((0, 0),)),

Why the difference?

terryjreedy added needs backport to 3.6 labels

terryjreedy merged commit c84cf6c into python:master

Contributor

miss-islington commented Feb 22, 2018

Thanks @csabella for the PR, and @terryjreedy for merging it 🌮🎉.. I'm working now to backport this PR to: 3.6, 3.7.
🐍🍒⛏🤖

bedevere-bot removed the awaiting merge label

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request


          bpo-32874: IDLE: add tests for pyparse (pythonGH-5755)

1bd57bb

There are no code changes other than comments and docstrings.
(cherry picked from commit c84cf6c)

Co-authored-by: Cheryl Sabella <[email protected]>

bedevere-bot commented Feb 22, 2018

GH-5803 is a backport of this pull request to the 3.7 branch.

bedevere-bot removed needs backport to 3.7 labels

bedevere-bot commented Feb 22, 2018

GH-5804 is a backport of this pull request to the 3.6 branch.

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request


          bpo-32874: IDLE: add tests for pyparse (pythonGH-5755)

6138a8e

There are no code changes other than comments and docstrings.
(cherry picked from commit c84cf6c)

Co-authored-by: Cheryl Sabella <[email protected]>

miss-islington added a commit that referenced this pull request


          bpo-32874: IDLE: add tests for pyparse (GH-5755)

c59bc98

There are no code changes other than comments and docstrings.
(cherry picked from commit c84cf6c)

Co-authored-by: Cheryl Sabella <[email protected]>

miss-islington added a commit that referenced this pull request


          bpo-32874: IDLE: add tests for pyparse (GH-5755)

52064c3

There are no code changes other than comments and docstrings.
(cherry picked from commit c84cf6c)

Co-authored-by: Cheryl Sabella <[email protected]>

Contributor Author

csabella commented Feb 22, 2018

Not sure if you saw my previous comment about the "no-space issue".

Why the difference?

If you're asking why there is a different result for the tests, it's because this regex doesn't work with a space after the #.

_junkre = re.compile(r"""
    [ \t]*
    (?: \# \S .* )?
    \n
""", re.VERBOSE).match

If your 'why' is why _junkre is written like that, I haven't been able to figure it out.

csabella deleted the pyparse branch

February 22, 2018 13:08

Member

terryjreedy commented Feb 23, 2018 •

edited

After looking at the effect on smart indenting, I decided that ignoring '#x' as junk is the wrong case. See bpo-32918

terryjreedy mentioned this pull request

IDLE: make smart indent after comment line consistent #77099

Open
