-
-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Improving rooted_tree_isomorphism for deep trees #7945
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving rooted_tree_isomorphism for deep trees #7945
Conversation
rossbar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @amcandio - switching from recursive->iterative implementation is indeed an improvement as the test indicates!
Just a couple suggestions for minor (potential) improvements.
Also since this "helper function" is only used in one place, we could also consider removing it and just moving the implementation to the appropriate location inside rooted_tree_isomorphism. But that can be handled/proposed in a follow-up PR if you prefer to focus on these changes here!
| # Start with the initial pair in the stack | ||
| stack = [(v, w)] | ||
| while stack: | ||
| curr_v, curr_w = stack.pop() | ||
| assert curr_v < curr_w | ||
| M.append((curr_v, curr_w)) | ||
| # Zip children and push them in reverse order to maintain processing order. | ||
| stack.extend( | ||
| zip(reversed(ordered_children[curr_v]), reversed(ordered_children[curr_w])) | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also +1 for switching to an iterative implementation. Another minor thought though: having to reverse the children every time is cumbersome. Perhaps a deque with popleft would do the trick?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reverse is needed so my change doesn't modify any previous behavior (i.e. isomorphism mapping is a list and I'm maintaining the order). But if people is okay with changing that I can replace it with a nicer BFS implementation like:
def generate_isomorphism(v, w, ordered_children):
ret = [(v, w)]
next_index = 0
while next_index < len(ret):
curr_v, curr_w = ret[next_index]
assert curr_v < curr_w
next_index += 1
ret.extend(zip(ordered_children[curr_v], ordered_children[curr_w]))
return retTo be honest, I don't think any one should depend on the order of the isomorphism list but you never know
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point was you can do away with the reversed by using a deque to implement the stack instead of a list - here's a quick example:
>>> from collections import deque
>>> lstack = [] # the way it is now
>>> lstack.extend(reversed(range(3)))
>>> while lstack:
... print(lstack.pop())
0
1
2
# Now, with a deque
>>> dstack = deque()
>>> dstack.extend(range(3)) # extend with objects, no reversal
>>> while dstack:
... print(dstack.popleft()) # Use pop-left to implement FIFO
0
1
2There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem is that previous recursive implementation naturally does a DFS which is LIFO. The deque will make the algo do a BFS which will change the output order.
To avoid the reversal we would need a stack of dequeues so we do LIFO on the different levels of the tree and FIFO on nodes with the same parent:
from collections import deque
stack = [deque([(v, w)])]
while stack:
last = stack[-1]
if not last:
stack.pop()
continue
curr_v, curr_w = last.popleft()
M.append((curr_v, curr_w))
stack.append(deque(zip(ordered_children[curr_v], ordered_children[curr_w]))I can make this change but might be better to drop the order backwards compatibility and simplify the logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about?
[Edit: It's actually slower and has to create a list in the middle -- so probably more memory.... ]
Nevermind! :{
while stack:
curr_v, curr_w = stack.pop()
M.append((curr_v, curr_w))
stack.extend(
reversed(list(zip(ordered_children[curr_v], ordered_children[curr_w])))
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I thought about that and didn't do it bc of the extra memory. Also reversed(zip(...)) throws TypeError:
In [60]: reversed(zip([1,2,3], [1,2,3]))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In [60], line 1
----> 1 reversed(zip([1,2,3], [1,2,3]))
TypeError: 'zip' object is not reversibleThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay, I see what you mean now - I missed that there are two structures defined outside the scope of the function that are being accessed inside (all the more reason to get rid of the helper function definition IMO!)
In that case, I think it's slightly cleaner if we instead move the reversal to the sorted call on L193 where the ordered_children is originally populated. It's probably also worth adding a comment along with the reverse=True to explain why, something like: # reverse=True to preserve DFS order - see gh-7945.
Of course - this is still a nit, not a blocker!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah that's a good idea. Will address in a follow-up PR
Making test_long_paths_graphs more performant Co-authored-by: Ross Barnowski <rossbar@caltech.edu>
Thanks! By this you mean just declaring the function inside of |
|
Replied to comments, I think they make sense. Is it okay if I take care of them in a separate PR? |
|
I think he meant: just move the guts of the Also, this helper function is not part of the public API -- it is just old enough that it doesn't have a leading underscore. So we are not worried about users who might be using it directly. Finally, there is an Thanks for this! |
|
Got it! Any concern about changing the order of isomorphism list? |
While it's probably not super important, it's also easy to preserve so we should do so - see my comment above about |
rossbar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @amcandio ! I approve this as my comments to this point are nits not blockers and can be addressed separately.
dschult
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I approve this as well.
Convert
generate_isomorphismto an Iterative ImplementationI was looking at tree isomorphism logic when reviewing #7929 and realized implementation was using a recursive approach to compute the node mapping. This can be problematic with deep trees because it can lead to
RecursionErrorexceptions.@rossbar since you are already working on this module please let me know if you are already working on this particular improvement. Happy to drop this PR!
Summary
This PR refactors the
generate_isomorphismfunction to replace its recursive implementation with an iterative approach using an explicit stack. The change is intended to prevent potential stack overflow issues when dealing with deeply nested trees (e.g., long path graphs).Changes Made
generate_isomorphismfrom a recursive function to an iterative one using a stack.Reason for Change
The recursive implementation could cause a
RecursionErrorfor deep trees, such as long path graphs. This issue was observed in test cases with large trees (e.g.,test_long_paths_graphs). By converting the function to an iterative version, we avoid excessive recursion depth and improve performance for large inputs.Implementation Details
Previous recursive version:
New iterative version:
Testing