Skip to content

Conversation

@amcandio
Copy link
Contributor

@amcandio amcandio commented Mar 29, 2025

Convert generate_isomorphism to an Iterative Implementation

I was looking at tree isomorphism logic when reviewing #7929 and realized implementation was using a recursive approach to compute the node mapping. This can be problematic with deep trees because it can lead to RecursionError exceptions.

@rossbar since you are already working on this module please let me know if you are already working on this particular improvement. Happy to drop this PR!

Summary

This PR refactors the generate_isomorphism function to replace its recursive implementation with an iterative approach using an explicit stack. The change is intended to prevent potential stack overflow issues when dealing with deeply nested trees (e.g., long path graphs).

Changes Made

  • Converted generate_isomorphism from a recursive function to an iterative one using a stack.
  • Ensured that child nodes are pushed onto the stack in reverse order to maintain the original left-to-right processing order.
  • Removed the previous recursive implementation.

Reason for Change

The recursive implementation could cause a RecursionError for deep trees, such as long path graphs. This issue was observed in test cases with large trees (e.g., test_long_paths_graphs). By converting the function to an iterative version, we avoid excessive recursion depth and improve performance for large inputs.

Implementation Details

Previous recursive version:

def generate_isomorphism(v, w, M, ordered_children):  
    assert v < w  
    M.append((v, w))  
    for x, y in zip(ordered_children[v], ordered_children[w]):  
        generate_isomorphism(x, y, M, ordered_children) 

New iterative version:

def generate_isomorphism(v, w, M, ordered_children):  
    stack = [(v, w)]  
    while stack:  
        curr_v, curr_w = stack.pop()  
        assert curr_v < curr_w  
        M.append((curr_v, curr_w))  
        stack.extend(  
            zip(reversed(ordered_children[curr_v]), reversed(ordered_children[curr_w]))  
        )  

Testing

  • Verified that all existing test cases pass
  • Added a new test test_long_paths_graphs to validate two long path graphs
  • Ensured that tree traversal order remains unchanged.

Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @amcandio - switching from recursive->iterative implementation is indeed an improvement as the test indicates!

Just a couple suggestions for minor (potential) improvements.

Also since this "helper function" is only used in one place, we could also consider removing it and just moving the implementation to the appropriate location inside rooted_tree_isomorphism. But that can be handled/proposed in a follow-up PR if you prefer to focus on these changes here!

Comment on lines +98 to +107
# Start with the initial pair in the stack
stack = [(v, w)]
while stack:
curr_v, curr_w = stack.pop()
assert curr_v < curr_w
M.append((curr_v, curr_w))
# Zip children and push them in reverse order to maintain processing order.
stack.extend(
zip(reversed(ordered_children[curr_v]), reversed(ordered_children[curr_w]))
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also +1 for switching to an iterative implementation. Another minor thought though: having to reverse the children every time is cumbersome. Perhaps a deque with popleft would do the trick?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reverse is needed so my change doesn't modify any previous behavior (i.e. isomorphism mapping is a list and I'm maintaining the order). But if people is okay with changing that I can replace it with a nicer BFS implementation like:

def generate_isomorphism(v, w, ordered_children):
    ret = [(v, w)]
    next_index = 0
    while next_index < len(ret):
        curr_v, curr_w = ret[next_index]
        assert curr_v < curr_w
        next_index += 1
        ret.extend(zip(ordered_children[curr_v], ordered_children[curr_w]))
    return ret

To be honest, I don't think any one should depend on the order of the isomorphism list but you never know

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point was you can do away with the reversed by using a deque to implement the stack instead of a list - here's a quick example:

>>> from collections import deque
>>> lstack = []  # the way it is now
>>> lstack.extend(reversed(range(3)))
>>> while lstack:
...     print(lstack.pop())
0
1
2
# Now, with a deque
>>> dstack = deque()
>>> dstack.extend(range(3))  # extend with objects, no reversal
>>> while dstack:
...     print(dstack.popleft())  # Use pop-left to implement FIFO
0
1
2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that previous recursive implementation naturally does a DFS which is LIFO. The deque will make the algo do a BFS which will change the output order.

To avoid the reversal we would need a stack of dequeues so we do LIFO on the different levels of the tree and FIFO on nodes with the same parent:

from collections import deque

stack = [deque([(v, w)])]

while stack:
    last = stack[-1]
    if not last:
        stack.pop()
        continue
    curr_v, curr_w = last.popleft()
    M.append((curr_v, curr_w))
    stack.append(deque(zip(ordered_children[curr_v], ordered_children[curr_w]))

I can make this change but might be better to drop the order backwards compatibility and simplify the logic

Copy link
Member

@dschult dschult Mar 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about?
[Edit: It's actually slower and has to create a list in the middle -- so probably more memory.... ]
Nevermind! :{

    while stack:  
        curr_v, curr_w = stack.pop()  
        M.append((curr_v, curr_w))  
        stack.extend(  
            reversed(list(zip(ordered_children[curr_v], ordered_children[curr_w])))  
        )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought about that and didn't do it bc of the extra memory. Also reversed(zip(...)) throws TypeError:

In [60]: reversed(zip([1,2,3], [1,2,3]))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [60], line 1
----> 1 reversed(zip([1,2,3], [1,2,3]))

TypeError: 'zip' object is not reversible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay, I see what you mean now - I missed that there are two structures defined outside the scope of the function that are being accessed inside (all the more reason to get rid of the helper function definition IMO!)

In that case, I think it's slightly cleaner if we instead move the reversal to the sorted call on L193 where the ordered_children is originally populated. It's probably also worth adding a comment along with the reverse=True to explain why, something like: # reverse=True to preserve DFS order - see gh-7945.

Of course - this is still a nit, not a blocker!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's a good idea. Will address in a follow-up PR

Making test_long_paths_graphs more performant

Co-authored-by: Ross Barnowski <rossbar@caltech.edu>
@amcandio
Copy link
Contributor Author

Thanks @amcandio - switching from recursive->iterative implementation is indeed an improvement as the test indicates!

Just a couple suggestions for minor (potential) improvements.

Also since this "helper function" is only used in one place, we could also consider removing it and just moving the implementation to the appropriate location inside rooted_tree_isomorphism. But that can be handled/proposed in a follow-up PR if you prefer to focus on these changes here!

Thanks! By this you mean just declaring the function inside of rooted_tree_isomorphism definition? Are we okay with potentially breaking backwards compatibility? (I doubt anyone uses this method though)

@amcandio
Copy link
Contributor Author

Replied to comments, I think they make sense. Is it okay if I take care of them in a separate PR?

@dschult
Copy link
Member

dschult commented Mar 31, 2025

I think he meant: just move the guts of the generate_isomorphism directly into the function where it is currently called. No need to enclose it in a function definition or call that function.

Also, this helper function is not part of the public API -- it is just old enough that it doesn't have a leading underscore. So we are not worried about users who might be using it directly.

Finally, there is an assert statement in that code that shouldn't be needed (especially if the code is moved inside the function where the children are created). Could you remove that line?

Thanks for this!

@amcandio
Copy link
Contributor Author

Got it! Any concern about changing the order of isomorphism list?

@rossbar
Copy link
Contributor

rossbar commented Mar 31, 2025

Got it! Any concern about changing the order of isomorphism list?

While it's probably not super important, it's also easy to preserve so we should do so - see my comment above about deque!

Copy link
Contributor

@rossbar rossbar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @amcandio ! I approve this as my comments to this point are nits not blockers and can be addressed separately.

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve this as well.

@dschult dschult merged commit 44d5471 into networkx:main Apr 1, 2025
39 checks passed
@amcandio amcandio deleted the root-tree-isomorphism-iterative branch April 9, 2025 03:45
@jarrodmillman jarrodmillman added this to the 3.5 milestone May 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

4 participants