-
-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Harmonic diameter #5251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Harmonic diameter #5251
Conversation
dschult
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this. It looks well implemented.
Can you add the function name to doc/reference/algorithms/distance_measures.rst
so the sphinx auto-documentation builder can find the function.
Then you can check the documentation in the details of the Circle-CI "artifact" test.
:} Thanks!
|
Aha! I was wondering how to see the generated docs, indeed. |
|
Well, line added, test completed, but under "Artifacts" I have absolutely nothing. 🤷🏻♂️ |
|
Ah OK, it finally appeared. I had to be patient. Docs are OK. |
|
If there is interest, I might commit harmonic eccentricity, too, which follows the same idea (harmonic mean of distances from x to all other nodes). |
dschult
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me.
Thanks for this!
Harmonic Eccentricity is harder for me to find in the literature. And it doesn't seem to be in the reference you cite in this PR. Can you point to some literature?
|
Actually, no, I can't, it's a fairly obvious "harmonic" reworking of eccentricity. You can use harmonic means in closeness, and you get harmonic centrality, in diameter computation, and you get harmonic diameter, or in eccentricity computation, and you get harmonic eccentricity. Harmonic diameter is the harmonic mean of harmonic eccentricities, exactly like diameter is the maximum of eccentricities. But it might be premature. I should write a paper about this :). |
|
Hee hee... Yes! It sounds like it is time to write a paper! :} |
| sum_invd = 0 | ||
| for n in G: | ||
| if sp is None: | ||
| length = nx.single_source_shortest_path_length(G, n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would not work for weighted graphs, is harmonic_diameter defined for a weighted graph?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah OK I see I've got confused between single_source_shortest_path_length and shortest_path_length. The implementation as such is correct for distances defined as shortest paths. You can use another notion of distance—nothing prevents you from taking the harmonic mean. I thought that NetworkX would automatically "understand" whether a graph was weighted and compute distances accordingly (as diameter and eccentricity are already implemented in that way), but now I'm seeing that, say, harmonic_centrality has an explicitly distance parameter.
It seems that different people implemented different APIs in NetworkX for using a non-standard metric: it is passed via a dictionary of distances in diameter, and via a property used to detect edge weights in harmonic_centrality.
I'd stick with the current implementation as it mimics that of diameter, but you guys know better than me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this module (distance_measures.py) could use a pass through to update all functions to allow weighted graphs. The definition should be straightforward, following from the definition in shortest_path.
I'll open an issue for this suggestion separate from this PR. (see #5257 )
|
Is it worth adding a brief |
|
I thought I had found Harmonic Diameter discussed on Wikipedia. But apparently not. Do you have a better reference for harmonic diameter? |
|
Yeah, I was dubious whether to add a note. I did now. In the paper the harmonic diameter is called "connectivity length". But the name "harmonic diameter" is standard in the theory of metric spaces for the same kind of harmonic mean. "If we want to quantify the typical separation between two vertices in the graph, a good measure is You can see the metric-space definition at p. 4 of https://www.researchgate.net/profile/Huse_Fatkic/project/The-Theory-of-Fractional-Market-Shares/attachment/57c0699508aec1ae131051f8/AS:399366213914624@1472227733721/download/Huse+Fatkic%2C+Slobodan+Sekulovic%2C+Fatih+Destovic%2C+and+Hana++Fatkic%2C+June+6%2C++2014.pdf?context=ProjectUpdatesLog |
|
My main concern with using diameter is that I've only seen it in the context of a max. This doesn't seem to be using a max. The definition of harmonic diameter in the paper you link to, they do take a max. |
|
I guess the max sort of goes away (since there is just one element in the set you are taking the max of) when k = |V| as in your case. |
|
Yes. So, you could define the standard diameter as the maximum distance between k points for every choice of k points and let k go to infinity, as they do for the harmonic diameter. But that would overkill, because in the case of the standard diameter after k=2 you don't get a different value. For the harmonic diameter things are a bit more complicated (and interesting). The harmonic diameter (in continuous geometric) is obtained by maximizing over the harmonic mean of all distances between all subsets of k points, and then letting k go to infinity. But for graphs, the maximum value is attained for k equal to the number of vertices—that is, by taking the harmonic mean of distances between all nodes. So the harmonic diameter defined in the documentation (the harmonic mean of all distances) is the result of applying the general definition to graphs. Marchiori and Latora got exactly the same definition using a completely different motivation. In the classic (and monumental) survey by Newman, "The Structure and Function of Complex Networks", harmonic diameter is commented as follows: "An alternative and perhaps more satisfactory approach is to define 𝓁� to be the “harmonic mean” geodesic distance between all pairs, i.e., the reciprocal of the average of the reciprocals. [...] This approach has been adopted only occasionally in network calculations [259], but perhaps should be used more often." |
|
So, if understand the discussion, this function computes a kind of average path length where by average we use the harmonic mean instead of the arithmetic mean. What do people think of adding this feature by adding a keyword argument to the |
|
Incidentally, that function is really weird. It states a formula but does not compute that formula. If I apply it to a directed graph with a single arc 0→1, it returns 1/2, but the correct value is ∞. From the code, it appears that the actual formula computed is the sum of finite lengths, divided by n/(n-1). This does not make any sense if the graph is not strongly connected, as you're adding a certain number of lengths, and then dividing by a different amount. My guess is that whoever put the check for "weakly connected" did not understand that the test had to be for "strongly connected". Then the code makes sense because if there are non-reachable pairs there will be an exception, so the mean will again make sense. Personally, I'd prefer the harmonic diameter code to be distinct from this—it really needs fixing. |
|
Sorry—is there anything else I have to do? Or is this going to be included in NetworkX at some point? |
|
OK, so now this is conflicting. Should I make the effort of solving the (trivial) conflicts or are you guys not interested in this new code? |
|
Incidentally, I realized I'm using the term probably because I found it in this 2003 paper by Fogaras: Fogaras, D. (2003). Where to Start Browsing the Web?. In: Böhme, T., Heyer, G., Unger, H. (eds) Innovative Internet Community Systems. IICS 2003. Lecture Notes in Computer Science, vol 2877. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39884-4_6 Section 3.2 says “The connectivity is expressed by the harmonic diameter of the Web graph, the harmonic mean of distances between all the pairs of nodes.” |
|
I see the milestone have been pushed to 3.1. This does not seem such a difficult PR—one year no hold? 😂 Here are some other papers computing the harmonic diameter on networks: https://dl.acm.org/doi/10.1145/3132847.3133087 |
|
Sorry about the long delay. We should be able to get this going again. The main issue seemed to be the name I'd like to propose the name |
|
Well, there's a lot of literature talking about the "effective diameter" which is the 90th percentile of distances, even if it not a diameter. And as remarked above in functional analysis they say "harmonic diameter", not "harmonic mean of distances". I added a few papers above to show where it is used. It's not exactly commonplace, but it's fairly common among people taking measurements on large networks. It is possible to add it to |
dschult
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make a decision and get this merged. This review proposes language to describe a little more about some hesitations for the name, but in the end it leaves the name harmonic_diameter.
It would be good to eventually make this optionally depend on edge weights using a keyword argument such as "weight=None". But let's do that in another PR. See issue #7624.
I think that resolves all the discussion in this PR about the function and it is ready to merge.
Co-authored-by: Dan Schult <dschult@colgate.edu>
|
Thanks @vigna !! |
This PR implements the harmonic diameter, a measure of connectivity.