| 1 | +--- |
| 2 | +layout: post |
| 3 | +nav-class: dark |
| 4 | +categories: arnaud |
| 5 | +title: Joining Community, Detecting Communities, Making Community. |
| 6 | +author-id: arnaud |
| 7 | +--- |
| 8 | + |
| 9 | +## Joining a community |
| 10 | + |
| 11 | +Early in Q1 2026, I joined the C++ Alliance. A very exciting moment. |
| 12 | + |
| 13 | +I started work in early January under Joaquin's mentorship, aiming for a clear contribution to Boost.Graph by the end of Q1. |
| 14 | +After a few days of auditing the current state of the library against the literature, it became clear that community detection methods |
| 15 | +(aka graph clustering algorithms) were sorely missing from Boost.Graph, and that implementing one would be a great start |
| 16 | +to revitalizing the library and filling what may be the largest methodological gap in its current algorithmic coverage. |
| 17 | + |
| 18 | +## Detecting Communities |
| 19 | + |
| 20 | +The vision was (and still is) simple: i) start by |
| 21 | +implementing the Louvain algorithm, ii) build upon it to reach the more complex Leiden algorithm, iii) finally get |
| 22 | +started with the Stochastic Block Model. |
| 23 | + |
| 24 | +The plan is straightforward; the Louvain literature is not, and the BGL abstractions even less so. |
| 25 | +But with review and guidance from Joaquin and Jeremy Murphy (maintainer of the BGL), I was able to put together a satisfying implementation. |
| 26 | + |
| 27 | +Using the Newman-Girvan Modularity as the quality function to optimize, one can simply call: |
| 28 | + |
| 29 | +```cpp |
| 30 | +double Q = boost::louvain_clustering( |
| 31 | +    g, cluster_map, weight_map, gen, |
| 32 | +    boost::newman_and_girvan{},  // quality function (default) |
| 33 | +    1e-7,                        // min_improvement_inner (per-pass convergence) |
| 34 | +    0.0                          // min_improvement_outer (cross-level convergence) |
| 35 | +); |
| 36 | +// Q = 0.42, cluster_map = {0,0,0, 1,1,1} |
| 37 | +``` |
| 38 | + |
| 39 | +As often happens with heuristics, there is a large number of quality functions out there, and this is not |
| 40 | +because of a lack of consensus: in [a 2002 paper](https://www.cs.cornell.edu/home/kleinber/nips15.pdf), |
| 41 | +computer scientist Jon Kleinberg proved that no clustering function, whichever quality function |
| 42 | +(Modularity, Goldberg density, Surprise...) it optimizes, can simultaneously be |
| 43 | +i) scale-invariant (rescaling all distances, e.g. doubling them, should not change the clusters), |
| 44 | +ii) rich (every partition should be achievable), |
| 45 | +iii) consistent (shrinking distances inside clusters and expanding distances between clusters should not change the clusters). |
| 46 | + |
| 47 | +In other words, no single function can exhibit all three of these basic properties that we would reasonably expect. |
| 48 | +All we can do is explore different trade-offs using different quality functions. |
| 49 | + |
| 50 | +So I left the door open for injecting an arbitrary quality function. |
| 51 | +If this function exposes only a minimal, "naive" interface, the algorithm selects at compile time a |
| 52 | +slow but generic path that iterates over all the edges of the graph to compute the quality. |
| 53 | +It is slow, yes, but it makes the study of quality functions easier, as one does not have to work out |
| 54 | +the local mathematical decomposition of the function before starting to code: |
| 55 | + |
| 56 | +```cpp |
| 57 | +struct my_quality { |
| 58 | +    template <typename G, typename CMap, typename WMap> |
| 59 | +    typename boost::property_traits<WMap>::value_type |
| 60 | +    quality(const G& g, const CMap& c, const WMap& w) { |
| 61 | +        // your custom partition quality function |
| 62 | +    } |
| 63 | +}; |
| 64 | + |
| 65 | +double Q = boost::louvain_clustering(g, cluster_map, weight_map, gen, my_quality{}); |
| 66 | +``` |
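
To make the generic path concrete, here is a minimal, self-contained sketch of what "iterating over all the edges" means for Newman-Girvan Modularity. It does not use the BGL: the `edge` struct, the `naive_modularity` function and the plain vectors are hypothetical stand-ins for the graph, the cluster map and the weight map, and community labels are assumed to be `0..k-1`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy edge type (not a BGL type): undirected, weighted.
struct edge {
    std::size_t u, v;
    double w;
};

// Newman-Girvan modularity recomputed from scratch:
//   Q = sum_c [ in_c / (2m) - (tot_c / (2m))^2 ]
// in_c  = twice the weight of the edges inside community c
// tot_c = sum of the weighted degrees of the vertices in c
// m     = total edge weight
double naive_modularity(const std::vector<edge>& edges,
                        const std::vector<std::size_t>& community)
{
    std::size_t n_comm = 0;
    for (std::size_t c : community) n_comm = std::max(n_comm, c + 1);

    std::vector<double> in(n_comm, 0.0), tot(n_comm, 0.0);
    double two_m = 0.0;
    for (const edge& e : edges) {  // one full pass over every edge
        const std::size_t cu = community[e.u], cv = community[e.v];
        two_m += 2.0 * e.w;
        tot[cu] += e.w;
        tot[cv] += e.w;
        if (cu == cv) in[cu] += 2.0 * e.w;
    }
    assert(two_m > 0.0 && "graph must have positive total weight");

    double q = 0.0;
    for (std::size_t c = 0; c < n_comm; ++c)
        q += in[c] / two_m - (tot[c] / two_m) * (tot[c] / two_m);
    return q;
}
```

On two triangles joined by a single unit-weight edge and split as `{0,0,0, 1,1,1}`, this returns 5/14 ≈ 0.357, and doubling every weight leaves the value unchanged. Every evaluation costs a full pass over the edges, which is what makes this path slow inside Louvain's inner loop.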
| 67 | + |
| 68 | +However, the Louvain algorithm owes its popularity to its speed: it updates the computational state of the |
| 69 | +quality incrementally for each vertex it tries to "insert" into or "remove" from a neighboring candidate community. |
| 70 | +This *locality* decomposition has to be worked out mathematically for each quality function, which is not trivial. |
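
For Modularity, the local decomposition is well known, and a minimal sketch shows why it is cheap. Assuming an undirected graph without self-loops, the gain of inserting an isolated vertex `i` into a community `C` depends on only four numbers. The function name `modularity_gain` and its parameter names are hypothetical, chosen for this illustration.

```cpp
#include <cassert>

// Gain in Newman-Girvan modularity when an isolated vertex i joins community C
// (undirected graph, no self-loops):
//   dQ = k_i_in / m - tot_c * k_i / (2 * m^2)
// k_i_in = total weight of the edges between i and the vertices of C
// tot_c  = sum of the weighted degrees of the vertices of C (i excluded)
// k_i    = weighted degree of i
// m      = total edge weight of the graph
double modularity_gain(double k_i_in, double tot_c, double k_i, double m)
{
    assert(m > 0.0 && "graph must have positive total weight");
    return k_i_in / m - (tot_c * k_i) / (2.0 * m * m);
}
```

Take two triangles 0-1-2 and 3-4-5 joined by the edge 2-3, so that m = 7. Inserting the isolated vertex 2 into {0,1} gives 2/7 - 12/98 = 8/49 ≈ 0.163, while inserting it into {3,4,5} gives 1/7 - 21/98 < 0, so the vertex joins its own triangle. No pass over the whole graph is needed, only the neighbors of the vertex and a few per-community totals.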
| 71 | + |
| 72 | +I defined a `GraphPartitionQualityFunctionIncrementalConcept` that refines the `GraphPartitionQualityFunctionConcept`: |
| 73 | +if the algorithm detects that the injected quality function exposes an interface for this incremental update, |
| 74 | +the fast path is taken. One thing I found is that the `GraphPartitionQualityFunctionIncrementalConcept` is, for now, too specific |
| 75 | +to the Modularity family. I am currently working on a proposal to widen its scope. |
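
One common way to get this kind of compile-time path selection is member detection plus tag dispatch. The sketch below only illustrates that technique and is not the mechanism or the interface used in the PR: the member name `gain`, the trait `has_incremental_gain`, the `path` enum and the two example structs are all hypothetical.

```cpp
#include <type_traits>
#include <utility>

// C++11-compatible void_t (std::void_t requires C++17).
template <typename...> struct make_void { using type = void; };
template <typename... Ts> using void_t = typename make_void<Ts...>::type;

// True when Q has a member callable as q.gain(double, double, double, double).
// `gain` is a hypothetical name chosen for this sketch.
template <typename Q, typename = void>
struct has_incremental_gain : std::false_type {};

template <typename Q>
struct has_incremental_gain<
    Q, void_t<decltype(std::declval<const Q&>().gain(0.0, 0.0, 0.0, 0.0))>>
    : std::true_type {};

enum class path { generic, fast };

// Tag dispatch: the overload is chosen at compile time from the trait.
constexpr path select_path_impl(std::false_type) { return path::generic; }
constexpr path select_path_impl(std::true_type) { return path::fast; }

template <typename Q>
constexpr path select_path() {
    return select_path_impl(has_incremental_gain<Q>{});
}

// A quality function with only the naive interface...
struct naive_only {
    double quality() const { return 0.0; }
};

// ...and one that also exposes the (hypothetical) incremental interface.
struct incremental {
    double quality() const { return 0.0; }
    double gain(double, double, double, double) const { return 0.0; }
};

static_assert(select_path<naive_only>() == path::generic, "naive interface");
static_assert(select_path<incremental>() == path::fast, "incremental interface");
```

Because the choice is made from the type alone, a quality function that only implements the naive interface still compiles and runs, just more slowly.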
| 76 | + |
| 77 | +The current PR has been carefully tested and benchmarked for correctness and performance, and approved by |
| 78 | +Jeremy for merging into the develop branch. |
| 79 | + |
| 80 | +I wrote a paper to be submitted to the Journal of Open Source Software to publish the current results and benchmarks, |
| 81 | +as we are at least as fast as our competitors, and more generic. There is no equivalent I am aware of. |
| 82 | + |
| 83 | +## Making Community |
| 84 | + |
| 85 | +Concurrently, I worked on rallying the Boost.Graph user base, and it quickly became clear that a small local workshop would |
| 86 | +be a tremendous start: the Louvain algorithm community is based in Louvain (Belgium), its extension was |
| 87 | +formulated in Leiden (Netherlands), and my PhD graph network is based in Paris (France), in what has been presented to me |
| 88 | +as "the Temple of the Stochastic Block Model"! Quite a sign: life finds ways to run in (tight) circles. |
| 89 | + |
| 90 | +So the goal of this [workshop](https://github.com/boostorg/graph/discussions/466) is to bring together a small group |
| 91 | +(10-15 people) of researchers, open-source implementers, and industrial users for |
| 92 | +a day of honest conversation on May 6th, 2026. Three questions will anchor the discussions: |
| 93 | +1. What types of graphs and data structures do you use in practice? |
| 94 | +2. What performance, scalability, and interpretability requirements matter most to you? |
| 95 | +3. What algorithms are missing today that Boost.Graph could offer? |
| 96 | + |
| 97 | +Ray and Collier from the C++ Alliance will also be there to record the lightning talks and document the process. |
| 98 | +It will also be an occasion to show off the Python-based animations I put together for the [French C++ User Group |
| 99 | +presentation on March 24th](https://www.youtube.com/watch?v=-OVvzRFiYLU). |
| 100 | +They were well received and earned many compliments, as animation pairs well with the visual and |
| 101 | +dynamic nature of graphs and their algorithms, and I hope they will contribute |
| 102 | +to the repopularization of Boost.Graph. |
| 103 | + |
| 104 | +Graphliiings asseeeeemble! |