Commit 57b424f

Becheler authored and sdarwin committed
Add Q1 2026 update on Boost.Graph contributions
Documented my contributions to the Boost.Graph library, including the implementation of community detection algorithms and the organization of a workshop to engage the user community.
1 parent 271f9cf commit 57b424f

1 file changed

Lines changed: 104 additions & 0 deletions

@@ -0,0 +1,104 @@
---
layout: post
nav-class: dark
categories: arnaud
title: Joining Community, Detecting Communities, Making Community.
author-id: arnaud
---

## Joining a community

Early in Q1 2026, I joined the C++ Alliance. A very exciting moment.

I started in early January under Joaquin's mentorship, with the goal of making a clear contribution to Boost.Graph by the end of Q1.
After a few days of auditing the current state of the library against the literature, it became clear that community detection methods
(aka graph clustering algorithms) were sorely lacking in Boost.Graph, and that implementing one would be a great start
to revitalizing the library and filling what may be the largest methodological gap in its current algorithmic coverage.

## Detecting Communities

The vision was (and still is) simple: i) begin
by implementing the Louvain algorithm, ii) build upon it to reach the more complex Leiden algorithm, iii) finally get
started with the Stochastic Block Model.

The plan is straightforward; the Louvain literature is not, and the BGL abstractions even less so.
But with review and guidance from Joaquin and Jeremy Murphy (maintainer of the BGL), I was able to put together a satisfying implementation.

Using the Newman-Girvan Modularity as the quality function to optimize, one can simply call:

```cpp
double Q = boost::louvain_clustering(
    g, cluster_map, weight_map, gen,
    boost::newman_and_girvan{},  // quality function (default)
    1e-7,                        // min_improvement_inner (per-pass convergence)
    0.0                          // min_improvement_outer (cross-level convergence)
);
// Q = 0.42, cluster_map = {0,0,0, 1,1,1}
```

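To make the optimized quantity concrete, here is a minimal standalone sketch of Newman-Girvan modularity computed from a plain edge list. It does not use Boost.Graph: `weighted_edge` and `modularity` are hypothetical names chosen for this illustration, not the library's API. For two triangles joined by a single unit-weight edge, with each triangle as its own community, it yields Q = 5/14 (about 0.357).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type for this sketch; not part of Boost.Graph.
struct weighted_edge {
    std::size_t u, v;
    double w;
};

// Newman-Girvan modularity of a partition of an undirected weighted graph:
//   Q = sum over communities c of [ in_c / W - (tot_c / (2 W))^2 ]
// where W is the total edge weight, in_c the weight of the edges inside c,
// and tot_c the sum of the weighted degrees of the vertices of c.
double modularity(const std::vector<weighted_edge>& edges,
                  const std::vector<std::size_t>& community,
                  std::size_t num_communities)
{
    std::vector<double> in(num_communities, 0.0), tot(num_communities, 0.0);
    double total_weight = 0.0;
    for (const auto& e : edges) {
        assert(e.u < community.size() && e.v < community.size());
        const std::size_t cu = community[e.u], cv = community[e.v];
        assert(cu < num_communities && cv < num_communities);
        total_weight += e.w;
        tot[cu] += e.w;  // each endpoint adds to its community's degree sum
        tot[cv] += e.w;
        if (cu == cv) in[cu] += e.w;  // edge lies inside a single community
    }
    if (total_weight == 0.0) return 0.0;  // convention for an edgeless graph
    double q = 0.0;
    for (std::size_t c = 0; c < num_communities; ++c) {
        const double frac = tot[c] / (2.0 * total_weight);
        q += in[c] / total_weight - frac * frac;
    }
    return q;
}
```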
As often happens with heuristics, there is a large number of quality functions out there, and this is not
for lack of consensus: in [a 2002 paper](https://www.cs.cornell.edu/home/kleinber/nips15.pdf),
computer scientist Jon Kleinberg proved that no clustering quality function
(Modularity, Goldberg density, Surprise...) can simultaneously be
i) scale-invariant (doubling all edge weights should not change the clusters),
ii) rich (all partitions should be achievable),
iii) consistent (shortening distances inside clusters and expanding distances between clusters should not change the clusters).

In other words, no single function can exhibit all three basic properties we would naturally expect.
All we can do is explore different trade-offs using different quality functions.

So I left some doors open for injecting an arbitrary quality function.
If this function exposes only a minimal, "naive" interface, the algorithm statically selects a
slow but generic path, and iterates over all the edges of the graph to compute the quality.
It is slow, yes, but it makes studying quality functions easier, as one does not have to work out
the local mathematical decomposition of the function before starting to code:

```cpp
struct my_quality {
    template <typename G, typename CMap, typename WMap>
    typename boost::property_traits<WMap>::value_type
    quality(const G& g, const CMap& c, const WMap& w) {
        // your custom partition quality function
    }
};

double Q = boost::louvain_clustering(g, cluster_map, weight_map, gen, my_quality{});
```

However, the Louvain algorithm is extremely popular because it is fast: it can update the
quality's computational state incrementally for each vertex it tries to "insert" into or "remove" from a neighboring putative community.
This *locality* decomposition has to be worked out mathematically for each quality function, so it is not trivial.

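As an illustration of such a decomposition, the sketch below implements the classic local gain for Newman-Girvan modularity when an isolated vertex joins a community: only the vertex's degree, its links to the target community, and that community's aggregate degree are needed. The struct and function names are hypothetical, and the bookkeeping inside the actual PR may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping for this sketch; not the Boost.Graph data structures.
struct community_state {
    double total_weight;      // W: sum of all edge weights in the graph
    std::vector<double> tot;  // tot[c]: sum of weighted degrees in community c
};

// Gain in Newman-Girvan modularity when a vertex that currently sits alone in
// its own community, with weighted degree k, joins community c. k_in is the
// total weight of the edges between the vertex and the members of c.
//   dQ = k_in / W - tot[c] * k / (2 W^2)
// No pass over the whole edge set is needed, which is what makes each
// tentative move cheap.
double insertion_gain(const community_state& s, std::size_t c,
                      double k, double k_in)
{
    assert(c < s.tot.size() && s.total_weight > 0.0);
    const double w = s.total_weight;
    return k_in / w - s.tot[c] * k / (2.0 * w * w);
}
```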
I defined a `GraphPartitionQualityFunctionIncrementalConcept` that refines the `GraphPartitionQualityFunctionConcept`:
if the algorithm detects that the injected quality function exposes an interface for this incremental update,
the fast path is taken. One thing I realized is that the `GraphPartitionQualityFunctionIncrementalConcept` is for now too specific
to the Modularity family. I am currently working on a proposal to broaden its scope.

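One common way to get this kind of compile-time path selection is the C++17 detection idiom. The sketch below is only an assumption about how such dispatch could look, not the mechanism used in the PR: the trait `has_incremental_gain`, the member name `gain`, and both example structs are hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Primary template: by default, a quality type has no incremental interface.
template <typename Q, typename = void>
struct has_incremental_gain : std::false_type {};

// Chosen when Q has a const-callable member gain(k, k_in, tot, w).
// The member name and signature are hypothetical.
template <typename Q>
struct has_incremental_gain<Q, std::void_t<decltype(
    std::declval<const Q&>().gain(0.0, 0.0, 0.0, 1.0))>> : std::true_type {};

enum class path { generic, incremental };

// Resolved entirely at compile time: no runtime branch remains.
template <typename Q>
constexpr path select_path(const Q&)
{
    if constexpr (has_incremental_gain<Q>::value)
        return path::incremental;  // fast path: local updates only
    else
        return path::generic;      // slow path: recompute over all edges
}

// Exposes only the minimal interface, so the generic path is selected.
struct naive_quality {
    double quality() const { return 0.0; }  // placeholder body
};

// Also exposes the incremental update, so the fast path is selected.
struct incremental_quality {
    double quality() const { return 0.0; }  // placeholder body
    double gain(double k, double k_in, double tot, double w) const {
        return k_in / w - tot * k / (2.0 * w * w);
    }
};

static_assert(select_path(naive_quality{}) == path::generic,
              "minimal interface should take the generic path");
static_assert(select_path(incremental_quality{}) == path::incremental,
              "incremental interface should take the fast path");
```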
The current PR has been carefully tested and benchmarked for correctness and performance, and Jeremy has approved
it for merging into the develop branch.

I wrote a paper, to be submitted to the Journal of Open Source Software, to publish the current results and benchmarks,
as our implementation is at least as fast as its competitors, and more generic. There is no equivalent I am aware of.

## Making Community

Concurrently, I worked on summoning the Boost.Graph user base, and it quickly became clear that a small local workshop would
be a tremendous start: the Louvain algorithm community is based in Louvain (Belgium), its extension was
formulated in Leiden (Netherlands), and my PhD graphs network is based in Paris (France), in what has been presented to me
as "the Temple of the Stochastic Block Model"! Quite a sign: life finds ways to run in (tight) circles.

So the goal of this [workshop](https://github.com/boostorg/graph/discussions/466) is to bring together a small group
(10-15 people) of researchers, open-source implementers, and industrial users for
a day of honest conversation on May 6th, 2026. Three questions will anchor the discussions:

1. What types of graphs and data structures do you use in practice?
2. What performance, scalability, and interpretability requirements matter most to you?
3. What algorithms are missing today that Boost.Graph could offer?

Ray and Collier from the C++ Alliance will also be there to record the lightning talks and document the process.
It will also be an occasion to show off the Python-based animations I put together for the [French C++ User Group
presentation on March 24th](https://www.youtube.com/watch?v=-OVvzRFiYLU).
They were well received and drew many compliments, as they pair well with the visual and
dynamic nature of graphs and their algorithms, and I hope they will contribute
to the repopularization of Boost.Graph.

Graphliiings asseeeeemble!
