Commit 0b93416

Add a new article 'Collider'
1 parent 333b229 commit 0b93416

18 files changed

Lines changed: 529 additions & 0 deletions

blog/collider.md

Lines changed: 264 additions & 0 deletions
@@ -0,0 +1,264 @@
---
title: "To the Collider!"
date: 2026-03-05
description: "Benchmarking isn't hard. One attribute, one run, and you know exactly what's faster. Putting it to practice with Testo."
image: /blog/collider/img-00.jpg
author: Aleksei Gagarin
---

::: info 🤔 The Problem
Write a function that calculates the sum of all numbers from `$a` to `$b`.

For example, if `$a = 1` and `$b = 5`, the result is `1 + 2 + 3 + 4 + 5 = 15`.
:::

1. The simplest solution that comes to mind:
   walk `$i` from `$a` to `$b` in a `for` loop, adding each value to a running total. But that's too obvious.
2. Imagination takes over and you want to solve it with arrays:
   fill an array with values from `$a` to `$b` and pass it to `array_sum()`.

![To the Collider!](/blog/collider/img-01.png)

## Comparing the Solutions

PHP performs very well in synthetic benchmarks, outpacing Python and all that. With JIT enabled and a few tricks, it can even catch up to C++.

We won't be comparing PHP with other languages right now; instead, let's just compare these two approaches against each other.

Here's my reasoning:

> The array solution should be slower than the `for` loop, since extra resources go into computing hashes for the hash table when creating the array, and more memory is needed for intermediate values.

Let's verify this: we'll write the functions and add the `#[BenchWith]` attribute to one of them.

```php
#[BenchWith(
    callables: [
        'in_array' => [self::class, 'sumInArray'],
    ],
    arguments: [1, 5_000],
    calls: 100,
    iterations: 1,
)]
public static function sumInCycle(int $a, int $b): int
{
    $result = 0;
    for ($i = $a; $i <= $b; ++$i) {
        $result += $i;
    }

    return $result;
}

public static function sumInArray(int $a, int $b): int
{
    return \array_sum(\range($a, $b));
}
```

With the `#[BenchWith]` attribute, we're telling Testo that:
- we want to compare the performance of the current function (`sumInCycle`) with another function (`sumInArray`);
- both functions will receive the same arguments: `1` and `5_000`;
- to measure execution time, each function will be called 100 times in a row (`calls: 100`).
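
To get a feel for what "called 100 times in a row" means, here is a rough hand-rolled equivalent. It is only a sketch: `timeCalls()` is a hypothetical helper, not part of Testo, and Testo's real measurement loop may well look different.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not Testo's API): runs $fn $calls times in a row
 * and returns the total elapsed time in milliseconds.
 */
function timeCalls(callable $fn, array $arguments, int $calls): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    for ($i = 0; $i < $calls; ++$i) {
        $fn(...$arguments);
    }

    return (hrtime(true) - $start) / 1e6; // nanoseconds -> milliseconds
}

$elapsed = timeCalls(
    static fn(int $a, int $b): int => array_sum(range($a, $b)),
    [1, 5_000],
    100,
);

printf("100 calls took %.3f ms\n", $elapsed);
```

This is exactly the boilerplate the attribute is meant to spare you from writing.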

Place your bets and let's run it.

```
Summary:
+---+-------------+-------+-------+--------+-------------------+
| # | Name        | Iters | Calls | Memory | Avg Time          |
+---+-------------+-------+-------+--------+-------------------+
| 2 | current     | 1     | 100   | 0      | 38.921ms          |
| 1 | sumInArray  | 1     | 100   | 0      | 21.472ms (-44.8%) |
+---+-------------+-------+-------+--------+-------------------+
```

`sumInArray` takes first place, completing the task almost twice as fast as `sumInCycle`.

> Wait, what? The array-based function won by a wide margin?!

![Statistical Artifact](/blog/collider/img-02.png)

## Statistical Artifact

Indeed, this could just be a "statistical artifact."

Each benchmark rerun produces different results, sometimes varying significantly from the previous ones.
This can be caused by background tasks, user activity, or anything else that affects performance at that moment.

::: warning ⚠️ We need guarantees that we're comparing genuinely stable results, not just random outliers.
:::

Statistics comes to the rescue with the [coefficient of variation](https://en.wikipedia.org/wiki/Coefficient_of_variation), which measures the relative variability of data: the standard deviation divided by the mean.
The smaller this coefficient, the more stable the results.
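
If you'd like to see the arithmetic, here is a minimal sketch of that coefficient as a percentage. The function name and the sample timings are made up for illustration, and the choice of the sample standard deviation (dividing by `n - 1`) is my assumption, not necessarily what Testo does.

```php
<?php

declare(strict_types=1);

/**
 * Relative standard deviation (coefficient of variation), in percent.
 * Assumption: sample standard deviation with n - 1 in the denominator.
 *
 * @param list<float> $samples at least two measurements with a non-zero mean
 */
function relativeStdDev(array $samples): float
{
    $n = count($samples);
    $mean = array_sum($samples) / $n;

    $squaredDiffs = array_map(
        static fn(float $x): float => ($x - $mean) ** 2,
        $samples,
    );
    $stdDev = sqrt(array_sum($squaredDiffs) / ($n - 1));

    return $stdDev / $mean * 100.0;
}

// Made-up timings in milliseconds, for illustration only.
$stable = [10.0, 10.1, 9.9, 10.0];
$noisy  = [10.0, 14.0, 7.0, 9.0];

printf("stable: ±%.2f%%\n", relativeStdDev($stable));
printf("noisy:  ±%.2f%%\n", relativeStdDev($noisy));
```

Both sample sets have the same mean, yet the second one is far less trustworthy, and that is the kind of difference this number makes visible.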

All we need to do is collect more data spread over time, that is, rerun the benchmarks multiple times.
The `#[BenchWith]` attribute has an `iterations` parameter responsible for the number of benchmark reruns.

Let's set `iterations: 10` and rerun:

```
Summary:
+---+-------------+-------+-------+--------+-------------------+---------+
| # | Name        | Iters | Calls | Memory | Avg Time          | RStDev  |
+---+-------------+-------+-------+--------+-------------------+---------+
| 2 | current     | 10    | 100   | 0      | 38.474ms          | ±2.86%  |
| 1 | sumInArray  | 10    | 100   | 0      | 12.501ms (-67.5%) | ±27.20% |
+---+-------------+-------+-------+--------+-------------------+---------+
```

Now `sumInArray` runs 3x faster, but the coefficient of variation (`RStDev` column) is 27.2%, which is quite high.
To claim stable results, you typically aim for RStDev < 2%.

Let's think about how to reduce this variation. Our functions execute quite fast, and even small performance fluctuations can heavily impact the results, especially with a low number of runs.
For fast code like ours, increasing the number of `calls` per iteration can help. Let's bump it to 2000:

```
Summary:
+---+-------------+-------+-------+--------+-------------------+--------+
| # | Name        | Iters | Calls | Memory | Avg Time          | RStDev |
+---+-------------+-------+-------+--------+-------------------+--------+
| 2 | current     | 10    | 2000  | 0      | 37.888ms          | ±1.38% |
| 1 | sumInArray  | 10    | 2000  | 0      | 11.395ms (-69.9%) | ±1.72% |
+---+-------------+-------+-------+--------+-------------------+--------+
```

As you can see, with a 3x performance difference, even ±27.2% variation wouldn't have saved the `for` loop from defeat. But now we can confidently claim the results are stable (RStDev < 2%).

![Fine, arrays are faster](/blog/collider/img-03.png)

By the way, did you notice that `memory=0` in both cases? This means no additional memory was allocated for the arrays: what was already allocated at benchmark startup was enough.

Of course, you could experiment with a larger range, enable JIT, and prove that in some cases the loop would be faster,
but I want to draw your attention to how quick it is to benchmark something now!

## BenchWith

![Shock](/blog/collider/img-04.png)

Benchmarking right in your code without extra boilerplate. Like [inline tests](/docs/inline-tests.md), but for benchmarks.

[Dragon Code](https://github.com/TheDragonCode/benchmark) once showed that benchmarks can be simple and convenient: instead of tons of boilerplate, just call a single class and pass closures for comparison.
Testo takes this to the next level: from intent to result in just one attribute.

Run it with a single click in your IDE:
![IDE](/blog/collider/screen.png)

But that's not all. Behind the simplicity on the surface lie serious algorithms backed by statistics.

Testo automatically detects deviations in the data, discards outliers, and produces metrics that help you understand how stable the results are.
For those who don't find raw numbers very telling, there's a summary with recommendations and alerts.
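
To make "discards outliers" a bit more tangible, here is one common way to do it: Tukey's fences based on the interquartile range. This is purely an illustration; I'm not claiming it is the method Testo uses, and both function names and the sample timings are invented for the example.

```php
<?php

declare(strict_types=1);

/** Linearly interpolated percentile of an ascending-sorted list (0.0 <= $p <= 1.0). */
function percentile(array $sorted, float $p): float
{
    $pos = $p * (count($sorted) - 1);
    $lower = (int) floor($pos);
    $upper = (int) ceil($pos);

    return $sorted[$lower] + ($sorted[$upper] - $sorted[$lower]) * ($pos - $lower);
}

/**
 * Keeps only samples inside Tukey's fences [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
 *
 * @param list<float> $samples
 * @return list<float>
 */
function rejectOutliers(array $samples): array
{
    sort($samples);
    $q1 = percentile($samples, 0.25);
    $q3 = percentile($samples, 0.75);
    $iqr = $q3 - $q1;

    return array_values(array_filter(
        $samples,
        static fn(float $x): bool => $x >= $q1 - 1.5 * $iqr && $x <= $q3 + 1.5 * $iqr,
    ));
}

// Made-up timings in microseconds; 58.9 plays the role of a background hiccup.
$samples = [43.5, 43.7, 43.6, 58.9, 43.8, 43.6];
$filtered = rejectOutliers($samples);

printf("rejected %d of %d samples\n", count($samples) - count($filtered), count($samples));
```

Statistics computed on the filtered samples are then much less sensitive to a single unlucky iteration.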

Here's what it looks like right now:

```
Results for calcFoo:
+----------------------------+-------------------------------------------------+------------------------------------+--------------------------------------------------------------+
| BENCHMARK SETUP            | TIME RESULTS                                    | FILTERED RESULTS                   | SUMMARY                                                      |
| Name       | Iters | Calls | Mean              | Median            | RStDev  | Rej. | Mean*             | RStDev* | Place | Warnings                                             |
+------------+-------+-------+-------------------+-------------------+---------+------+-------------------+---------+-------+------------------------------------------------------+
| current    | 10    | 20    | 44.03µs           | 43.68µs           | ±2.35%  | 1    | 43.69µs           | ±0.42%  | 3rd   |                                                      |
| calcBar    | 10    | 20    | 13.72µs (-68.8%)  | 13.26µs (-69.6%)  | ±7.77%  | 2    | 13.23µs (-69.7%)  | ±0.52%  | 2nd   |                                                      |
| calcBaz    | 10    | 20    | 110.50ns (-99.7%) | 105.00ns (-99.8%) | ±16.50% | 1    | 106.11ns (-99.8%) | ±12.52% | 1st   | High variance, low iter time. Insufficient iter time |
+------------+-------+-------+-------------------+-------------------+---------+------+-------------------+---------+-------+------------------------------------------------------+
Recommendations:
⚠ High variance, low iter time: Measurement overhead may dominate — increase calls per iteration.
⚠ Insufficient iter time: Timer jitter exceeds useful signal — increase calls per iteration.
```

I know, it looks overwhelming, but this isn't the release version yet. In the future I envision it being even simpler: an attribute with automatic settings, no need to dive into the details.

## Back to the Collider

Alright, you obviously know that the range sum problem can be solved in `O(1)` using a simple mathematical formula.
I'll deprive you of the pleasure of pointing this out in the comments.

Here's the function and a benchmark against the previous solutions:

```php
public static function sumLinear(int $a, int $b): int
{
    $d = $b - $a + 1; // number of terms in the range
    return (int) (($d - 1) * $d / 2) + $a * $d;
}
```
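
Where the formula comes from: with `$d = $b - $a + 1` terms, the sum is `$a` added `$d` times plus `0 + 1 + ... + ($d - 1)`, which is `$a * $d + ($d - 1) * $d / 2`. Before benchmarking it, a quick sanity check against the plain loop doesn't hurt. The snippet below is a standalone sketch; `closedFormSum()` and `loopSum()` are names I made up for it.

```php
<?php

declare(strict_types=1);

// Same arithmetic as the formula above, under a hypothetical name.
function closedFormSum(int $a, int $b): int
{
    $d = $b - $a + 1; // number of terms
    return (int) (($d - 1) * $d / 2) + $a * $d;
}

// Reference implementation: the plain loop.
function loopSum(int $a, int $b): int
{
    $result = 0;
    for ($i = $a; $i <= $b; ++$i) {
        $result += $i;
    }

    return $result;
}

foreach ([[1, 5], [1, 5_000], [7, 7], [-3, 4]] as [$a, $b]) {
    if (closedFormSum($a, $b) !== loopSum($a, $b)) {
        throw new LogicException("Mismatch for range {$a}..{$b}");
    }
}

echo closedFormSum(1, 5), "\n"; // 15, the example from the problem statement
```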

```
Summary:
+---+-------------+-------+-------+-------------------+--------+
| # | Name        | Iters | Calls | Avg Time          | RStDev |
+---+-------------+-------+-------+-------------------+--------+
| 4 | current     | 10    | 2000  | 40.102ms          | ±1.09% |
| 2 | sumInArray  | 10    | 2000  | 12.232ms (-69.5%) | ±0.93% |
| 1 | sumLinear   | 10    | 2000  | 77.065µs (-99.8%) | ±3.05% |
+---+-------------+-------+-------+-------------------+--------+
```

Microseconds instead of milliseconds. Pretty cool, right?

And even here there's room for optimization. You've probably heard that division isn't always the fastest operation.
Dividing by 2 can be replaced with multiplying by 0.5.

![Multiplication is faster!](/blog/collider/img-05.png)

```php
public static function multi(int $a, int $b): int
{
    $d = $b - $a + 1;
    return (int) (($d - 1) * $d * 0.5) + $a * $d;
}
```

```
+---+---------+-------+---------+--------+------------------+--------+
| # | Name    | Iters | Calls   | Memory | Avg Time         | RStDev |
+---+---------+-------+---------+--------+------------------+--------+
| 1 | current | 10    | 2000000 | 0      | 75.890µs         | ±0.79% |
| 2 | multi   | 10    | 2000000 | 0      | 78.821µs (+3.9%) | ±0.47% |
+---+---------+-------+---------+--------+------------------+--------+
```

::: info Division is faster than multiplication (╯°□°)╯︵ ┻━━┻

Expectations don't always match reality, and optimizations don't always work the way we think.
:::
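
One thing worth knowing when reading this result (the benchmark alone doesn't prove it is the cause): in PHP the two variants don't do the same kind of arithmetic. An exact division of two integers stays an integer, while multiplying by `0.5` always goes through a float. A small standalone sketch:

```php
<?php

declare(strict_types=1);

// ($d - 1) * $d is a product of two consecutive integers, so it is always even.
$even = 4_999 * 5_000;

echo get_debug_type($even / 2), "\n";   // int: both operands are ints and the division is exact
echo get_debug_type($even * 0.5), "\n"; // float: int * float is always a float

// Beyond 2**53 a float can no longer represent every integer,
// so the float route may silently lose precision.
$big = 2 ** 62 + 2;
var_dump($big / 2 === (int) ($big * 0.5)); // bool(false)
```

So besides being no faster here, the `* 0.5` version is also the one that can return a wrong sum for very large ranges.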

Also, since we're working with non-negative integers stored in binary, we can replace the division by 2 with a one-bit right shift, which in theory should be even faster.

![WTF](/blog/collider/img-06.png)

```php
public static function shift(int $a, int $b): int
{
    $d = $b - $a + 1;
    return ((($d - 1) * $d) >> 1) + $a * $d;
}
```
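
The shift is a safe replacement here because `($d - 1) * $d`, a product of two consecutive integers, is never negative. In general, `>> 1` and division by 2 only agree for non-negative values, which this standalone sketch illustrates:

```php
<?php

declare(strict_types=1);

// For non-negative integers, shifting right by one bit equals integer division by 2.
foreach ([0, 1, 6, 7, 24_995_000] as $n) {
    if (($n >> 1) !== intdiv($n, 2)) {
        throw new LogicException("Mismatch for {$n}");
    }
}

// For negative odd numbers the two differ: the shift rounds toward
// negative infinity, while intdiv() truncates toward zero.
echo -7 >> 1, "\n";       // -4
echo intdiv(-7, 2), "\n"; // -3
```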

```
+---+---------+-------+---------+--------+------------------+--------+
| # | Name    | Iters | Calls   | Memory | Avg Time         | RStDev |
+---+---------+-------+---------+--------+------------------+--------+
| 2 | current | 10    | 2000000 | 0      | 75.890µs         | ±0.79% |
| 1 | shift   | 10    | 2000000 | 0      | 70.559µs (-7.0%) | ±0.70% |
+---+---------+-------+---------+--------+------------------+--------+
```

At least the bit shift didn't let us down.

Note that the 7% improvement doesn't mean bit shifting is exactly 7% faster than division.
The function contains several other mathematical operations, and the function call itself takes some time.
So 7% is the difference between two functions, not between two specific operations.

::: info 💡 It's always important to understand what exactly is being compared, so you can correctly interpret the results.
:::

![Benchmarks will tell](/blog/collider/img-07.png)

Use benchmarks, verify your assumptions, and find optimal solutions.
