Implement binary search on atomic subintervals#107
Conversation
5c9676a to
398f6a4
Compare
|
Hmm, I can probably write up a kludge of some kind for 3.9, but FWIW, 3.9 is EOL https://endoflife.date/python so maybe it can be dropped? |
Went EOL on 2025-10-31.
|
Also...slightly tempted to try using galloping search for some more pathological cases (e.g. but I don't want to make this more complicated/bug-prone than it needs to be right now. |
When Interval.overlaps, __and__, and __contains__ iterate the atomic subintervals, they use linear search to find the start point for candidate intersection. But since the subintervals are already disjoint and sorted, binary search can be used instead. Fixes AlexandreDecan#106.
398f6a4 to
ad884bb
Compare
I prefer keeping support for 3.9 as much as possible. What is the problem exactly? Also, could you do some benchmarks so we can actually check whether the extra "complexity" is worth it? Thanks. |
|
For 3.9, it's because bisect doesn't take a Possible workarounds I can think of:
Re: benchmarks, how do you want them added/run? There's only pass/fail unit tests at the moment. |
|
Thanks. I'll check whether there are some backports for 3.9, would be good to keep compatibility. Otherwise, an option would be to have two distinct code paths but I tend to dislike this idea :-) For the benchmarks, I mostly want to be sure that we are not adding any overhead for smaller cases. Informal benchmarks are enough (and if I remember correctly we already have some code living in some issue/pr for that, I'll try to find it and will post here) |
|
I can't find it but we can easily redo it or even use copilot for that. |
|
FWIW, as an ad-hoc test: Before change: After change: |
|
Thanks. I see little convincing options for 3.9, perhaps dropping it is the "best" we can do given it's eol, as you said. |
|
Would it make sense to integrate the early-out optimisation directly in the iterator? |
|
I can add all-or-nothing optimization for It's tricky though. I think keeping the optimization still makes sense for But, if I leave the early-out optimizations in, I don't think the |
|
Indeed, thanks. I'm likely to merge this pr "soon", I just need to check it on something else than my phone before :-D (edited: likely this Sunday, not sure I'll have time for it today) |
| - Improve performance of `Interval` creation and union for large disjunctions of overlapping intervals. | ||
| - Improve performance of `Interval.__contains__` for values. | ||
| - Drop official support for Python 3.9. | ||
| - Improve performance of `Interval.overlaps`, `__and__`, and `__contains__` for large, complex intervals when applied to small subintervals (see [#106](https://github.com/AlexandreDecan/portion/issues/106)). |
There was a problem hiding this comment.
You may directly refer to this pr instead of the issue, and mention your name if you want (see
Line 66 in 9d6d885
When
Interval.overlaps,__and__, and__contains__iterate the atomic subintervals, they use linear search to find the start point for candidate intersection. But since the subintervals are already disjoint and sorted, binary search can be used instead.Fixes #106.