
Commit 17f12c6

Add a TODO for removing FutureWarning path once blosc2.open() defaults to mode='r'.
1 parent f8021ae commit 17f12c6

2 files changed

Lines changed: 342 additions & 0 deletions


Lines changed: 341 additions & 0 deletions
@@ -0,0 +1,341 @@
1+
# Plan For Changing `blosc2.open()` Default Mode To Read-Only
2+
3+
## Goal
4+
5+
Change the default mode for `blosc2.open(...)` from `"a"` to `"r"` so that
6+
opening an existing object is non-mutating and unsurprising by default.
7+
8+
The change should:
9+
10+
- reduce accidental write access
11+
- avoid implicit unpack / rewrite work for store-backed containers
12+
- align with user expectations for a generic `open(...)` API
13+
- preserve a smooth migration path for existing code that relied on writable
14+
opens without an explicit `mode=`
15+
16+
This plan is for later consideration and rollout design. It does not assume
17+
that the change should land immediately.
18+
19+
## Motivation
20+
21+
Today, `blosc2.open(...)` defaults to `"a"` in
22+
[src/blosc2/schunk.py](/Users/faltet/blosc/python-blosc2/src/blosc2/schunk.py).
23+
24+
That means:
25+
26+
- opening a `.b2z` store without `mode=` may create a writable working copy
27+
- append-mode store opens may unpack zip-backed stores into a temporary working
28+
directory immediately
29+
- code that only intends to inspect metadata or query data can still enter a
30+
mutation-capable path by accident
31+
32+
This is especially surprising for:
33+
34+
- `TreeStore`
35+
- `DictStore`
36+
- `CTable`
37+
- other container-like objects opened through the generic dispatcher
38+
39+
By contrast, users generally expect a bare `open(path)` call to be safe for
40+
inspection unless they explicitly request write access.
41+
42+
## Current Situation
43+
44+
### Default values today
45+
46+
The following default to `"a"` today:
47+
48+
- `blosc2.open(...)`
49+
- `DictStore(...)`
50+
- `TreeStore(...)`
51+
- `CTable(...)` constructor when opening/creating through `urlpath`
52+
53+
At the same time:
54+
55+
- `CTable.open(...)` already defaults to `"r"`
56+
57+
This creates an inconsistency where:
58+
59+
- `blosc2.open("table.b2z")` is writable by default
60+
- `blosc2.CTable.open("table.b2z")` is read-only by default
61+
62+
### Concrete user surprise
63+
64+
For a `.b2z` store, append mode currently does extra work:
65+
66+
1. create a working directory (usually temporary)
67+
2. extract the archive into that working directory
68+
3. serve reads/writes from the extracted layout
69+
4. repack on close
70+
71+
This is implemented in
72+
[src/blosc2/dict_store.py](/Users/faltet/blosc/python-blosc2/src/blosc2/dict_store.py).
73+
74+
That behavior is reasonable when the caller explicitly asked for `"a"`, but
75+
surprising when it is triggered only because `mode` was omitted.
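
To make the cost of those four steps concrete, here is a minimal sketch using only the standard library. It is not the `DictStore` implementation; the function names are hypothetical, and a plain zip file stands in for a `.b2z` store. It contrasts an append-style round trip with a read-only listing that never touches the disk layout.

```python
import os
import shutil
import tempfile
import zipfile


def append_style_roundtrip(archive_path: str, new_name: str, payload: bytes) -> None:
    """Hypothetical helper that mimics the four append-mode steps."""
    workdir = tempfile.mkdtemp()  # 1. create a working directory
    try:
        with zipfile.ZipFile(archive_path) as zf:
            zf.extractall(workdir)  # 2. extract the archive
        with open(os.path.join(workdir, new_name), "wb") as fh:
            fh.write(payload)  # 3. serve a write from the extracted layout
        with zipfile.ZipFile(archive_path, "w") as zf:  # 4. repack on close
            for root, _dirs, files in os.walk(workdir):
                for name in files:
                    full = os.path.join(root, name)
                    zf.write(full, os.path.relpath(full, workdir))
    finally:
        shutil.rmtree(workdir)


def read_only_names(archive_path: str) -> list[str]:
    """A read-only open can list members without extracting anything."""
    with zipfile.ZipFile(archive_path) as zf:
        return zf.namelist()
```

Even when the caller only wants the equivalent of `read_only_names`, an omitted `mode` today routes through the heavier path.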
76+
77+
## Desired End State
78+
79+
The target behavior is:
80+
81+
```python
blosc2.open(path)
```
84+
85+
should behave as if the user had written:
86+
87+
```python
blosc2.open(path, mode="r")
```
90+
91+
unless the object category does not support read-only opening for technical
92+
reasons. In such cases, the exception should be explicit and documented.
93+
94+
The user should need to opt into mutation with:
95+
96+
- `mode="a"`
97+
- `mode="w"`
98+
99+
## Design Principles
100+
101+
The migration should follow these rules:
102+
103+
- do not silently change semantics without a warning phase
104+
- make the warning text concrete and actionable
105+
- update all docs and examples before flipping the default
106+
- keep the opt-in writable paths unchanged
107+
- avoid introducing ambiguity about whether a store may be mutated
108+
- prefer explicit `mode=` in library docs even after the default changes
109+
110+
## Recommended Rollout
111+
112+
### Phase 0: prepare the codebase
113+
114+
Before warning users:
115+
116+
1. audit internal calls to `blosc2.open(...)`
117+
2. make all internal call sites spell out `mode=`
118+
3. update examples, docs, and tests to use explicit modes
119+
4. document the difference between:
120+
- `mode="r"`: inspect/query only
121+
- `mode="a"`: may unpack and repack stores
122+
- `mode="w"`: overwrite/create
123+
124+
This phase reduces ambiguity and ensures that warnings raised later point at user code that omits `mode=`, not at the library's own calls.
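
The audit in step 1 can be partly automated. The following is a rough sketch, not an existing project tool: the function name is hypothetical, it only matches the literal spelling `blosc2.open(...)`, and it assumes `mode` is the second positional parameter.

```python
import ast


def find_open_calls_without_mode(source: str, filename: str = "<string>") -> list[int]:
    """Return line numbers of `blosc2.open(...)` calls that omit `mode`."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the literal attribute access `blosc2.open`.
        is_blosc2_open = (
            isinstance(func, ast.Attribute)
            and func.attr == "open"
            and isinstance(func.value, ast.Name)
            and func.value.id == "blosc2"
        )
        if not is_blosc2_open:
            continue
        has_mode_keyword = any(kw.arg == "mode" for kw in node.keywords)
        # Assumption: a second positional argument is the mode.
        has_mode_positional = len(node.args) >= 2
        if not (has_mode_keyword or has_mode_positional):
            hits.append(node.lineno)
    return sorted(hits)
```

Aliased imports (`from blosc2 import open`, `import blosc2 as b2`) and calls that pass `mode` through `**kwargs` are not handled, so the output is a starting list for manual review, not a complete audit.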
125+
126+
### Phase 1: deprecation warning
127+
128+
Keep the runtime default as `"a"`, but emit a `FutureWarning` when:
129+
130+
- `blosc2.open(...)` is called without an explicit `mode=`
131+
132+
The warning should fire only when `mode` was omitted, not when the caller
133+
explicitly requested `"a"`.
134+
135+
Recommended warning text:
136+
137+
```python
FutureWarning(
    "blosc2.open() currently defaults to mode='a', but this will change "
    "to mode='r' in a future release. Pass mode='a' explicitly to keep "
    "writable behavior, or mode='r' for read-only access."
)
```
144+
145+
Notes:
146+
147+
- the wording should mention both the current and future defaults
148+
- the wording should explain how to preserve current behavior
149+
- the wording should not be container-specific
150+
151+
### Phase 2: flip the default
152+
153+
In the next planned release window in which a compatibility-breaking change is acceptable:
154+
155+
- change the default mode in `blosc2.open(...)` from `"a"` to `"r"`
156+
157+
At that point:
158+
159+
- calls with omitted `mode` become read-only
160+
- code that needs writable behavior must use `mode="a"` explicitly
161+
162+
### Phase 3: remove warning-specific scaffolding
163+
164+
After the default flip has been out for one full release cycle:
165+
166+
- remove temporary warning helpers and migration notes that are no longer
167+
useful
168+
- keep release notes and changelog entries for historical context
169+
170+
## Implementation Notes
171+
172+
### Tracking whether `mode` was omitted
173+
174+
To emit a warning only when appropriate, `blosc2.open(...)` needs to
175+
distinguish:
176+
177+
- caller omitted `mode`
178+
- caller passed `mode="a"` explicitly
179+
180+
A practical implementation is:
181+
182+
1. change the function signature internally to use a sentinel
183+
2. resolve the effective mode inside the function
184+
3. warn only when the sentinel path is used
185+
186+
For example:
187+
188+
```python
189+
_MODE_SENTINEL = object()
190+
191+
192+
def open(urlpath, mode=_MODE_SENTINEL, **kwargs):
193+
mode_was_omitted = mode is _MODE_SENTINEL
194+
if mode_was_omitted:
195+
mode = "a" # Phase 1
196+
warnings.warn(...)
197+
```
198+
199+
Later, in Phase 2:
200+
201+
```python
if mode_was_omitted:
    mode = "r"
```
205+
206+
This is better than relying on `mode="a"` in the signature because that
207+
signature cannot tell whether the user explicitly passed `"a"`.
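
For reference, a self-contained and runnable version of the Phase 1 snippet above might look like the following. It is a sketch only: `open_sketch` is a hypothetical stand-in that returns the resolved mode instead of opening anything, and the choice of `stacklevel=2` is an assumption that the warning is raised directly inside the public function.

```python
import warnings

_MODE_SENTINEL = object()  # illustrative name, mirroring the snippet above


def open_sketch(urlpath, mode=_MODE_SENTINEL, **kwargs):
    """Stand-in for blosc2.open(): resolve the mode and return it."""
    if mode is _MODE_SENTINEL:
        mode = "a"  # Phase 1: runtime default is unchanged
        warnings.warn(
            "blosc2.open() currently defaults to mode='a', but this will change "
            "to mode='r' in a future release. Pass mode='a' explicitly to keep "
            "writable behavior, or mode='r' for read-only access.",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
    return mode
```

With `stacklevel=2` the warning is reported against the user's call site, which is what makes the message actionable. If the warning were moved into a helper called from `open`, the stack level would need to grow accordingly.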
208+
209+
### Scope of change
210+
211+
This plan is specifically about `blosc2.open(...)`.
212+
213+
It does **not** require changing the defaults of:
214+
215+
- `DictStore(...)`
216+
- `TreeStore(...)`
217+
- `CTable(...)`
218+
219+
at the same time.
220+
221+
However, the docs should explain that:
222+
223+
- constructor-style APIs may still default to `"a"`
224+
- generic `blosc2.open(...)` becomes read-only by default
225+
226+
This narrower scope reduces breakage and focuses on the highest-surprise entry
227+
point first.
228+
229+
## Compatibility Risks
230+
231+
The main breakage risk is downstream code that relies on:
232+
233+
```python
obj = blosc2.open(path)
obj[...] = ...
```
237+
238+
without ever spelling out `mode="a"`.
239+
240+
After the default flip, that code may:
241+
242+
- fail with a read-only error
243+
- stop persisting modifications
244+
- expose behavior differences only at runtime
245+
246+
This is why the warning phase is important.
247+
248+
### Secondary risk: tests that mutate after open
249+
250+
Internal and downstream tests may open objects generically and then mutate
251+
them. These need to be found and updated during Phase 0.
252+
253+
### Secondary risk: docs and notebooks
254+
255+
Tutorials that currently omit `mode=` may accidentally teach users the old
256+
behavior. These should be updated before the warning phase begins.
257+
258+
## Documentation Changes
259+
260+
### API docs
261+
262+
Update the docstring for `blosc2.open(...)` to:
263+
264+
- describe the migration
265+
- clearly document the meaning of each mode
266+
- mention that read-only is the recommended mode for inspection/querying
267+
268+
### Examples
269+
270+
Update examples to use explicit modes consistently:
271+
272+
- inspection/querying: `mode="r"`
273+
- mutation of existing stores: `mode="a"`
274+
- create/overwrite: `mode="w"`
275+
276+
### User-facing migration note
277+
278+
Add a short migration note to release notes:
279+
280+
- "`blosc2.open()` now defaults to read-only; pass `mode='a'` explicitly if
281+
you need writable behavior."
282+
283+
## Testing Plan
284+
285+
### Phase 1 tests
286+
287+
Add tests that verify:
288+
289+
- omitted `mode` emits `FutureWarning`
290+
- explicit `mode="a"` does not warn
291+
- explicit `mode="r"` does not warn
292+
- effective behavior remains writable during the warning phase
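
The checks listed above could be written roughly as follows. This is a sketch under stated assumptions: `fake_open` is a hypothetical stand-in for the Phase 1 `blosc2.open()` so the example runs without the library, and `emits_future_warning` is an illustrative helper, not an existing test utility.

```python
import warnings

_SENTINEL = object()


def fake_open(urlpath, mode=_SENTINEL):
    """Hypothetical stand-in for the Phase 1 blosc2.open()."""
    if mode is _SENTINEL:
        warnings.warn("default mode will change to 'r'", FutureWarning, stacklevel=2)
        mode = "a"
    return mode


def emits_future_warning(func, *args, **kwargs) -> bool:
    """Call func and report whether it emitted a FutureWarning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # bypass once-per-location filtering
        func(*args, **kwargs)
    return any(issubclass(w.category, FutureWarning) for w in caught)


def test_omitted_mode_warns():
    assert emits_future_warning(fake_open, "x.b2nd")


def test_explicit_modes_do_not_warn():
    assert not emits_future_warning(fake_open, "x.b2nd", mode="a")
    assert not emits_future_warning(fake_open, "x.b2nd", mode="r")


def test_default_stays_writable_in_phase_1():
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", FutureWarning)
        assert fake_open("x.b2nd") == "a"
```

In the real test suite the stand-in would be replaced by `blosc2.open` on a temporary file, and the last test would assert that a write actually succeeds.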
293+
294+
### Phase 2 tests
295+
296+
After the flip, add/update tests that verify:
297+
298+
- omitted `mode` is read-only
299+
- writes after omitted-mode open fail clearly
300+
- explicit `mode="a"` still allows mutation
301+
- `.b2z` omitted-mode open does not enter append-style write setup
302+
303+
### Documentation tests
304+
305+
Where practical, examples should use explicit `mode=` so doctests remain clear
306+
and stable across the transition.
307+
308+
## Optional Compatibility Escape Hatch
309+
310+
If downstream breakage risk is considered high, one temporary option is an
311+
environment-variable override for one transition cycle, for example:
312+
313+
- `BLOSC2_OPEN_DEFAULT_MODE=a`
314+
315+
This should only be used if needed. It adds complexity and should not become a
316+
permanent configuration surface unless there is a strong operational reason.
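
If the escape hatch is adopted, resolving the default could be as small as the sketch below. The helper name is hypothetical, and restricting the override to `"r"` and `"a"` is an assumption made here so that an environment variable can never turn a bare open into an overwrite.

```python
import os

_ALLOWED_DEFAULTS = {"r", "a"}


def default_open_mode(environ=os.environ) -> str:
    """Return the mode to use when the caller omits `mode` (after the flip)."""
    override = environ.get("BLOSC2_OPEN_DEFAULT_MODE")
    if override is None:
        return "r"
    if override not in _ALLOWED_DEFAULTS:
        raise ValueError(
            f"BLOSC2_OPEN_DEFAULT_MODE must be 'r' or 'a', got {override!r}"
        )
    return override
```

Taking the mapping as a parameter keeps the helper testable without mutating the process environment.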
317+
318+
## Related Follow-Up Worth Considering
319+
320+
Even if the default changes to `"r"`, append mode for `.b2z` may still be more
321+
eager than desirable.
322+
323+
A separate improvement could make `.b2z` append behavior lazier:
324+
325+
- open in `"a"` without extracting immediately
326+
- extract only on first mutation
327+
- keep read-only-style fast paths for pure reads
328+
329+
That is orthogonal to the default-mode change and can be planned separately.
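
As an illustration of the lazier behavior, here is a minimal standard-library sketch. The class is hypothetical and uses a plain zip file in place of a `.b2z` store; it is meant to show the control flow (extract on first mutation, skip repacking when nothing changed), not to propose an implementation for `DictStore`.

```python
import os
import shutil
import tempfile
import zipfile


class LazyAppendArchive:
    """Hypothetical sketch: extract a zip archive only on first mutation."""

    def __init__(self, path: str):
        self.path = path
        self.workdir = None  # stays None until the first write

    def read(self, name: str) -> bytes:
        if self.workdir is None:
            # Fast path: read straight from the archive, no extraction.
            with zipfile.ZipFile(self.path) as zf:
                return zf.read(name)
        with open(os.path.join(self.workdir, name), "rb") as fh:
            return fh.read()

    def write(self, name: str, payload: bytes) -> None:
        if self.workdir is None:
            # First mutation: pay the extraction cost now.
            self.workdir = tempfile.mkdtemp()
            with zipfile.ZipFile(self.path) as zf:
                zf.extractall(self.workdir)
        with open(os.path.join(self.workdir, name), "wb") as fh:
            fh.write(payload)

    def close(self) -> None:
        if self.workdir is None:
            return  # nothing was mutated, so there is nothing to repack
        with zipfile.ZipFile(self.path, "w") as zf:
            for root, _dirs, files in os.walk(self.workdir):
                for fname in files:
                    full = os.path.join(root, fname)
                    zf.write(full, os.path.relpath(full, self.workdir))
        shutil.rmtree(self.workdir)
        self.workdir = None
```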
330+
331+
## Summary
332+
333+
The recommended path is:
334+
335+
1. make internal/docs/example usage explicit
336+
2. add a `FutureWarning` when `blosc2.open(...)` is called without `mode=`
337+
3. flip the default from `"a"` to `"r"` in the next suitable release window
338+
4. keep writable behavior available via explicit `mode="a"`
339+
340+
This delivers a safer and less surprising user experience while still giving
341+
existing code a clear migration path.

src/blosc2/schunk.py

Lines changed: 1 addition & 0 deletions
@@ -1835,6 +1835,7 @@ def open(
     # Resolve the sentinel before URLPath check so we can raise the correct
     # error without also triggering the deprecation warning for invalid calls.
     if mode is _OPEN_MODE_SENTINEL:
+        # TODO: remove the sentinel/FutureWarning path once blosc2.open() defaults to mode="r".
         warnings.warn(
             "blosc2.open() currently defaults to mode='a', but this will change "
             "to mode='r' in a future release. Pass mode='a' explicitly to keep "
