| 1 | +# Plan For Changing `blosc2.open()` Default Mode To Read-Only |
| 2 | + |
| 3 | +## Goal |
| 4 | + |
| 5 | +Change the default mode for `blosc2.open(...)` from `"a"` to `"r"` so that |
| 6 | +opening an existing object is non-mutating and unsurprising by default. |
| 7 | + |
| 8 | +The change should: |
| 9 | + |
| 10 | +- reduce accidental write access |
| 11 | +- avoid implicit unpack / rewrite work for store-backed containers |
| 12 | +- align with user expectations for a generic `open(...)` API |
| 13 | +- preserve a smooth migration path for existing code that relied on writable |
| 14 | + opens without an explicit `mode=` |
| 15 | + |
| 16 | +This plan is for later consideration and rollout design. It does not assume |
| 17 | +that the change should land immediately. |
| 18 | + |
| 19 | +## Motivation |
| 20 | + |
| 21 | +Today, `blosc2.open(...)` defaults to `"a"` in |
| 22 | +[src/blosc2/schunk.py](/Users/faltet/blosc/python-blosc2/src/blosc2/schunk.py). |
| 23 | + |
| 24 | +That means: |
| 25 | + |
| 26 | +- opening a `.b2z` store without `mode=` may create a writable working copy |
| 27 | +- append-mode store opens may unpack zip-backed stores into a temporary working |
| 28 | + directory immediately |
| 29 | +- code that only intends to inspect metadata or query data can still enter a |
| 30 | + mutation-capable path by accident |
| 31 | + |
| 32 | +This is especially surprising for: |
| 33 | + |
| 34 | +- `TreeStore` |
| 35 | +- `DictStore` |
| 36 | +- `CTable` |
| 37 | +- other container-like objects opened through the generic dispatcher |
| 38 | + |
| 39 | +By contrast, users generally expect a bare `open(path)` call to be safe for |
| 40 | +inspection unless they explicitly request write access. |
| 41 | + |
| 42 | +## Current Situation |
| 43 | + |
| 44 | +### Default values today |
| 45 | + |
| 46 | +The following default to `"a"` today: |
| 47 | + |
| 48 | +- `blosc2.open(...)` |
| 49 | +- `DictStore(...)` |
| 50 | +- `TreeStore(...)` |
| 51 | +- `CTable(...)` constructor when opening/creating through `urlpath` |
| 52 | + |
| 53 | +At the same time: |
| 54 | + |
| 55 | +- `CTable.open(...)` already defaults to `"r"` |
| 56 | + |
| 57 | +This creates an inconsistency where: |
| 58 | + |
| 59 | +- `blosc2.open("table.b2z")` is writable by default |
| 60 | +- `blosc2.CTable.open("table.b2z")` is read-only by default |
| 61 | + |
| 62 | +### Concrete user surprise |
| 63 | + |
| 64 | +For a `.b2z` store, append mode currently does extra work: |
| 65 | + |
| 66 | +1. create a working directory (usually temporary) |
| 67 | +2. extract the archive into that working directory |
| 68 | +3. serve reads/writes from the extracted layout |
| 69 | +4. repack on close |
| 70 | + |
| 71 | +This is implemented in |
| 72 | +[src/blosc2/dict_store.py](/Users/faltet/blosc/python-blosc2/src/blosc2/dict_store.py). |
| 73 | + |
| 74 | +That behavior is reasonable when the caller explicitly asked for `"a"`, but |
| 75 | +surprising when it is triggered only because `mode` was omitted. |
| 76 | + |
| 77 | +## Desired End State |
| 78 | + |
| 79 | +The target behavior is: |
| 80 | + |
| 81 | +```python |
| 82 | +blosc2.open(path) |
| 83 | +``` |
| 84 | + |
| 85 | +should behave as if the user had written: |
| 86 | + |
| 87 | +```python |
| 88 | +blosc2.open(path, mode="r") |
| 89 | +``` |
| 90 | + |
| 91 | +unless the object category does not support read-only opening for technical |
| 92 | +reasons. In such cases, the exception should be explicit and documented. |
| 93 | + |
| 94 | +The user should need to opt into mutation with: |
| 95 | + |
| 96 | +- `mode="a"` |
| 97 | +- `mode="w"` |
| 98 | + |
| 99 | +## Design Principles |
| 100 | + |
| 101 | +The migration should follow these rules: |
| 102 | + |
| 103 | +- do not silently change semantics without a warning phase |
| 104 | +- make the warning text concrete and actionable |
| 105 | +- update all docs and examples before flipping the default |
| 106 | +- keep the opt-in writable paths unchanged |
| 107 | +- avoid introducing ambiguity about whether a store may be mutated |
| 108 | +- prefer explicit `mode=` in library docs even after the default changes |
| 109 | + |
| 110 | +## Recommended Rollout |
| 111 | + |
| 112 | +### Phase 0: prepare the codebase |
| 113 | + |
| 114 | +Before warning users: |
| 115 | + |
| 116 | +1. audit internal calls to `blosc2.open(...)` |
| 117 | +2. make all internal call sites spell out `mode=` |
| 118 | +3. update examples, docs, and tests to use explicit modes |
| 119 | +4. document the difference between: |
| 119 | +   - `mode="r"`: inspect/query only |
| 120 | +   - `mode="a"`: may unpack and repack stores |
| 121 | +   - `mode="w"`: overwrite/create |
| 123 | + |
| 123 | +This phase reduces ambiguity and ensures that later warnings point only at call sites that still omit `mode=`. |
| 125 | + |
| 126 | +### Phase 1: deprecation warning |
| 127 | + |
| 128 | +Keep the runtime default as `"a"`, but emit a `FutureWarning` when: |
| 129 | + |
| 130 | +- `blosc2.open(...)` is called without an explicit `mode=` |
| 131 | + |
| 132 | +The warning should fire only when `mode` was omitted, not when the caller |
| 133 | +explicitly requested `"a"`. |
| 134 | + |
| 135 | +Recommended warning text: |
| 136 | + |
| 137 | +```python |
| 138 | +FutureWarning( |
| 139 | +    "blosc2.open() currently defaults to mode='a', but this will change " |
| 140 | +    "to mode='r' in a future release. Pass mode='a' explicitly to keep " |
| 141 | +    "writable behavior, or mode='r' for read-only access." |
| 142 | +) |
| 143 | +``` |
| 144 | + |
| 145 | +Notes: |
| 146 | + |
| 147 | +- the wording should mention both the current and future defaults |
| 148 | +- the wording should explain how to preserve current behavior |
| 149 | +- the wording should not be container-specific |
| 150 | + |
| 151 | +### Phase 2: flip the default |
| 152 | + |
| 153 | +In the next planned release window that permits breaking changes: |
| 154 | + |
| 155 | +- change the default mode in `blosc2.open(...)` from `"a"` to `"r"` |
| 156 | + |
| 157 | +At that point: |
| 158 | + |
| 159 | +- calls with omitted `mode` become read-only |
| 160 | +- code that needs writable behavior must use `mode="a"` explicitly |
| 161 | + |
| 162 | +### Phase 3: remove warning-specific scaffolding |
| 163 | + |
| 164 | +After the default flip has been out for one full release cycle: |
| 165 | + |
| 166 | +- remove temporary warning helpers and migration notes that are no longer |
| 167 | + useful |
| 168 | +- keep release notes and changelog entries for historical context |
| 169 | + |
| 170 | +## Implementation Notes |
| 171 | + |
| 172 | +### Tracking whether `mode` was omitted |
| 173 | + |
| 174 | +To emit a warning only when appropriate, `blosc2.open(...)` needs to |
| 175 | +distinguish: |
| 176 | + |
| 177 | +- caller omitted `mode` |
| 178 | +- caller passed `mode="a"` explicitly |
| 179 | + |
| 180 | +A practical implementation is: |
| 181 | + |
| 182 | +1. change the function signature internally to use a sentinel |
| 183 | +2. resolve the effective mode inside the function |
| 184 | +3. warn only when the sentinel path is used |
| 185 | + |
| 186 | +For example: |
| 187 | + |
| 188 | +```python |
| 189 | +_MODE_SENTINEL = object() |
| 190 | + |
| 191 | + |
| 192 | +def open(urlpath, mode=_MODE_SENTINEL, **kwargs): |
| 193 | +    mode_was_omitted = mode is _MODE_SENTINEL |
| 194 | +    if mode_was_omitted: |
| 195 | +        mode = "a"  # Phase 1 |
| 196 | +        warnings.warn(...) |
| 197 | +``` |
| 198 | + |
| 199 | +Later, in Phase 2: |
| 200 | + |
| 201 | +```python |
| 202 | +if mode_was_omitted: |
| 203 | +    mode = "r" |
| 204 | +``` |
| 205 | + |
| 206 | +This is better than relying on `mode="a"` in the signature because that |
| 207 | +signature cannot tell whether the user explicitly passed `"a"`. |
| 208 | + |
| 209 | +### Scope of change |
| 210 | + |
| 211 | +This plan is specifically about `blosc2.open(...)`. |
| 212 | + |
| 213 | +It does **not** require changing the defaults of: |
| 214 | + |
| 215 | +- `DictStore(...)` |
| 216 | +- `TreeStore(...)` |
| 217 | +- `CTable(...)` |
| 218 | + |
| 219 | +at the same time. |
| 220 | + |
| 221 | +However, the docs should explain that: |
| 222 | + |
| 223 | +- constructor-style APIs may still default to `"a"` |
| 224 | +- generic `blosc2.open(...)` becomes read-only by default |
| 225 | + |
| 226 | +This narrower scope reduces breakage and focuses on the highest-surprise entry |
| 227 | +point first. |
| 228 | + |
| 229 | +## Compatibility Risks |
| 230 | + |
| 231 | +The main breakage risk is downstream code that relies on: |
| 232 | + |
| 233 | +```python |
| 234 | +obj = blosc2.open(path) |
| 235 | +obj[...] = ... |
| 236 | +``` |
| 237 | + |
| 238 | +without ever spelling out `mode="a"`. |
| 239 | + |
| 240 | +After the default flip, that code may: |
| 241 | + |
| 242 | +- fail with a read-only error |
| 243 | +- stop persisting modifications |
| 244 | +- expose behavior differences only at runtime |
| 245 | + |
| 246 | +This is why the warning phase is important. |
| 247 | + |
| 248 | +### Secondary risk: tests that mutate after open |
| 249 | + |
| 250 | +Internal and downstream tests may open objects generically and then mutate |
| 251 | +them. These need to be found and updated during Phase 0. |
| 252 | + |
| 253 | +### Secondary risk: docs and notebooks |
| 254 | + |
| 255 | +Tutorials that currently omit `mode=` may accidentally teach users the old |
| 256 | +behavior. These should be updated before the warning phase begins. |
| 257 | + |
| 258 | +## Documentation Changes |
| 259 | + |
| 260 | +### API docs |
| 261 | + |
| 262 | +Update the docstring for `blosc2.open(...)` to: |
| 263 | + |
| 264 | +- describe the migration |
| 265 | +- clearly document the meaning of each mode |
| 266 | +- mention that read-only is the recommended mode for inspection/querying |
| 267 | + |
| 268 | +### Examples |
| 269 | + |
| 270 | +Update examples to use explicit modes consistently: |
| 271 | + |
| 272 | +- inspection/querying: `mode="r"` |
| 273 | +- mutation of existing stores: `mode="a"` |
| 274 | +- create/overwrite: `mode="w"` |
| 275 | + |
| 276 | +### User-facing migration note |
| 277 | + |
| 278 | +Add a short migration note to release notes: |
| 279 | + |
| 280 | +- “`blosc2.open()` now defaults to read-only; pass `mode='a'` explicitly if |
| 281 | + you need writable behavior.” |
| 282 | + |
| 283 | +## Testing Plan |
| 284 | + |
| 285 | +### Phase 1 tests |
| 286 | + |
| 287 | +Add tests that verify: |
| 288 | + |
| 289 | +- omitted `mode` emits `FutureWarning` |
| 290 | +- explicit `mode="a"` does not warn |
| 291 | +- explicit `mode="r"` does not warn |
| 292 | +- effective behavior remains writable during the warning phase |
| 293 | + |
| 294 | +### Phase 2 tests |
| 295 | + |
| 296 | +After the flip, add/update tests that verify: |
| 297 | + |
| 298 | +- omitted `mode` is read-only |
| 299 | +- writes after omitted-mode open fail clearly |
| 300 | +- explicit `mode="a"` still allows mutation |
| 301 | +- `.b2z` omitted-mode open does not enter append-style write setup |
| 302 | + |
| 303 | +### Documentation tests |
| 304 | + |
| 305 | +Where practical, examples should use explicit `mode=` so doctests remain clear |
| 306 | +and stable across the transition. |
| 307 | + |
| 308 | +## Optional Compatibility Escape Hatch |
| 309 | + |
| 310 | +If downstream breakage risk is considered high, one temporary option is an |
| 311 | +environment-variable override for one transition cycle, for example: |
| 312 | + |
| 313 | +- `BLOSC2_OPEN_DEFAULT_MODE=a` |
| 314 | + |
| 315 | +This should only be used if needed. It adds complexity and should not become a |
| 316 | +permanent configuration surface unless there is a strong operational reason. |
| 317 | + |
| 318 | +## Related Follow-Up Worth Considering |
| 319 | + |
| 320 | +Even if the default changes to `"r"`, append mode for `.b2z` may still be more |
| 321 | +eager than desirable. |
| 322 | + |
| 323 | +A separate improvement could make `.b2z` append behavior lazier: |
| 324 | + |
| 325 | +- open in `"a"` without extracting immediately |
| 326 | +- extract only on first mutation |
| 327 | +- keep read-only-style fast paths for pure reads |
| 328 | + |
| 329 | +That is orthogonal to the default-mode change and can be planned separately. |
| 330 | + |
| 331 | +## Summary |
| 332 | + |
| 333 | +The recommended path is: |
| 334 | + |
| 335 | +1. make internal/docs/example usage explicit |
| 336 | +2. add a `FutureWarning` when `blosc2.open(...)` is called without `mode=` |
| 337 | +3. flip the default from `"a"` to `"r"` in the next suitable release window |
| 338 | +4. keep writable behavior available via explicit `mode="a"` |
| 339 | + |
| 340 | +This delivers a safer and less surprising user experience while still giving |
| 341 | +existing code a clear migration path. |