Commit f504ad0

Simplify the plan for ctable schema
1 parent 0efd450 commit f504ad0

1 file changed

Lines changed: 7 additions & 45 deletions

plans/ctable-schema.md

@@ -122,22 +122,17 @@ Expected shape:
 b2.field(
     b2.float64(ge=0, le=100),
     default=...,
-    default_factory=...,
     cparams=...,
     dparams=...,
     chunks=...,
     blocks=...,
-    title=...,
-    description=...,
-    nullable=...,
 )
 ```

 At minimum for the first version:

 * `spec`
 * `default`
-* `default_factory`
 * `cparams`
 * `dparams`
 * `chunks`
@@ -388,7 +383,6 @@ b2.bool()
 Internal common fields:

 * `dtype`
-* `nullable`
 * `constraints`
 * `python_type`

@@ -480,8 +474,6 @@ For each field, produce a `CompiledColumn` object containing:
 * `spec`
 * `dtype`
 * `default`
-* `default_factory`
-* `nullable`
 * `cparams`
 * `dparams`
 * `chunks`
@@ -538,19 +530,16 @@ Examples:

 ```python
 active: bool = b2.field(b2.bool(), default=True)
-tags: list[str] = b2.field(..., default_factory=list)
 ```

 For the first implementation, keep this conservative:

 * support scalar defaults
-* support `default_factory` only if there is a clear use case
 * reject mutable defaults directly

 On insert:

 * omitted values should be filled from defaults
-* explicit `None` should be accepted only if the field is nullable

 ---

@@ -721,19 +710,7 @@ In other words:
 * `id = b2.field(b2.int64(ge=0))` is not the preferred style because it drops
   the Python annotation

-### 2. Where should nullability live?
-
-Recommended answer: on the schema spec.
-
-Example:
-
-```python
-name: str | None = b2.field(b2.string(max_length=32, nullable=True))
-```
-
-The Python annotation and schema spec should agree.
-
-### 3. Should `b2.field()` require a spec?
+### 2. Should `b2.field()` require a spec?

 Recommended answer: yes for the first version.

@@ -748,7 +725,7 @@ active: bool = True

 but once `b2.field(...)` is used, it should carry an explicit schema spec.

-### 4. How much should Pydantic-specific behavior leak?
+### 3. How much should Pydantic-specific behavior leak?

 Recommended answer: as little as possible.

@@ -785,49 +762,39 @@ Proposed public classes and functions:
 class SchemaSpec:
     dtype: np.dtype
     python_type: type[Any]
-    nullable: bool

     def to_pydantic_kwargs(self) -> dict[str, Any]: ...
     def to_metadata_dict(self) -> dict[str, Any]: ...


 class int64(SchemaSpec):
-    def __init__(
-        self, *, ge=None, gt=None, le=None, lt=None, nullable: bool = False
-    ): ...
+    def __init__(self, *, ge=None, gt=None, le=None, lt=None): ...


 class float64(SchemaSpec):
-    def __init__(
-        self, *, ge=None, gt=None, le=None, lt=None, nullable: bool = False
-    ): ...
+    def __init__(self, *, ge=None, gt=None, le=None, lt=None): ...


 class bool(SchemaSpec):
-    def __init__(self, *, nullable: bool = False): ...
+    def __init__(self): ...


 class string(SchemaSpec):
-    def __init__(
-        self, *, min_length=None, max_length=None, pattern=None, nullable: bool = False
-    ): ...
+    def __init__(self, *, min_length=None, max_length=None, pattern=None): ...


 class bytes(SchemaSpec):
-    def __init__(self, *, min_length=None, max_length=None, nullable: bool = False): ...
+    def __init__(self, *, min_length=None, max_length=None): ...


 def field(
     spec: SchemaSpec,
     *,
     default=MISSING,
-    default_factory=MISSING,
     cparams: dict[str, Any] | None = None,
     dparams: dict[str, Any] | None = None,
     chunks: tuple[int, ...] | None = None,
     blocks: tuple[int, ...] | None = None,
-    title: str | None = None,
-    description: str | None = None,
 ) -> DataclassField: ...
 ```

@@ -863,8 +830,6 @@ class ColumnConfig:
     dparams: dict[str, Any] | None
     chunks: tuple[int, ...] | None
     blocks: tuple[int, ...] | None
-    title: str | None
-    description: str | None


 @dataclass(slots=True)
@@ -874,7 +839,6 @@ class CompiledColumn:
     spec: Any
     dtype: np.dtype
     default: Any
-    default_factory: Any
     config: ColumnConfig


@@ -1095,7 +1059,6 @@ Initial checks to support:

 * numeric `ge`, `gt`, `le`, `lt`
 * string and bytes `min_length`, `max_length`
-* nullability
 * dtype compatibility after coercion

 This module should remain optional in the first PR if the rowwise path is enough
@@ -1154,7 +1117,6 @@ Test scope by file:

 * Pydantic validator generation
 * constraint enforcement
-* nullable vs non-nullable behavior

 `tests/ctable/test_ctable_dataclass_schema.py`

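
Note on the simplified default handling: with `default_factory` and `nullable` gone, the rules left in the plan are "support scalar defaults", "reject mutable defaults directly", and "omitted values should be filled from defaults". The following is a minimal sketch of how those three rules could fit together. It is not the `b2` implementation: `FieldInfo`, `fill_defaults`, the list of mutable types, and the use of plain strings in place of real schema specs are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any

# Sentinel meaning "no default supplied" (stands in for the plan's MISSING).
MISSING = object()

# Assumption: these are the types treated as "mutable defaults" to reject.
_MUTABLE_TYPES = (list, dict, set, bytearray)


@dataclass(frozen=True)
class FieldInfo:
    """Hypothetical stand-in for whatever b2.field() would return."""

    spec: Any
    default: Any = MISSING


def field(spec: Any, *, default: Any = MISSING) -> FieldInfo:
    """Build a field descriptor, rejecting mutable defaults up front."""
    if isinstance(default, _MUTABLE_TYPES):
        raise TypeError(
            f"mutable default of type {type(default).__name__} is not allowed"
        )
    return FieldInfo(spec=spec, default=default)


def fill_defaults(row: dict[str, Any], fields: dict[str, FieldInfo]) -> dict[str, Any]:
    """Return a complete row, filling omitted columns from their defaults."""
    out: dict[str, Any] = {}
    for name, info in fields.items():
        if name in row:
            out[name] = row[name]
        elif info.default is not MISSING:
            out[name] = info.default
        else:
            raise ValueError(f"missing value for required column {name!r}")
    return out


# Plain strings stand in for schema specs such as b2.int64() or b2.bool().
fields = {"id": field("int64"), "active": field("bool", default=True)}
print(fill_defaults({"id": 1}, fields))  # {'id': 1, 'active': True}
```

Rejecting mutable defaults when the field is declared, rather than at insert time, surfaces the error where the schema is written, which is one plausible reason the plan can drop `default_factory` for the first version.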
