Severity: MEDIUM — two tabular CSV problems pass dry-run with no warning. Found during the #67 stress sweep.
(a) Duplicate header columns silently collapsed
printf 'a,a,label\n1,2,x\n3,4,y\n' > tab/duphdr/data.csv
tracebloc dataset push ./tab/duphdr ... --category tabular_classification --dry-run
# → "Inferred schema for 2 column(s)" ✔ (the duplicate `a` was silently dropped)
Root cause: InferSchema() (internal/push/tabular.go:212) builds schema as a map[string]string keyed by column name, so the second a overwrites the first — a,a,label → 2 columns, with one column's data silently misaligned/dropped. No detection or warning.
(b) Header-only / zero-data-row CSV passes
printf 'a,b,label\n' > tab/headeronly/data.csv # header, no rows
tracebloc dataset push ./tab/headeronly ... --dry-run
# → warning only about column typing; ✔ Dry-run complete
A CSV with zero data rows types every column nullable-FLOAT and passes. The "no values in the sample" warning is about typing, not "your dataset has 0 rows" — so an empty dataset gets a green check.
Expected
- Detect repeated names in the header and error (or warn) —
Error: duplicate column "a" in header; column names must be unique.
- Detect a row count of 0 after the header and warn/error —
Error: data.csv has a header but no data rows.
Part of #67.
Severity: MEDIUM — two tabular CSV problems pass dry-run with no warning. Found during the #67 stress sweep.
(a) Duplicate header columns silently collapsed
Root cause:
InferSchema()(internal/push/tabular.go:212) buildsschemaas amap[string]stringkeyed by column name, so the secondaoverwrites the first —a,a,label→ 2 columns, with one column's data silently misaligned/dropped. No detection or warning.(b) Header-only / zero-data-row CSV passes
A CSV with zero data rows types every column nullable-FLOAT and passes. The "no values in the sample" warning is about typing, not "your dataset has 0 rows" — so an empty dataset gets a green check.
Expected
Error: duplicate column "a" in header; column names must be unique.Error: data.csv has a header but no data rows.Part of #67.