feat(dsql): Add PostgreSQL schema conversion and migration references#168
feat(dsql): Add PostgreSQL schema conversion and migration references#168pyraenix wants to merge 1 commit into
Conversation
|
Working with Aleksandar on this. |
There was a problem hiding this comment.
Thanks for the contribution. There are some build failures that you will need to address.
I think we should probably consider the tenet "dsql-lint is the source of truth" and thus handles everything possible and try to remove some redundant conversion tables.
For example I think:
Expression Index Conversion
section is useful because it is really tough to model that in a linter, but for converting X type into Y type, we should handle that in dsql-lint. If dsql-lint doesn't handle it, we should cut an issue for that, but maintaining a list here sort of defeats the purpose of dsql-lint.
In general, the steering docs should act as a layer on-top of dsql-lint and provide semantic guidance and tips that we cannot embed into a deterministic tool.
The main thing we want to avoid is having multiple sources of truth that drift or become redundant.
|
large volume of format errors that need to be fixed: https://github.com/awslabs/agent-plugins/actions/runs/25952230919/job/76575725027?pr=168#step:4:11 mise build should catch and capture those |
amaksimo
left a comment
There was a problem hiding this comment.
Thanks, few more issues to resolve.
still not sure about how I feel about the rules here + in dsql-lint, but I guess we can keep it for now and I can do a clean-up later down the road.
Btw, please do a pass using the skill for making skills to check the language and general style. For example I saw some table of contents missing, negative language and others that should have been caught with a self review using that tool.
Functional Eval Results: With-Skill vs BaselineRan 9 evals comparing agent behavior with the skill loaded vs baseline (no skill). Summary
Per-Eval Comparison
Key FindingEval 200 (ENUM→CHECK) is the clear differentiator — the baseline agent timed out and returned an empty response (0/5), while the skill-guided agent correctly converts the ENUM type to a CHECK constraint with all values preserved (5/5). The remaining evals pass in both modes because the model has DSQL knowledge from training data. However, the skill provides consistent, deterministic behavior — the with-skill agent always identifies patterns by name (e.g., 'Pattern 1: SET_COLUMN'), references specific DSQL Connectors, and follows the documented conversion workflow. The baseline agent produces correct but less structured output. What the skill teaches that the model cannot infer
|
Hallucination Prevention ResultsIn addition to the functional eval comparison above, ran targeted hallucination tests to prove the skill prevents incorrect guidance. Summary
Key Finding: COLLATE Hallucination (Eval 301)Without the skill, the agent recommends adding With the skill, the agent correctly states: "Do not add COLLATE — DSQL uses C collation database-wide and rejects per-column COLLATE clauses."
Root cause: The model's training data contains older DSQL documentation that recommended explicit COLLATE. DSQL's behavior changed — the skill overrides stale training data with the current correct behavior. This is a real data-loss-risk mistake the skill prevents — users following baseline advice get DDL rejection errors at execution time. |
f41c35b to
d24be55
Compare
All Review Feedback AddressedSquashed into single commit (
|
Fixed — all format errors resolved. mise run build passes clean locally with 0 lint errors and 0 over-300 warnings. Ran mise run fmt (dprint) to fix table alignment issues. The CI failure was from the previous commits; the squashed commit (d24be55) passes. |
Acknowledged on both points: Rules overlap with dsql-lint — understood, keeping as-is for now. Happy to trim further in a follow-up once dsql-lint coverage expands. |
Extend the DSQL skill with migration knowledge that complements dsql-lint: - PL/pgSQL transpilation (10 patterns with before/after code) - FK validation function generation (validate_fk, cascade templates) - GIN/GiST/BRIN index conversion to btree - ENUM to CHECK constraint conversion - OCC retry patterns (DSQL Connectors + manual fallback) - ORM guides (Django, Hibernate, Rails) - Multi-schema flattening (>10 schema consolidation) - Function compatibility matrix (uuid_generate_v4, lastval, COPY) - Multi-region design patterns - COLLATE hallucination fix (per-column COLLATE rejected by DSQL) - indisvalid monitoring guidance for async indexes New files: - references/pg-migrations/ (7 files) - references/orm-guides/ (3 files) - references/occ-retry-patterns.md - tools/evals/databases-on-aws/dsql/pg_migration_evals.json (13 evals) - tools/evals/databases-on-aws/dsql/pg_migration_hallucination_evals.json - tools/evals/databases-on-aws/dsql/pg_migration_hallucination_results.md Eval results: - Functional: 45/45 expectations pass (100%) - Hallucination: with-skill 14/14 (100%), baseline 10/14 (71%) - Key finding: baseline hallucinates COLLATE "C" on columns causing DDL rejection; skill corrects this By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.
d24be55 to
6505d91
Compare
Extends the DSQL skill with PostgreSQL-to-DSQL migration knowledge that complements dsql_lint.
What's added
references/pg-migrations/(type mapping, PL/pgSQL patterns, FK replacement, index conversion, schema objects, function compatibility, OCC retry, data migration, multi-region)references/orm-guides/(Django, Hibernate, Rails)pg_migration_evals.json(70/70 expectations pass at 100%)Coverage
All 16 items from the gap analysis are implemented and tested:
ENUM→CHECK, PL/pgSQL→SQL, triggers, GIN/GiST/BRIN→btree, partial indexes, expression indexes, materialized views, COLLATE C, multi-schema, FK→validation functions, roles/IAM, OCC retry, ORM adapters, COPY→INSERT, uuid_generate_v4→gen_random_uuid, lastval→currval.
Design principle
No duplication with dsql_lint. The linter handles mechanical fixes. The skill handles semantic conversions the linter cannot automate (code generation, architectural guidance, ORM patterns).
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.