fix(database): force UTF-8 for embedded Postgres on non-UTF-8 OS locales #3735
Open
Rooseveltfj wants to merge 1 commit into
What is this contribution about?
On a Windows host with a non-UTF-8 OS locale (e.g. Portuguese_Brazil.1252),
the embedded Postgres ends up on WIN1252 at two layers:
initdb creates the cluster using the OS locale, and the pg client connection defaults
client_encoding to WIN1252. Any character in seeded data that WIN1252 cannot represent (e.g. the → arrows in the default connection metadata) then fails to encode, and the INSERT throws 22P05 (untranslatable character).
The impact is not just a log error: it aborts the organization-creation hook
during local admin seeding, leaving the database half-seeded — the user can
then never log in ("Local admin user not found").
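To illustrate why the → arrow triggers 22P05 while accented Portuguese text does not, here is a minimal sketch that lists the characters in a string that WIN1252 cannot represent. The helper names are hypothetical and not part of this PR; the table covers the printable WIN1252 repertoire (ASCII, U+00A0 to U+00FF, and the 27 extra characters mapped into 0x80 to 0x9F).

```typescript
// Hypothetical helper, for illustration only; not code from this PR.
// Code points that WIN1252 maps into the 0x80-0x9F byte range.
const WIN1252_EXTRAS = new Set<number>([
  0x20ac, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021, 0x02c6, 0x2030,
  0x0160, 0x2039, 0x0152, 0x017d, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022,
  0x2013, 0x2014, 0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x017e, 0x0178,
]);

// True when the code point has a WIN1252 byte: ASCII, U+00A0..U+00FF,
// or one of the extras above.
function isWin1252Encodable(codePoint: number): boolean {
  return (
    codePoint <= 0x7f ||
    (codePoint >= 0xa0 && codePoint <= 0xff) ||
    WIN1252_EXTRAS.has(codePoint)
  );
}

// Returns every character of `text` that a WIN1252 session could not encode.
function findUnencodable(text: string): string[] {
  return Array.from(text).filter(
    (ch) => !isWin1252Encodable(ch.codePointAt(0) ?? 0),
  );
}
```

The arrow U+2192 is outside all three ranges, so a WIN1252 client or cluster has no byte for it, whereas characters such as ã or € pass through unchanged.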
This forces UTF-8 at both layers:
initdb runs with --encoding=UTF8 --locale=C, so the cluster is created in UTF-8 regardless of the OS locale.
The pg connection passes options: "-c client_encoding=UTF8", so the session encoding is UTF-8 on every connection.
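The two layers above can be sketched as plain configuration values. This is a hedged sketch, not the PR's actual diff: the constant and function names and the `ConnectionSettings` shape are hypothetical, and only the flag and option strings are taken from the description. It assumes the embedded Postgres wrapper accepts a list of extra initdb flags and that the client accepts an `options` string, as node-postgres does.

```typescript
// Hypothetical names; only the flag and option strings come from the PR text.

// Flags handed to initdb so a new cluster is UTF-8 regardless of OS locale.
const UTF8_INITDB_FLAGS: readonly string[] = ["--encoding=UTF8", "--locale=C"];

// Startup option that pins the session encoding on each new connection.
const CLIENT_ENCODING_OPTION = "-c client_encoding=UTF8";

// Minimal subset of connection settings used in this sketch.
interface ConnectionSettings {
  connectionString: string;
  options?: string;
}

// Returns a copy of `settings` whose `options` include the UTF-8 override,
// preserving any options that were already present.
function withUtf8ClientEncoding(settings: ConnectionSettings): ConnectionSettings {
  const existing = (settings.options ?? "").trim();
  if (existing.includes(CLIENT_ENCODING_OPTION)) {
    return { ...settings, options: existing };
  }
  const options = existing
    ? `${existing} ${CLIENT_ENCODING_OPTION}`
    : CLIENT_ENCODING_OPTION;
  return { ...settings, options };
}
```

Appending rather than overwriting keeps other startup options (for example a `search_path`) intact, and applying the helper twice leaves the settings unchanged.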
No-op on systems already on a UTF-8 locale (macOS/Linux).
How to Test
.deco
bun run dev and open http://localhost:3000
Migration Notes
The initdb flags apply to newly created clusters; the client_encoding option applies to every new connection. Existing UTF-8 setups are unchanged.
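One way to confirm both layers on a given machine is to ask the server directly with `SHOW server_encoding` and `SHOW client_encoding`. The sketch below is an assumption about how such a check might look, not something this PR adds; `QueryFn` and `readEncodings` are hypothetical names, and the query function is injected so the check works with any client that returns `rows`.

```typescript
// Hypothetical verification helper; not part of this PR.
// Any function that runs SQL and resolves to an object with `rows` fits,
// e.g. (sql) => client.query(sql) with a node-postgres client.
type QueryFn = (sql: string) => Promise<{ rows: Array<Record<string, string>> }>;

interface EncodingReport {
  server: string | undefined;
  client: string | undefined;
  ok: boolean;
}

// Reads the cluster-level and session-level encodings and reports whether
// both are UTF8. SHOW returns one row keyed by the setting name.
async function readEncodings(query: QueryFn): Promise<EncodingReport> {
  const server = (await query("SHOW server_encoding")).rows[0]?.server_encoding;
  const client = (await query("SHOW client_encoding")).rows[0]?.client_encoding;
  return { server, client, ok: server === "UTF8" && client === "UTF8" };
}
```

On a cluster created before this change on a WIN1252 host, `server` would still report the old encoding, since the initdb flags only affect newly created clusters.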
Review Checklist
Summary by cubic
Forces UTF-8 for embedded Postgres and client sessions to avoid encoding errors on Windows non‑UTF‑8 locales. Prevents seeding failures (22P05) and allows local admin auto‑login; no‑op on UTF‑8 systems.
Bug Fixes
initdb with --encoding=UTF8 --locale=C --locale-provider=libc for embedded clusters.
options: "-c client_encoding=UTF8" for all Postgres connections.
Migration
initdb flags affect new clusters only; existing UTF‑8 setups are unchanged.
Written for commit 0394736. Summary will update on new commits.