fix(continuous-learning-v2): reconfigure stdout to UTF-8 on Windows#1328

Open
georgeradcj wants to merge 1 commit into affaan-m:main from georgeradcj:fix/instinct-cli-md-utf8

Conversation

@georgeradcj

@georgeradcj georgeradcj commented Apr 8, 2026

Summary

Narrowed after rebasing onto latest `main`: #216 (.md glob) and #353 (file I/O UTF-8) already landed, so this PR now only contains the remaining, orthogonal Windows-console fix.

`instinct-cli.py status` renders a confidence bar using the block-element characters `█` / `░`. On Windows, `sys.stdout` inherits the console codec (cp1250 / cp1252 in non-UTF-8 locales), so the `print` call crashes with:

```
UnicodeEncodeError: 'charmap' codec can't encode characters in position 4-13: character maps to <undefined>
```

The two earlier UTF-8 fixes cover reading instinct files — this one covers writing to the console.
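
The failure can be reproduced without a Windows console. A minimal sketch, assuming only the standard library: wrap an in-memory buffer in a `TextIOWrapper` that uses the codec in question and try to write the bar. The helper name `can_write` is hypothetical and not part of `instinct-cli.py`.

```python
import io

# Same characters the status bar uses: U+2588 FULL BLOCK and U+2591 LIGHT SHADE.
BAR = "█" * 9 + "░"


def can_write(text: str, encoding: str) -> bool:
    """Return True if `text` can be written to a text stream using `encoding`.

    Hypothetical helper for illustration; it mimics what print() does when
    sys.stdout is bound to a legacy code page.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError:
        # cp1250 and cp1252 have no mapping for the block characters.
        return False
    return True


if __name__ == "__main__":
    for codec in ("cp1250", "cp1252", "utf-8"):
        print(codec, can_write(BAR, codec))
```

Neither cp1250 nor cp1252 defines the two block characters, which is why the crash does not depend on the instinct content itself.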

Fix

Reconfigure `sys.stdout` / `sys.stderr` to UTF-8 at startup, guarded by `sys.platform == "win32"`. Wrapped in `try/except` because some captured streams (tests, certain wrappers) don't support `reconfigure`. No-op on Linux/macOS, where stdout is already UTF-8.
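
A rough sketch of the guard described above, for readers who want to see its shape. The function name `force_utf8` is hypothetical, and the narrowed exception list is an assumption made for this sketch; the block in this PR catches `Exception` and passes.

```python
import io
import sys


def force_utf8(stream) -> bool:
    """Try to switch a text stream to UTF-8; return True on success.

    Hypothetical helper. Streams without reconfigure() (for example
    io.StringIO used by test harnesses) are left untouched.
    """
    try:
        stream.reconfigure(encoding="utf-8")
    except (AttributeError, ValueError, OSError):
        # Assumption: these cover "stream cannot be reconfigured" cases.
        return False
    return True


# Only Windows needs this; elsewhere the call is skipped entirely.
if sys.platform == "win32":
    force_utf8(sys.stdout)
    force_utf8(sys.stderr)
```

Returning a flag instead of swallowing the error keeps the fallback behaviour while letting a caller decide whether to report the failure.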

Repro (before this PR, on top of current `main`)

```

python skills/continuous-learning-v2/scripts/instinct-cli.py status
============================================================
INSTINCT STATUS - 155 total
============================================================
Project: everything-claude-code
Project instincts: 0
Global instincts: 155

GLOBAL (apply to all projects)

AI-ASSISTANT (7)

Traceback (most recent call last):
...
File "C:\Program Files\Python314\Lib\encodings\cp1250.py", line 19, in encode
return codecs.charmap_encode(input, self.errors, encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 4-13: character maps to <undefined>
```

After

```

python skills/continuous-learning-v2/scripts/instinct-cli.py status
============================================================
INSTINCT STATUS - 155 total
============================================================
Project: everything-claude-code
Global instincts: 155

GLOBAL (apply to all projects)

AI-ASSISTANT (7)

█████████░  90%  ai-tool-description-must-list-return-data

...
```

Test plan

  • Windows 11 + Python 3.14, cp1250 console (`chcp` = 1250): `status` lists all 155 instincts without crashing.
  • Force `chcp 65001` beforehand: still works (reconfigure is a no-op when stdout is already UTF-8).
  • Linux/macOS regression: the `sys.platform == "win32"` guard should make this a no-op, but worth a CI sanity check.
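
One possible way to automate the first check on any platform is to force a legacy stdout codec in a child interpreter with the `PYTHONIOENCODING` environment variable. This is a sketch under assumptions: the child snippets stand in for `instinct-cli.py`, the `run_child` helper is hypothetical, and the "fixed" child reconfigures unconditionally (without the `win32` guard) so the check also runs on Linux/macOS CI.

```python
import os
import subprocess
import sys

BAR = "█████████░"

# Child programs are kept ASCII-only via ascii() so the -c argument is
# safe to pass regardless of the parent's locale.
UNFIXED = f"print({ascii(BAR)})"
FIXED = (
    "import sys\n"
    "sys.stdout.reconfigure(encoding='utf-8')\n"
    f"print({ascii(BAR)})"
)


def run_child(code: str, io_encoding: str) -> subprocess.CompletedProcess:
    """Run `code` in a child interpreter whose stdio uses `io_encoding`."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
    )


if __name__ == "__main__":
    before = run_child(UNFIXED, "cp1250")
    after = run_child(FIXED, "cp1250")
    print("before fix, exit code:", before.returncode)
    print("after fix, output:", after.stdout.decode("utf-8").strip())
```

Because the child's codec is set through the environment, the check does not depend on the console code page of the machine running it.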

Notes on bot reviews from the previous revision

The earlier revision of this PR duplicated changes that have since landed on `main` via #216 and #353. The reviewer comments about `cmd_import` / `cmd_export` / observations file encoding are already fixed on `main` — the new diff only touches stdout reconfigure.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 8, 2026

Walkthrough

The change updates instinct-cli.py to add Windows-specific UTF-8 encoding for stdout/stderr output, extend instinct discovery to load both .md and .yaml files (deduplicated and sorted), and explicitly specify UTF-8 encoding when reading file contents.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Instinct CLI Encoding & Discovery: `skills/continuous-learning-v2/scripts/instinct-cli.py` | Added Windows stdout/stderr UTF-8 reconfiguration with fallback exception handling. Extended `load_all_instincts()` to discover both `.md` and `.yaml` files (deduplicated, sorted) instead of only `.yaml`. Updated file reading to explicitly use UTF-8 encoding. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 With UTF-8 streams on Windows so bright,
And markdown tales dancing alongside YAML's might,
The instincts now flourish in duplicate-free rows,
Each byte encoded perfectly—as the rabbit knows! 📚

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ⚠️ Warning | The title mentions only the Windows UTF-8 reconfiguration fix, but the PR's main objectives include two equally important changes: loading .md instincts (which fixed the 'No instincts found' bug) and UTF-8 file reading. The title partially captures the changeset but omits the primary bug fix. | Update the title to reflect both critical fixes, such as: 'fix(continuous-learning-v2): load .md instincts and handle UTF-8 on Windows' to match the actual scope of changes. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |


@greptile-apps
Contributor

greptile-apps Bot commented Apr 8, 2026

Greptile Summary

This PR adds a Windows-only sys.stdout / sys.stderr UTF-8 reconfiguration block to instinct-cli.py, fixing a UnicodeEncodeError that crashed the status command when rendering the █/░ confidence-bar characters under the default cp1250/cp1252 console codecs. The .md glob fix and file-read encoding="utf-8" changes described in the PR description were already landed in prior PRs (#216, #353); this commit solely addresses the output-side encoding.

Confidence Score: 5/5

Safe to merge — the change is a single, minimal, platform-guarded block that is a no-op on Linux/macOS and well-handled via try/except on Windows.

The only changed code is a 10-line sys.platform == "win32" guard that reconfigures stdout/stderr to UTF-8, wrapped in try/except Exception: pass. No logic paths change, the fix is idempotent on non-Windows platforms, and failure is safe (falls back to pre-fix behaviour). All remaining findings are P2 or below.

No files require special attention.

Vulnerabilities

No security concerns identified.

Important Files Changed

| Filename | Overview |
|---|---|
| `skills/continuous-learning-v2/scripts/instinct-cli.py` | Adds a `sys.platform == "win32"` guarded block to reconfigure stdout/stderr to UTF-8, wrapped in `try/except Exception: pass` to silently no-op in piped/test environments. Minimal, targeted, and safe. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[instinct-cli.py starts] --> B{sys.platform == 'win32'?}
    B -- Yes --> C[try: reconfigure stdout + stderr to UTF-8]
    C --> D{reconfigure succeeds?}
    D -- Yes --> E[stdout/stderr: UTF-8 ✓]
    D -- No / AttributeError / IOError --> F[silent pass – continue with default codec]
    B -- No --> G[Linux/macOS: already UTF-8, no-op]
    E --> H[status / evolve / etc. print █░ bars without UnicodeEncodeError]
    F --> H
    G --> H

Reviews (2): Last reviewed commit: "fix(continuous-learning-v2): reconfigure..." | Re-trigger Greptile

Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
skills/continuous-learning-v2/scripts/instinct-cli.py (2)

204-208: ⚠️ Potential issue | 🟡 Minor

Inconsistent encoding handling in cmd_import for local files.

Line 208 reads local files using path.read_text() without specifying encoding, while the fix at line 107 explicitly uses encoding="utf-8". For consistency, local file imports should also use UTF-8.

Suggested fix
-        content = path.read_text()
+        content = path.read_text(encoding="utf-8")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/scripts/instinct-cli.py` around lines 204 -
208, The local-file import in cmd_import reads the file with path.read_text()
without an encoding; change it to read using UTF-8 (e.g., call
path.read_text(encoding="utf-8") or open with encoding="utf-8") so it matches
the explicit UTF-8 handling used elsewhere (see the earlier fix around line
107); update the reading of the variable content from path accordingly to ensure
consistent UTF-8 decoding.

177-178: ⚠️ Potential issue | 🟡 Minor

Missing UTF-8 encoding for observations file.

Line 178 uses open(OBSERVATIONS_FILE) without specifying encoding, which will use the system default on Windows (cp1250/cp1252) - the same issue being fixed elsewhere. This could fail on files with non-ASCII characters.

Suggested fix
     if OBSERVATIONS_FILE.exists():
-        obs_count = sum(1 for _ in open(OBSERVATIONS_FILE))
+        obs_count = sum(1 for _ in open(OBSERVATIONS_FILE, encoding="utf-8"))
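
A sketch of the context-manager variant, assuming the observations file is UTF-8 text with one record per line. `count_lines` and `demo_count` are hypothetical names used for illustration and do not exist in `instinct-cli.py`.

```python
import tempfile
from pathlib import Path


def count_lines(path: Path) -> int:
    """Count lines in a UTF-8 text file, closing the handle afterwards."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def demo_count() -> int:
    """Write two non-ASCII lines to a temporary file and count them."""
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp) / "observations.jsonl"
        sample.write_text('{"note": "žluťoučký"}\n{"bar": "█░"}\n', encoding="utf-8")
        return count_lines(sample)


if __name__ == "__main__":
    print(demo_count())
```

The explicit `encoding` avoids decoding with the locale code page, and the `with` block closes the file rather than leaving that to garbage collection.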
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/scripts/instinct-cli.py` around lines 177 -
178, The observation-counting code opens OBSERVATIONS_FILE without an explicit
encoding; change it to open the file with UTF-8 to avoid platform-dependent
decoding errors (use a context manager so the file is closed), e.g., replace the
open(OBSERVATIONS_FILE) usage in the block that sets obs_count with a with
open(OBSERVATIONS_FILE, encoding="utf-8") as f and count lines from f; reference
OBSERVATIONS_FILE and the obs_count assignment to locate the change.
🧹 Nitpick comments (2)
skills/continuous-learning-v2/scripts/instinct-cli.py (2)

299-299: Consider specifying UTF-8 encoding for write operations.

For consistency with the UTF-8 fixes applied to read operations, write_text() calls should also explicitly specify encoding="utf-8" to ensure correct behavior on Windows.

Suggested fix
-    output_file.write_text(output_content)
+    output_file.write_text(output_content, encoding="utf-8")

Similarly for line 350:

-        Path(args.output).write_text(output)
+        Path(args.output).write_text(output, encoding="utf-8")
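
To illustrate why the explicit argument matters, here is a small round trip, assuming the exported content may contain the same block characters as the status bar. The `roundtrip` helper and the `export.md` file name are hypothetical.

```python
import tempfile
from pathlib import Path


def roundtrip(text: str):
    """Write `text` as UTF-8, then return the raw bytes and the decoded text."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "export.md"
        # Without encoding=..., write_text() uses the locale's preferred
        # encoding, which is a legacy code page on many Windows setups.
        target.write_text(text, encoding="utf-8")
        return target.read_bytes(), target.read_text(encoding="utf-8")


if __name__ == "__main__":
    raw, decoded = roundtrip("confidence: █░")
    print(raw)
    print(decoded)
```

Pinning the encoding on both the write and the read side makes the file contents identical across platforms.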
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/scripts/instinct-cli.py` at line 299, The write
operations using Path.write_text currently omit an explicit encoding; update the
calls (e.g., the occurrence invoking output_file.write_text(output_content) and
the similar write_text call later) to pass encoding="utf-8" so files are written
with UTF-8 on all platforms; locate the write_text invocations in
instinct-cli.py and change them to include the encoding argument.

23-30: Consider logging the suppressed exception instead of silently passing.

The broad except Exception: pass makes debugging difficult if reconfigure fails unexpectedly. While the fallback behavior is intentional, logging at DEBUG level would aid troubleshooting without disrupting normal operation.

Suggested improvement
+import logging
+
+logger = logging.getLogger(__name__)
+
 # Force UTF-8 stdout on Windows so ASCII-art confidence bars (█░) and
 # non-ASCII characters in instinct content don't crash under cp1250/cp1252.
 if sys.platform == "win32":
     try:
         sys.stdout.reconfigure(encoding="utf-8")
         sys.stderr.reconfigure(encoding="utf-8")
-    except Exception:
-        pass
+    except Exception as e:
+        logger.debug("Could not reconfigure stdout/stderr to UTF-8: %s", e)

Note: Ruff flags this as S110 (try-except-pass) and BLE001 (blind exception catch).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/continuous-learning-v2/scripts/instinct-cli.py` around lines 23 - 30,
The try/except around sys.stdout.reconfigure and sys.stderr.reconfigure
currently swallows all exceptions; instead catch Exception as e and log the
failure at DEBUG so failures are visible but non-fatal. Import or use the
existing logger and replace the bare except in the block that checks
sys.platform == "win32" to log a debug message referencing
sys.stdout.reconfigure/sys.stderr.reconfigure and the caught exception (e)
before continuing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9144c961-c9cf-4c1f-a13f-d7216d3a4024

📥 Commits

Reviewing files that changed from the base of the PR and between 9d766af and 54bb6be.

📒 Files selected for processing (1)
  • skills/continuous-learning-v2/scripts/instinct-cli.py

Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 1 file

`instinct-cli.py status` prints a confidence bar using the box-drawing
characters █ and ░. On Windows the default console codec is cp1250 /
cp1252, and Python's stdout inherits it, so the print statement crashes
with:

    UnicodeEncodeError: 'charmap' codec can't encode characters in
    position 4-13: character maps to <undefined>

This affects any Windows user whose console is not manually set to
`chcp 65001`, even after affaan-m#353 (file I/O UTF-8) and affaan-m#216 (.md glob) —
those fixes cover reading instinct files, but the crash happens on the
*output* side when rendering the status table.

Fix: reconfigure `sys.stdout` / `sys.stderr` to UTF-8 at startup, guarded
by `sys.platform == "win32"`. The reconfigure call is wrapped in
`try/except` because a caller may pipe into something that doesn't
support reconfigure (e.g., a captured buffer in tests).

No-op on Linux/macOS.

Verified locally on Windows 11 / Python 3.14 / cp1250 console:
`instinct-cli.py status` now prints all 155 instincts from
`~/.claude/homunculus/instincts/` without crashing.
@georgeradcj georgeradcj force-pushed the fix/instinct-cli-md-utf8 branch from 54bb6be to 77ca23c Compare April 8, 2026 18:07
@georgeradcj georgeradcj changed the title fix(continuous-learning-v2): load .md instincts and handle UTF-8 on Windows fix(continuous-learning-v2): reconfigure stdout to UTF-8 on Windows Apr 8, 2026