Skip to content

fix: hardening fixes from full-codebase review#25

Open
KristianP26 wants to merge 21 commits into
DavidHruby1:mainfrom
KristianP26:fix/codebase-review-fixes
Open

fix: hardening fixes from full-codebase review#25
KristianP26 wants to merge 21 commits into
DavidHruby1:mainfrom
KristianP26:fix/codebase-review-fixes

Conversation

@KristianP26

@KristianP26 KristianP26 commented Jun 12, 2026

Copy link
Copy Markdown
Contributor

Summary

A batch of correctness, security, and responsiveness fixes from a full review of the codebase. Each change is an isolated commit.

Security

  • Custom headers now stored via DPAPI instead of plaintext settings.ini (they routinely carry tokens such as X-Api-Key). Stale plaintext values from older versions are purged on the next save so they cannot shadow the encrypted ones.
  • Startup force-purges lingering plaintext secrets — new has_plaintext_secrets() detects secret keys an older version left in settings.ini and forces a save at startup (values are preserved, just re-persisted via DPAPI), so the purge no longer waits for the first manual settings save.
  • API key fields no longer auto-reveal on focus. They stay masked; a trailing eye toggle reveals the value only on demand.

Correctness

  • Settings dialog validates on every accept path — previously a direct accept() call bypassed both validation and the startup-registration sync.
  • Disabling Screamer mid-recording now discards the recording instead of transcribing it and typing the result into the focused window.
  • Disabling Screamer mid-processing now cancels the pipeline — the worker checks the cancel event before each step, so nothing gets typed after the user hits Disable.
  • Late calibration results are dropped when the settings dialog closesdone() disconnects the result slots before waiting on the calibration thread, so a slow calibration can no longer deliver into a closing dialog.
  • Frozen builds read .env next to the exe — apps launched from the Run key inherit an unpredictable cwd, so the file shipped next to the binary was silently missed.
  • Groq rewrite ceiling raised 1024 → 4096 tokens — dictations over ~2700 chars always hit finish_reason=length and silently lost their cleanup.

Resource & responsiveness

  • Worker threads are released after each dictation (finisheddeleteLater); previously one QThread object leaked per recording.
  • RMS calibration runs off the UI thread — the 2-second blocking recording froze the whole settings dialog.
  • Text injection is one batched SendInput call instead of two syscalls per UTF-16 unit: faster for long transcripts and atomic with respect to concurrent user keystrokes.
  • Startup no longer rewrites settings.ini/keys.enc when the .env import changed nothing (unless plaintext secrets need purging, see above).

Minor

  • Reveal-toggle eye icon follows the palette text color — visible on dark themes (was painted with the default black pen).
  • Documented partial-SendInput semantics: a failed batch means a prefix of the text was already typed; there is no rollback.
  • Documented why a primary NO_SPEECH result intentionally skips the STT fallback.
  • Direct config.stt_language access in rewrite.py (defensive getattr removed).
  • Calibration test is now deterministic (QSignalSpy + release event instead of a processEvents/sleep poll loop).
  • Implementation docs updated to match.

Test plan

  • python -m compileall src/ tests/ clean.
  • Full suite (python -m unittest discover -s tests, offscreen Qt, Windows): 105 tests pass, including new coverage for the cap scaling, UTF-16 splitting, discard-on-disable, cancel-on-disable-while-processing, startup-save skip, startup plaintext-secret purge, frozen .env path, header encryption + stale-ini purge, accept-path validation, password-field toggle, threaded calibration, and dropped late calibration results on dialog close.
  • Manual smoke on Windows: batched SendInput types correctly (python -m src.injector), python -m src.config DPAPI roundtrip OK.

The 1024-token ceiling silently dropped rewrites of dictations over
roughly 2700 characters (finish_reason=length discards the result).
Scale the guard up to 4096 so long dictations still get cleaned up.
Per-character injection issued two syscalls per UTF-16 unit, which is
slow for long transcripts and lets concurrent user keystrokes interleave
mid-text. Build one INPUT array and inject it atomically; also lift
_utf16_units to module level so the splitting logic is unit-testable.
Each dictation parented a new _WorkerThread to the tray app without ever
deleting it, leaking one QThread object per recording for the process
lifetime. Connect finished to deleteLater per the Qt ownership pattern.
Unchecking Enabled during a recording ran the full pipeline and typed
the transcription into the focused window. The user intent is to stop,
so cancel the recording and return to idle instead.
Every launch rewrote settings.ini and keys.enc even when the .env
import was a no-op. Compare against the loaded config and persist only
when the import actually changed something.
A frozen exe launched from the HKCU Run key inherits an unpredictable
working directory, so the cwd-based .env lookup silently missed the
file shipped next to the binary. Resolve it relative to the executable
when frozen; dev runs keep the cwd behaviour.
Custom headers routinely carry credentials (X-Api-Key, proxy tokens)
but were persisted plaintext in settings.ini while API keys went
through DPAPI. Treat all four header fields as secrets, and purge any
plaintext value an older version wrote so it cannot shadow the
encrypted one on load. Non-Windows dev runs lose header persistence,
matching the existing behaviour for API keys.
Only the OK button went through validation; a direct accept() call
(keyboard default, programmatic close) skipped both validation and the
startup-registration sync. Fold the checks into accept() itself.
Clicking into a key field exposed the secret immediately — an easy slip
during screen shares. Keep the field masked and add a checkable
trailing eye action that reveals the value only on demand.
Calibration recorded two seconds of audio synchronously on the Qt main
thread, freezing the whole dialog. Move the measurement into a small
QThread, disable the button while it runs, and have done() wait for an
in-flight run so the thread never outlives the dialog.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant