fix: add logging to authentication errors in controller and router#812
fix: add logging to authentication errors in controller and router#812mangelajo wants to merge 6 commits into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds structured error logging on authentication failures across ChangesAuthentication Error Logging and Status Fix
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
controller/internal/service/controller_service.go (1)
527-528: ⚡ Quick winCentralize auth-failure logging in the auth helper methods.
The same auth-failure log block is repeated in multiple RPC handlers. Moving this into
authenticateClient/authenticateExporterwill keep behavior consistent and reduce drift when new handlers are added.Refactor sketch
func (s *ControllerService) authenticateClient(ctx context.Context) (*jumpstarterdevv1alpha1.Client, error) { - return oidc.VerifyClientObjectToken( + logger := log.FromContext(ctx) + jclient, err := oidc.VerifyClientObjectToken( ctx, s.Authn, s.Authz, s.Attr, s.Client, ) + if err != nil { + logger.Info("unable to authenticate client", "error", err.Error()) + return nil, err + } + return jclient, nil }- logger := log.FromContext(ctx) - client, err := s.authenticateClient(ctx) - if err != nil { - logger.Info("unable to authenticate client", "error", err.Error()) - return nil, err - } + client, err := s.authenticateClient(ctx) + if err != nil { + return nil, err + }Also applies to: 902-907, 984-989, 1041-1046, 1075-1080
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controller/internal/service/controller_service.go` around lines 527 - 528, The auth-failure logging at lines 527-528 is duplicated across multiple RPC handlers (also appearing at 902-907, 984-989, 1041-1046, 1075-1080). Move the logging logic from the error-handling blocks in each RPC handler into the authenticateClient and authenticateExporter helper methods themselves. This way, when these helper methods return an error, they will already include the appropriate log statement, eliminating the need to repeat the logger.Info call with "unable to authenticate" messages in each caller. Update all RPC handlers to remove their redundant logging blocks and rely on the centralized logging within the auth helper methods.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@controller/internal/service/controller_service.go`:
- Around line 527-528: The auth-failure logging at lines 527-528 is duplicated
across multiple RPC handlers (also appearing at 902-907, 984-989, 1041-1046,
1075-1080). Move the logging logic from the error-handling blocks in each RPC
handler into the authenticateClient and authenticateExporter helper methods
themselves. This way, when these helper methods return an error, they will
already include the appropriate log statement, eliminating the need to repeat
the logger.Info call with "unable to authenticate" messages in each caller.
Update all RPC handlers to remove their redundant logging blocks and rely on the
centralized logging within the auth helper methods.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f8053213-1afd-4b9c-872e-dac95c895fe8
📒 Files selected for processing (4)
controller/internal/service/auth/auth.gocontroller/internal/service/controller_service.gocontroller/internal/service/router_service.gopython/packages/jumpstarter/jumpstarter/exporter/auth.py
|
overall:
|
06545bc to
91e0171
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@controller/internal/service/controller_service.go`:
- Around line 248-249: Remove the duplicate logger.Error calls that log
authentication failures at the controller service layer, as these failures are
already being logged by the authenticateClient and authenticateExporter helper
functions. For each occurrence where you have a logger.Error call with messages
like "exporter authentication failed", "client authentication failed", etc.,
followed by a return statement with the error, simply delete the logger.Error
line and keep only the return statement. This applies to the authentication
failure handling in all RPC methods that call these authentication helpers
(authenticateClient and authenticateExporter), eliminating duplicate noisy log
entries while maintaining error propagation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d89e5e1b-68c0-4c8e-8d61-5c469f9aa35c
📒 Files selected for processing (4)
controller/internal/service/auth/auth.gocontroller/internal/service/controller_service.gocontroller/internal/service/router_service.gopython/packages/jumpstarter/jumpstarter/exporter/auth.py
✅ Files skipped from review due to trivial changes (1)
- python/packages/jumpstarter/jumpstarter/exporter/auth.py
🚧 Files skipped from review as they are similar to previous changes (2)
- controller/internal/service/router_service.go
- controller/internal/service/auth/auth.go
91e0171 to
ef847c3
Compare
ef847c3 to
7a721f7
Compare
Log authentication failures across the controller service (Listen, GetLease, RequestLease, ReleaseLease, ListLeases, Register, Unregister, Status, Dial), the Auth helpers (AuthClient, AuthExporter) which cover all ClientService call sites, the router service, and the Python exporter's PassphraseInterceptor. The peer IP address is extracted from the gRPC context and included in every auth log entry. For ClientService paths the peer is attached via peerAddr(); for ControllerService unary handlers logContext() is called at the start of each method so the peer propagates into all subsequent log entries including the auth failure. Log level: Info (not Error) is used intentionally, following the convention established by kube-apiserver. Authentication rejections are expected adversarial events, not controller bugs. Using Error would conflate "something is broken in the controller" with "a caller sent a bad token", muddying alerts. The controller-runtime/logr interface has no native Warn level (the backend is zapr, which maps logr.Info -> zap INFO and logr.Error -> zap ERROR with no WARN in between), so Info at the default V=0 verbosity is the correct choice for security-relevant but operationally normal events. Also fix the router JWT validation gRPC status code from codes.InvalidArgument to codes.Unauthenticated, which is the semantically correct code for a failed authentication. Fixes #811
7a721f7 to
d2393c4
Compare
- Remove duplicate auth-failure log lines from controller_service.go; auth helpers (AuthClient/AuthExporter) already log at the single ownership point, so per-RPC re-logging was noisy and redundant. - Remove redundant ctx = logContext(ctx) calls from GetLease, RequestLease, ReleaseLease, ListLeases, and Register; the gRPC unary interceptor already enriches every unary context, and the stream interceptor (wrappedStream) does the same for streaming RPCs. - Unify peer format: logContext now strips the port via net.SplitHostPort, matching the format used by peerAddr in auth.go, so the "peer" key is consistently IP-only across all log sites. - router_service.go Stream: call logContext before authenticate so the peer IP is available in the router auth-failure log entry. - auth.go peerAddr: return "unknown" on SplitHostPort failure (e.g. Unix domain sockets) to avoid leaking internal transport paths.
- Export PeerAddr from auth package to eliminate duplicate peer-address extraction logic in logContext (controller_service.go now delegates to auth.PeerAddr instead of duplicating the net.SplitHostPort logic) - Remove the stale 'unable to authenticate exporter' log in Status that was inconsistent with all other methods; auth logging already happens inside AuthExporter via the auth package - Add TestRouterAuthenticateNoTokenLeak to verify that a bogus JWT token value is never emitted in log output, preventing credential leakage
|
Addressed all remaining review comments in the latest commit ( Deduplication of peer address logic (
Standardised log messages (
Router
Duplicate controller-level auth logs (
Test for JWT token not leaking in logs (
Unix socket comment / network-only clarification (
|
|
@raballew adding tests to verify all this logging behavior properly. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@controller/internal/service/controller_service_test.go`:
- Around line 1804-1812: The test calls authenticate(ctx) directly which does
not perform logging, but the actual logging that could leak the token happens in
RouterService.Stream, so this test does not exercise the real code path that
could cause credential leaks. Refactor the test to route through the Stream
method (or explicitly call the same log statement used in Stream) instead of
calling authenticate directly, so the test properly validates that the actual
logging path sanitizes the token and prevents it from appearing in log output.
- Around line 1792-1795: The test modifies global logger state by calling
logf.SetLogger() but does not restore the original logger after the test
completes, causing cross-test contamination. Before calling logf.SetLogger() to
set the new logger with the buffer, save the current logger by calling
logf.Log.GetLogger() or equivalent to capture the original logger instance. Then
use a defer statement immediately after logf.SetLogger() to restore the original
logger when the test function exits, ensuring subsequent tests are not affected
by the modified global logger state.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 9204b4b1-84bb-4aaf-8823-912f22c36049
📒 Files selected for processing (3)
controller/internal/service/auth/auth.gocontroller/internal/service/controller_service.gocontroller/internal/service/controller_service_test.go
Go (controller/internal/service/auth/auth_test.go): - TestPeerAddr: no peer, host:port, IPv6, bare address, empty, nil Addr, Unix socket — all return expected values without panics - TestAuthClient_TokenVerificationFailure_LogsPeerAndError: verifies log contains 'client authentication failed', peer IP, and error text - TestAuthClient_NamespaceMismatch_LogsClientNameAndPeer: verifies log includes client name, peer, and PermissionDenied - TestAuthClient_Success_NoAuthFailureLog: no auth-failure log on success - TestAuthExporter_*: symmetric tests for exporter auth path - TestAuthClient_ErrorLogIncludesNoDuplicateTokenInMessage: no raw Bearer prefix in logs Go (controller/internal/service/controller_service_test.go): - TestRouterAuthenticateNoTokenLeak: now exercises the actual Stream logging path (logContext + logger.Info) with peer context, not just authenticate() alone - TestRouterAuthenticateReturnsUnauthenticated: codes.Unauthenticated - TestRouterAuthenticateMissingMetadata: error on missing metadata - TestLogContext_WithPeer_EnrichesContextWithPeerIP: peer IP in log - TestLogContext_WithoutPeer_ReturnsUnchangedContext: no peer key - TestLogContext_IPv6Peer_StripsPort: IPv6 support - TestLogContext_NilAddr_ReturnsUnchangedContext: no panic on nil - TestLogContext_UnixSocket_ReturnsUnknownPeer: path not leaked - withCapturedLog helper uses context-based logger (not global state) to prevent cross-test contamination Python (session_test.py): - test_serve_tcp_passphrase_rejected_logs_warning: verifies WARNING log with method name on wrong passphrase - test_serve_tcp_passphrase_missing_logs_warning: same for missing - test_serve_tcp_passphrase_correct_no_warning_log: no warning on success Bugfix: PeerAddr now handles nil p.Addr without panicking.
|
Added comprehensive logging behavior tests in commit Go —
|
| Test | What it verifies |
|---|---|
TestPeerAddr (5 subtests) |
No peer → "unknown", host:port → IP only, IPv6 → IP only, bare address → "unknown", empty → "unknown" |
TestPeerAddr_NilAddr |
No panic when p.Addr is nil (bugfix: added nil guard) |
TestPeerAddr_UnixSocket |
Unix socket path → "unknown" (not leaked) |
TestAuthClient_TokenVerificationFailure_LogsPeerAndError |
Log contains "client authentication failed", peer IP, and error text |
TestAuthClient_NamespaceMismatch_LogsClientNameAndPeer |
Log includes client name, peer IP, "namespace mismatch", returns PermissionDenied |
TestAuthClient_Success_NoAuthFailureLog |
No "authentication failed" log on successful auth |
TestAuthExporter_* (3 tests) |
Symmetric tests for exporter auth path |
TestAuthClient_ErrorLogIncludesNoDuplicateTokenInMessage |
No raw "Bearer" prefix in log output |
Go — controller/internal/service/controller_service_test.go (8 new tests + fixes)
| Test | What it verifies |
|---|---|
TestRouterAuthenticateNoTokenLeak |
Reworked: now exercises the full Stream logging path (logContext + logger.Info("router authentication failed")) with peer context — not just authenticate() alone |
TestRouterAuthenticateReturnsUnauthenticated |
Invalid JWT → codes.Unauthenticated |
TestRouterAuthenticateMissingMetadata |
Missing metadata → error |
TestLogContext_WithPeer_EnrichesContextWithPeerIP |
Peer IP appears in log, port is stripped |
TestLogContext_WithoutPeer_ReturnsUnchangedContext |
No "peer" key in log |
TestLogContext_IPv6Peer_StripsPort |
IPv6 address correctly extracted |
TestLogContext_NilAddr_ReturnsUnchangedContext |
No panic, no "peer" key |
TestLogContext_UnixSocket_ReturnsUnknownPeer |
Peer is "unknown", socket path not leaked |
Test isolation fix: withCapturedLog helper now uses a context-based logger instead of mutating global logf.Log, preventing cross-test contamination.
Python — session_test.py (3 new tests)
| Test | What it verifies |
|---|---|
test_serve_tcp_passphrase_rejected_logs_warning |
Wrong passphrase → WARNING log with "GetReport" method name |
test_serve_tcp_passphrase_missing_logs_warning |
Missing passphrase → WARNING log with method name |
test_serve_tcp_passphrase_correct_no_warning_log |
Correct passphrase → no auth failure warning |
Bugfix discovered by tests
PeerAddr now guards against p.Addr == nil (previously would panic with nil pointer dereference).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@controller/internal/service/controller_service_test.go`:
- Around line 1993-2009: The
TestControllerServiceAuthMethodsDoNotLogAuthFailures function currently only
documents the intent with a t.Log statement but performs no actual assertions,
so it always passes and won't catch regressions. Refactor this test to actively
verify the invariant by scanning the source code of the listed RPC methods
(Register, Unregister, ReportStatus, Listen, Status, Dial, GetLease,
RequestLease, ReleaseLease, ListLeases) and asserting that they do not contain
duplicate auth-failure log statements. Use Go's AST parsing or source code
scanning to inspect the method bodies and ensure that any authentication-related
logging is not duplicated beyond what the auth package handles.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 989844bd-acfb-453f-aadc-ee01aa788a57
📒 Files selected for processing (4)
controller/internal/service/auth/auth.gocontroller/internal/service/auth/auth_test.gocontroller/internal/service/controller_service_test.gopython/packages/jumpstarter/jumpstarter/exporter/session_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- controller/internal/service/auth/auth.go
- Extract peerAddrUnknown constant in auth_test.go (goconst) - Break long line in session_test.py (ruff E501: 121 > 120)
…g test - Fix typos CI: 'unparseable' -> 'unparsable' in auth_test.go comment - Make TestControllerServiceAuthMethodsDoNotLogAuthFailures actually scan controller_service.go source for forbidden auth-failure strings instead of being a documentation-only no-op
Fixes #811
Summary
Authentication failures were silently dropped in several places — no log output was produced, making it very hard to diagnose connection/auth issues.
Controller (Go)
controller/internal/service/controller_service.goAdded
logger.Info("unable to authenticate ...")before returning auth errors in 5 methods that were previously silent:Listen(exporter auth)GetLease(client auth)RequestLease(client auth)ReleaseLease(client auth)ListLeases(client auth)controller/internal/service/auth/auth.goAdded logging directly into the
AuthClientandAuthExporterhelpers. This covers allClientServicecall sites at once (8 total). Both token verification failures and namespace mismatch errors are now logged.controller/internal/service/router_service.goFixed the gRPC status code for invalid JWT tokens:
codes.InvalidArgument→codes.Unauthenticated(semantically correct for a failed authentication).Python Exporter
python/packages/jumpstarter/jumpstarter/exporter/auth.pyAdded
logger.warning(...)inPassphraseInterceptor.intercept_servicewhen an RPC arrives with an invalid or missing passphrase. Theloggerwas already defined at module level but never used.