Commit 47d0b78

gpshead and claude committed
Expand motivation with answers to common questions
Added two new sections to address questions raised at core team sprint:

1. "Why Exception Groups Need Timestamps" - Explains that while exception groups are conceptually unrelated, in practice they have important temporal relationships for debugging causality, performance analysis, and correlation with external observability tools.

2. "Why Not Use .add_note() When Catching?" - Details six key drawbacks of using add_note() instead: not all exceptions are caught, timing accuracy issues, inconsistent application, performance overhead, complexity burden, and loss of original timing information.

Key insight: When an exception occurs is intrinsic, immutable information that should be captured at the source, not added later by consumers.

🤖 Generated with Claude Code (https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 5a25936 commit 47d0b78

1 file changed

Lines changed: 65 additions & 0 deletions

peps/pep-08XX.rst

@@ -138,6 +138,71 @@ Without timestamps, we would only know that four services failed, but not
their temporal relationship, making root cause analysis significantly harder.


Why Exception Groups Need Timestamps
-------------------------------------

While an exception group conceptually bundles "unrelated" exceptions that happen
to be raised together, in practice those exceptions often have important
temporal relationships:

1. **Causality isn't always explicit**: When multiple services fail in sequence,
   one failure might trigger cascading failures in seemingly unrelated services.
   Without timestamps, these cascade patterns are invisible. For example, a
   database connection pool exhaustion might cause multiple "unrelated" query
   failures across different services.

2. **Concurrent doesn't mean simultaneous**: Tasks in an exception group may
   start concurrently but fail at very different times. A service that fails
   after 100ms versus one that fails after 5 seconds tells a different story
   about what went wrong: the first might be a validation error, the second
   a timeout.

3. **Debugging distributed systems**: In microservice architectures, exception
   groups often collect failures from multiple remote services. Timestamps allow
   correlation with external observability tools (logs, metrics, traces) that
   are essential for understanding the full picture.

4. **Performance analysis**: Even for "unrelated" exceptions, knowing their
   temporal distribution helps identify performance bottlenecks and timeout
   configurations that need adjustment.

Why Not Use ``.add_note()`` When Catching?
--------------------------------------------

A common question is why we don't simply use :pep:`678`'s ``.add_note()`` to add
timestamps when exceptions are caught and grouped. This approach has several
significant drawbacks:

1. **Not all exceptions are caught**: Exceptions that propagate to the top level
   or are logged directly never get the opportunity to have notes added. The
   timestamp of when an error occurred is lost forever.

2. **Timing accuracy**: Adding a note when catching introduces variable delay.
   The timestamp would reflect when the exception was caught and processed, not
   when it actually occurred. In async code with complex exception handling,
   this delay can be significant and misleading.

3. **Inconsistent application**: Relying on exception handlers to add timestamps
   means some exceptions get timestamps and others don't, depending on code
   paths. This inconsistency makes debugging harder, not easier.

4. **Performance overhead**: Creating note strings for every caught exception
   adds overhead even when timestamps aren't being displayed. With the proposed
   approach, formatting only happens when tracebacks are rendered.

5. **Complexity burden**: Every exception handler that wants timing information
   would need to remember to add notes. This is error-prone and adds boilerplate
   to exception handling code.

6. **Lost original timing**: By the time an exception is caught, the original
   failure moment is lost. In retry loops or complex error handling, the catch
   point might be seconds or minutes after the actual error.

The key insight is that **when** an exception is created is intrinsic, immutable
information about that exception, just like its type and message. This information
should be captured at the source, not added later by consumers.

Rationale
=========
