feat(library-config): otel process context reader#2176
Conversation
Add `read()`, `threadlocal_attribute_key_map()`, and `read_threadlocal_attribute_key_map()` to `otel_process_ctx::linux`, along with the `find_otel_mapping()` / `is_named_otel_mapping()` / `parse_mapping_start()`.
2887d0b to
cfe279d
Compare
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
📚 Documentation Check Results📦
|
🔒 Cargo Deny Results📦
|
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2887d0b569
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let payload = unsafe { std::slice::from_raw_parts(payload_ptr, payload_size as usize) }; | ||
| let context = ProcessContext::decode(payload)?; |
There was a problem hiding this comment.
Synchronize before decoding the publisher's payload
When read() runs concurrently with another thread calling publish()/update(), the timestamp check does not keep the old payload allocation alive: after the first published_at load, ProcessContextHandle::update can assign self.payload = payload, dropping the Vec that payload_ptr still references, and this code then builds a slice and lets prost read from a dangling pointer. The final timestamp comparison happens only after the unsafe read, so it cannot prevent a crash/UB; the same-process reader needs to coordinate with the publisher or otherwise copy from memory that cannot be freed during decode.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
yeah... This file seems to want to update some sort of seqlock, but there are several problems with it, one of these is this reclamation problem; a seqlock can protects only static region. Also a seqlock depends on a counter being odd to detect in-progress changes.
There was a problem hiding this comment.
Likely the solution involves in-process readers using the write lock.
There was a problem hiding this comment.
lso a seqlock depends on a counter being odd to detect in-progress changes.
I think the odd thing is because of multiple writers. If you have one writer only, and multiple reader, a monotonic value should be enough to detect torn reads (which I believe is the model for the process context).
one of these is this reclamation problem; a seqlock can protects only static region
I think I don't understand what you mean by that?
There was a problem hiding this comment.
I think the odd thing is because of multiple writers.
It's not, it's to detect that a modification is progress, so the read may be give torn/inconsistent data. Assume there is no reordering:
- update data
- ++seq
and the reader does
- read seq
- read data
- re-read seq to confirm it's the same
then the reader may read partially updated data and see the same seq.
But anyway, I was wrong about this, because the writer uses 0 to signal the update in progress.
one of these is this reclamation problem; a seqlock can protects only static region
What I mean is that the reader may fetch a pointer to the payload, and by the time it reads it, it's been freed already. This is a problem if the memory region has been unmapped (segfault).
There was a problem hiding this comment.
So I think the wrong publishing protocol was chosen for this. What the seqlock is protecting is a single pointer, so you could just update the pointer atomically with release-acquire (more or less, it also has a size, but that's not an obstacle because amd64/arm64 support 16-byte atomics, or the size could be moved to the beginning of the pointed to region). This doesn't eliminate the reclamation problem -- you'd still need something else to prevent in progress readers from segfaulting, but it does show that the seqlock buys nothing here.
There was a problem hiding this comment.
Ah, I see, thanks for the clarification for the odd counter part. I didn't remember that correctly.
I guess in the process context case the information is just split across two different fields.
What I mean is that the reader may fetch a pointer to the payload, and by the time it reads it, it's been freed already. This is a problem if the memory region has been unmapped (segfault).
@ivoanjo I don't remember all the details from the spec, but I think the model here is to map a page at init time and then never unmap it (beyond in-place update)? Of course you can't enforce that from the reader side, you have to rely on the writer to be compliant.
But it seems that process_vm_readv() returns an error, and doesn't cause SEGFAULT, for invalid/unmapped addresses (not clear from the doc, but so says my clanker). So I won't be worried about unmapping the context page.
However, if the payload is being modified, I think you're right. I don't see how you can avoid:
- reader see a consistent context and read it locally
- writer starts and completes an update by re-allocating the payload, causing the previous one to be dropped if heap-allocated, or overwritten if stored in the mapping.
- reader starts to dereference the payload ptr
This case might be even worse because you won't get EINVAL, since the memory will most likely still be mapped (even in the heap), just potential gibberish 🤔
There was a problem hiding this comment.
I guess one fix would be to restrict the reader to always use the same mapping for the payload, and include in the bracket as part of the "writing". The pointer would effectively be mostly useless, and the payload would be a static part (beside the length) of the whole context.
There was a problem hiding this comment.
I think the model here is to map a page at init time and then never unmap it (beyond in-place update)?
This is the case for the header. The problem is the payload.
But it seems that process_vm_readv() returns an error, and doesn't cause SEGFAULT, for invalid/unmapped addresses
Yes, out-of-process reads don't have this problem. A more difficult question, with process_vm_readv doing plain memcpy page-by-page and having the payload and header in different pages is whether they participate correctly in the protocol, which something that needs to be seen architecture-by-architecture, but probably it's not a big obstacle.
However, if the payload is being modified, I think you're right. I don't see how you can avoid:
The way this is usually avoided for in-process readers is to use specific reclamation strategies like RCU, hazard pointers, refcounting, etc... or use read-write locks.
I think that reading gibberish, even in-process, is not that bad, because it can be detected by the seqlock protocol (assuming you follow the pointer and read all your data before re-checking the sequence number). It's just the if the memory has been unmapped that it becomes a problem.
There was a problem hiding this comment.
We've already discussed in the channel, but looping back on this -- my suggestion is to either have an out-of-band mechanism for in-process reads synchronization OR use a process_vm_readv or equivalent reader.
| name.starts_with("/memfd:OTEL_CTX") | ||
| || name.starts_with("[anon_shmem:OTEL_CTX]") | ||
| || name.starts_with("[anon:OTEL_CTX]") |
There was a problem hiding this comment.
Match only the exact OTEL_CTX mapping name
Because these checks use starts_with, a process that also has a mapping such as /memfd:OTEL_CTX_BACKUP or [anon:OTEL_CTX_old] can be selected before the real /memfd:OTEL_CTX entry, since /proc/self/maps is ordered by address rather than by mapping name. In that case read() will return an invalid-signature error (or read the wrong context) even though a valid OTEL context mapping exists later in the maps file; the name match should be exact, with only the kernel's separate (deleted) token handled as metadata.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Why would there be mappings with names like "/memfd:OTEL_CTX_BACKUP" or "[anon:OTEL_CTX_old]"? Yes, sure, it's technically true but... what are we supposed to do about this?
There was a problem hiding this comment.
Wasn't there a suggestion on adding several of these mappings for PHP to account for it processing several service/env? Ultimately we didn't do it, but I think that's indeed the idea of the spec, that you could have several.
This implements literally what the spec says: https://github.com/open-telemetry/opentelemetry-specification/blob/14137d0022f98bb59392f82255a70f3d9c2a7d4c/oteps/profiles/4719-process-ctx.md?plain=1#L164
Locate mapping: Parse
/proc/<pid>/mapsand search for entries with name starting with[anon_shmem:OTEL_CTX],[anon:OTEL_CTX]or/memfd:OTEL_CTX
though I think it should say [anon_shmem:OTEL_CTX and [anon:OTEL_CTX. Or maybe it just wanted to allow the (deleted) afterwards
There was a problem hiding this comment.
Yes, it's for the (deleted) stuff. It's not for multiple mappings to find.
There was a problem hiding this comment.
What @bwoebi said -- the process context is expected to be a singleton for a process, and anything that's not a singleton for a process should not go in there.
E.g. if a process is 1:1 to a service, then service name goes in there (for instance, this is the plan for dd-trace-rb); if a process is somehow multi-tenant and supports multiple services (e.g. dd-trace-php), then it shouldn't go in there.
It's not supported by the spec publishing several OTEL_CTX memory locations -- readers are only looking for one so they'll either error or ignore the others.
|
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
…:Result -> io::Result
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| fence(Ordering::SeqCst); | ||
| AtomicU64::from_ptr(addr_of_mut!((*header).monotonic_published_at_ns)) | ||
| (*header) | ||
| .monotonic_published_at_ns | ||
| .store(published_at_ns, Ordering::Relaxed); |
There was a problem hiding this comment.
I think the fence is unnecessary if I change the store to use Ordering::SeqCst, yeah? In the previous state, it did a SeqCst fence with a Relaxed store, I didn't change it, I just changed to use the AtomicU64 natively instead of through a ptr.
There was a problem hiding this comment.
I've read a bunch of material which helped me understand some other cases better but... not precisely this one. If anyone can educate me here, I'd appreciate understanding how they'd be different and which one is desired!
There was a problem hiding this comment.
Yes, and seqcst is even too strong, it only needs release. If you're interested, you can read this paper on synchronization for the seqlock algorithm.
But there is a problem that makes reasoning purely about the C++ memory model here insufficient, and that is that we need to support out-of-process readers with process_vm_readv. This requires understanding both a) what happens architecturally on the writer and b) how process_vm_readv (which I think does just memcpy page by page) and access_remote_vm work exactly.
In any case, we only need to support amd64 and arm64, so just a Release store should be enough.
There was a problem hiding this comment.
My suggestion is -- let's go with the strongest barrier we have. Why? Because the process context is not performance-sensitive, and the cost of getting this detail wrong is high.
So even though we need barriers here for correctness, I don't think we should be trying to think about them in terms of performance. If we had a lock that would use a much stronger atomic and nobody would be mad if we grab a log once an hour so that's the mental model to go for here ;)
-- my 0.02c :D
There was a problem hiding this comment.
I think the fence is unnecessary if I change the store to use Ordering::SeqCst, yeah? In the previous state, it did a SeqCst fence with a Relaxed store, I didn't change it, I just changed to use the AtomicU64 natively instead of through a ptr.
Yes, I believe SeqCst store is more constraining than a SeqCst fence + a relaxed store. The difference is rather technical, but it's that in the latter case, the store isn't part of the total sequential consistent order. For all other orderings, a fence with ordering O followed (resp. preceded by) a relaxed store (resp. a relaxed load) is strictly equivalent to a store (resp. a load) with ordering O. The main usage for fences is that you can only use one for several atomic operations at once, or that you can use them conditionally. Here there's no strong motivation other than following closely the spec.
Yes, and seqcst is even too strong, it only needs release. If you're interested, you can read this paper on synchronization for the seqlock algorithm.
Yup. We used SeqCst mostly because:
- the process context publication isn't performance-critical (happens once or not very often)
- since the whole seqlock-like thing is fishy in Rust, we upgraded to the strongest ordering.
But there is a problem that makes reasoning purely about the C++ memory model here insufficient, and that is that we need to support out-of-process readers with process_vm_readv. This requires understanding both a) what happens architecturally on the writer and b) how process_vm_readv (which I think does just memcpy page by page) and access_remote_vm work exactly.
Yes, I personally have no idea what is the memory semantics of process_vm_readv, and in any case it can't be considered by the C++ memory model, which doesn't model OS-level operations like these. So we're a bit in best-effort places.
There was a problem hiding this comment.
Though, process_vm_readv can't have stronger requirements than direct memory accesses, I believe. So if we're direct-memory-access-safe, I don't see how we wouldn't be syscall-read-memory-safe.
There was a problem hiding this comment.
Yeah I think the way to reason about this is about constraining what the compiler does in such a way that it can't be anything else than what we want (even though the spec may not technically promise it...)
19c2470 to
d363e03
Compare
In this case, we could just make the pointer and the size atomic and access them with relaxed memory order. This would remove the technicality of the race being UB. But even without it, I don't think it's much of a problem. I'm pretty sure the exact same code would be generated. |
|
My original question still stands (as it's not written in the PR description - to be clear I haven't look at the PR code yet in details): what is the motivation for having a reader in libdatadog? |
I could be implemented somewhere else. I guess the same reason the writer is implemented in libdatadog -- to do it uniformly across tracers. And with the writer already implemented in it, and the reader being sort of implemented already for the tests, everything stays more cohesive. That said, I have no great objection to this being moved to dd-trace-php |
|
More precisely: You need a reader in (PHP) profiling to support using the profiler with another otel compatible tracer (other than our tracer) running in the same process and being able to access that tracers context data. libdatadog is the right place for this - it's not necessarily a pure PHP concern. |
|
In practice we're allocating a whole OS page to put this header in there, and the header is like 32 bytes. That leaves roughly 4,064 bytes (because we're talking about Linux here, and nearly everything is running 4096 pages or more on Linux) to put a serialized payload into. Why use a separate allocation for the payload? Nobody should use more than ~4k bytes for this purpose, methinks? And this utilizes otherwise wasted space. If the spec is flexible at this stage, instead of storing a ptr you can store a length for the usable portion of the page, which can vary e.g. 16 KiB on macOS if we ever port there. I think someone may have hinted at doing this in the discussion already, and the spec hints at an optimization, but does anyone (@ivoanjo?) have a problem mandating that? Because otherwise you can't really do a safe read, if the ptr has a separate lifetime. I suppose it could point to some other safe place like a static section of memory, but that doesn't really seem too useful compared to just using the memory page. Torn reads can be handled, but a separate lifetime complicates making that torn read safe for both threads and out-of-process readers. |
|
@morrisonlevi Regarding the allocation living elsewhere and not in the header, it's a pure matter of doing the simplest thing for the first implementation. The plan has indeed been to put it in the header as a next step. No other technical reason than TODO. I don't know why this is not mandated in the spec, or why you'd want a separate allocation. I'll leave that part to Ivo 🙂 For the correctness part, we've casually discussed with @scottgerring and @nsavoire yesterday. My current understanding is that having a separate allocation is OK because:
But most importantly, in any of those cases, you don't get a crash. From there, y you'll also be able to know because you check the |
|
@yannham so the plan is to use |
I think the spec was mainly written with an out-of-process, privileged reader in mind, typically an eBPF profiler. If you really need to read from the same process, I would be inclined to go with |
I don't think that would help. Yes, you could tell whether the memory was mapped at the point of the |
I would expect that if you write to a file pipe, the memory is copied somewhere, or double mapped, or whatever. In any case, however the kernel handles this, the important point is that neither |
In the memfd path, the header becomes immediately discoverable. Therefore, we cannot do an non-atomic write to monotonic_published_at_ns with value 0, as this could race with a reader that did an acquire read on monotonic_published_at_ns. This write is also unnecessary because the mapping comes zero-initialized, and we'd be writing a zero (so it would also _likely_ be harmless anyway, even if violating the memory model). Note that there can be no race in signature/version. These are only read if the published_at is nonzero, and the release-acquire relationship guarantees that their final non-zero values are visible if published_at is nonzero.
|
I've pushed a commit that uses |
| //! context](https://github.com/open-telemetry/opentelemetry-specification/pull/4719) | ||
| //! specification. |
There was a problem hiding this comment.
| //! context](https://github.com/open-telemetry/opentelemetry-specification/pull/4719) | |
| //! specification. | |
| //! context specification](https://github.com/open-telemetry/opentelemetry-specification/pull/4719). |
| //! seq1 = seq.load(m_o_relaxed); | ||
| //! } while (seq0 & 1 || seq0 != seq1); | ||
| //! ... | ||
| //! } |
There was a problem hiding this comment.
I think this is a bit too low-level for a module-level documentation. I would remove the C-like code from here, at best give a short description, or just mention seqlock.
There was a problem hiding this comment.
To me, the most useful thing about documentation like this is the intuition about why the algorithm is correct / what invariants it upholds. The implementation (even in pseudocode) should fall out of the invariants, not the other way around.
| //! instead of even numbers. | ||
| //! We also forbid concurrent writers, and leave the reader retries to the | ||
| //! discretion of the caller. | ||
| //! We ignore the corner case where time returns 0. |
There was a problem hiding this comment.
I would keep the part about reading from syscall here, especially given the discussion on this PR around reading from the same process.
| /// Initial publication of the process context. Creates an appropriate memory mapping. | ||
| fn publish(payload: Vec<u8>) -> anyhow::Result<Self> { | ||
| fn publish(payload: Vec<u8>) -> io::Result<Self> { | ||
| let payload_size = payload |
There was a problem hiding this comment.
Nit: add a type annotation on payload_size to know when we read this line what conversion we're doing, and in which case it would fail.
| fence(Ordering::SeqCst); | ||
| AtomicU64::from_ptr(addr_of_mut!((*header).monotonic_published_at_ns)) | ||
| .store(published_at_ns, Ordering::Relaxed); | ||
| ptr::addr_of_mut!((*header).signature).write(*SIGNATURE); |
There was a problem hiding this comment.
Should we start by writing 0 to monotonic_published_at? Usually the memory is allocated zeroed, but I'm not sure we can rely on that in every environment possible. I don't really remember all the ins and outs of the naming part but we already use MAPPING_NAME in MemMapping::new(), so I wonder if in some cases the context could already be discoverable at this point.
There was a problem hiding this comment.
yes, on the mmap side, MAP_ANONYMOUS has this guarantee:
MAP_ANONYMOUS The mapping is not backed by any file; its contents are initialized to zero.
And on memfd_create + ftruncate, the ftruncate linux man page says:
If the file previously was larger than this size, the extra data is lost. If the file previously was shorter, it is extended, and the extended part reads as null bytes ('\0').
There was a problem hiding this comment.
Fair enough 👍 maybe adding a small comment saying that would be a nice plus for future code readers
|
|
||
| // The returned size is guaranteed to be larger or equal to the size of `MappingHeader`. | ||
| fn mapping_size() -> usize { | ||
| const fn mapping_size() -> usize { |
There was a problem hiding this comment.
A previous version was using a non-const implementation (using libc to get the page size typically). If not especially needed I would avoid making the const guarantee. I trust rustc to be smart enough to inline and constant-propagate the same way anyway, if it's for performance reason.
| /// Locates the OTEL_CTX mapping at construction. Call [`read`](Self::read) repeatedly to fetch | ||
| /// updated context data without re-parsing `/proc/self/maps`, as long as the process has not | ||
| /// forked. After a `fork()`, reads fail and a new reader must be constructed. | ||
| pub struct ProcessContextSelfReader { |
There was a problem hiding this comment.
Should the reader side go in a submodule? And even maybe the writer side as well? I feel like the use-case is different from the publisher, which is what SDK/tracers will ever need, and this source file starts to grow quite a bit.
|
|
||
| let published_at = header.monotonic_published_at_ns.load(Ordering::Acquire); | ||
| if published_at == 0 { | ||
| return Err(io::Error::new( |
There was a problem hiding this comment.
You are going to be the consumer of this API, but don't you want to be able to detect this error? This sounds like a non fatal one, as opposed to the fork issue, or other IO errors. One possibility would be to have a retry-loop version, or just return more structured errors.
| /// # Safety | ||
| /// | ||
| /// This function may only be called if it can be guaranteed that there are no in-process | ||
| /// readers, or at least that they will not be used after the call. |
There was a problem hiding this comment.
This should probably be mirrored in the FFI version's documentation.
| .store(0, Ordering::Relaxed); | ||
| } | ||
| fence(Ordering::Release); |
There was a problem hiding this comment.
Sounds equivalent and a bit simpler? Note that unfortunately I don't think this guarantees anything like the payload will be dropped after the store. Stuff can move up before a release operation (fence/store). The latter just guarantees that whatever happens-before the release operation, will happens-before a matching acquire store that will load 0. It makes no guarantee about what happens after the release operation. You'd need acquire, but that doesn't exist for stores. Maybe an acquire-swap would do.
| .store(0, Ordering::Relaxed); | |
| } | |
| fence(Ordering::Release); | |
| .store(0, Ordering::Release); | |
| } |
There was a problem hiding this comment.
The change here is a bit of a corner case, but what we want here is that these two stores are not reordered, that we store 0 before doing the drop. The atomic release prevents reordering with memory operations that are sequenced before it, not after. So for an atomic release to work, it would have to be on the drop, which is not possible (the whole thing is also of undefined anyway, because the drop is not doing any sort of atomic operation)
There was a problem hiding this comment.
The atomic release prevents reordering with memory operations that are sequenced before it, not after.
Yes, and that is my point. Here the drop happens after the release store. So you don't get any guarantee regarding the respective order of the drop and the release store from a reader that acquire-load the value 0. As far as the C++20 memory model is concerned, the drop could look like it happens before the store from the reader POV. Release doesn't prevent that re-ordering.
There was a problem hiding this comment.
If the reader loads the value 0 it will not even attempt to load payload. The risk is that it doesn't read the value 0 and then attempt to read a dropped value. Note that this can still happen (it's a variation on the torn read case), it's just less likely with this change.
There was a problem hiding this comment.
Or are you saying that it's a "best-effort-although-technically-incorrect" approach, where you reason as if the drop was performing atomic operations somehow?
There was a problem hiding this comment.
So, yes, you're generally right that this doesn't provide any formal guarantee of correctness. For the synchronization to be complete, you'd need to read-acquire on the dropped value. This is more an attempt of having of the compiler generate a StoreStore barrier on the writer side so that it's less likely that another cpu sees value nonzero and then a dropped payload
There was a problem hiding this comment.
I see. If it's a safety net, what do you think about using a stronger SeqCst barrier? On ARM for example I think it's always compiled to a DMB ISH, while release fences might be compiled to something slightly weaker, such as DMB ISHLD; DMB ISHST. If it's a best-effort/out of the memory model thing and not performance sensitive... let's be as strong as we can ? 🤷
There was a problem hiding this comment.
Was going to say the same
You'd need acquire, but that doesn't exist for stores
But SeqCst does
danielsn
left a comment
There was a problem hiding this comment.
As a broader comment, this whole file seems large. I wonder if it would benefit from being split into modules
| //! seq1 = seq.load(m_o_relaxed); | ||
| //! } while (seq0 & 1 || seq0 != seq1); | ||
| //! ... | ||
| //! } |
There was a problem hiding this comment.
To me, the most useful thing about documentation like this is the intuition about why the algorithm is correct / what invariants it upholds. The implementation (even in pseudocode) should fall out of the invariants, not the other way around.
| //! instead of even numbers. | ||
| //! We also forbid concurrent writers, and leave the reader retries to the | ||
| //! discretion of the caller. | ||
| //! We ignore the corner case where time returns 0. |
| //! | ||
| //! Although we instead use 0 to signal the writer is progress and a timestamp | ||
| //! instead of even numbers. | ||
| //! We also forbid concurrent writers, and leave the reader retries to the |
| //! seq.store(seq0 + 2, m_o_release); | ||
| //! } | ||
| //! | ||
| //! Although we instead use 0 to signal the writer is progress and a timestamp |
There was a problem hiding this comment.
Instead of magic constants, can we use defined enums?
| //! } | ||
| //! | ||
| //! Although we instead use 0 to signal the writer is progress and a timestamp | ||
| //! instead of even numbers. |
There was a problem hiding this comment.
Why did the original algorithm constrain to even numbers? Do timestamps uphold the same invariant?
| .store(0, Ordering::Relaxed); | ||
| } | ||
| fence(Ordering::Release); |
There was a problem hiding this comment.
Was going to say the same
You'd need acquire, but that doesn't exist for stores
But SeqCst does
| /// | ||
| /// # Safety | ||
| /// | ||
| /// This function may only be called if it can be guaranteed that there are no in-process |
There was a problem hiding this comment.
Is there any way to check this, even if just on debug builds?
| .collect() | ||
| } | ||
|
|
||
| fn find_attr<'a>(attrs: &'a [KeyValue], key: &str) -> Option<&'a AnyValue> { |
There was a problem hiding this comment.
Are there few enough attrs that a linear scan is fine?
| fn is_named_otel_mapping(line: &str) -> bool { | ||
| let trimmed = line.trim_end(); | ||
|
|
||
| // The name of the mapping is the 6th column. The separator changes (both ' ' and '\t') |
There was a problem hiding this comment.
Is this format documented anywhere, or is this reverse engineering?
|
|
||
| // pairs with the first release fence on update() to ensure that, if we read data | ||
| // updated after the initial published time, we at least see the published | ||
| // time being set to 0 in the next load of the published time (or we could |
There was a problem hiding this comment.
Is it allowed to go back from a published time to 0?
monotonic_published_at_ns being zero is reserved to indicate the context is being mutated and is not yet ready for reading.
3379921 to
99e3aaa
Compare
yannham
left a comment
There was a problem hiding this comment.
I wouldn't feel comfortable reviewing a PR that touches to many moving parts of the process context. We somehow arrived to an agreement on the initial version of the reader and minor publish changes. Would that be possible to revert to that state, address the last review comment one way or another, and keep other changes as separate PRs with proper motivation?
ivoanjo
left a comment
There was a problem hiding this comment.
I'll +1 on splitting this PR. Also, if we're making the reader directly read from the output of the writer, we can move it to a separate file as at that point they're very independent.
Add
read(),threadlocal_attribute_key_map(), andread_threadlocal_attribute_key_map()tootel_process_ctx::linux, along with thefind_otel_mapping()/is_named_otel_mapping()/parse_mapping_start().What does this PR do?
A brief description of the change being made with this pull request.
Motivation
What inspired you to submit this pull request?
Additional Notes
Anything else we should know when reviewing?
How to test the change?
Describe here in detail how the change can be validated.
BREAKING CHANGE: unpublish becomes unsafe.