feat(library-config): otel process context reader by cataphract · Pull Request #2176 · DataDog/libdatadog

cataphract · 2026-06-29T15:41:13Z

Add read(), threadlocal_attribute_key_map(), and read_threadlocal_attribute_key_map() to otel_process_ctx::linux, along with the find_otel_mapping() / is_named_otel_mapping() / parse_mapping_start().

What does this PR do?

A brief description of the change being made with this pull request.

Motivation

What inspired you to submit this pull request?

Additional Notes

Anything else we should know when reviewing?

How to test the change?

Describe here in detail how the change can be validated.

BREAKING CHANGE: unpublish becomes unsafe.

Add `read()`, `threadlocal_attribute_key_map()`, and `read_threadlocal_attribute_key_map()` to `otel_process_ctx::linux`, along with the `find_otel_mapping()` / `is_named_otel_mapping()` / `parse_mapping_start()`.

github-actions · 2026-06-29T15:44:55Z

Clippy Allow Annotation Report

Comparing clippy allow annotations between branches:

Base Branch: origin/main
PR Branch: origin/levi/otel-thread-context

Summary by Rule

Rule	Base Branch	PR Branch	Change

Annotation Counts by File

File	Base Branch	PR Branch	Change

Annotation Stats by Crate

Crate	Base Branch	PR Branch	Change
`clippy-annotation-reporter`	5	5	No change (0%)
`datadog-ffe-ffi`	1	1	No change (0%)
`datadog-ipc`	22	22	No change (0%)
`datadog-live-debugger`	4	4	No change (0%)
`datadog-live-debugger-ffi`	10	10	No change (0%)
`datadog-profiling-replayer`	4	4	No change (0%)
`datadog-sidecar`	45	45	No change (0%)
`libdd-common`	13	13	No change (0%)
`libdd-common-ffi`	12	12	No change (0%)
`libdd-data-pipeline`	6	6	No change (0%)
`libdd-ddsketch`	2	2	No change (0%)
`libdd-dogstatsd-client`	1	1	No change (0%)
`libdd-profiling`	13	13	No change (0%)
`libdd-remote-config`	3	3	No change (0%)
`libdd-telemetry`	20	20	No change (0%)
`libdd-tinybytes`	4	4	No change (0%)
`libdd-trace-normalization`	2	2	No change (0%)
`libdd-trace-obfuscation`	3	3	No change (0%)
`libdd-trace-stats`	1	1	No change (0%)
`libdd-trace-utils`	11	11	No change (0%)
Total	182	182	No change (0%)

About This Report

This report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality.

github-actions · 2026-06-29T15:45:35Z

📚 Documentation Check Results

⚠️ 267 documentation warning(s) found

📦 `libdd-library-config` - 267 warning(s)

Updated: 2026-07-03 00:57:05 UTC | Commit: d3773a9 | missing-docs job results

github-actions · 2026-06-29T15:46:57Z

🔒 Cargo Deny Results

⚠️ 1 issue(s) found, showing only errors (advisories, bans, sources)

📦 `libdd-library-config` - 1 error(s)

Show output

error[unsound]: Rand is unsound with a custom logger using `rand::rng()`
   ┌─ /home/runner/work/libdatadog/libdatadog/Cargo.lock:62:1
   │
62 │ rand 0.8.5 registry+https://github.com/rust-lang/crates.io-index
   │ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ unsound advisory detected
   │
   ├ ID: RUSTSEC-2026-0097
   ├ Advisory: https://rustsec.org/advisories/RUSTSEC-2026-0097
   ├ It has been reported (by @lopopolo) that the `rand` library is [unsound](https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#soundness-of-code--of-a-library) (i.e. that safe code using the public API can cause Undefined Behaviour) when all the following conditions are met:
     
     - The `log` and `thread_rng` features are enabled
     - A [custom logger](https://docs.rs/log/latest/log/#implementing-a-logger) is defined
     - The custom logger accesses `rand::rng()` (previously `rand::thread_rng()`) and calls any `TryRng` (previously `RngCore`) methods on `ThreadRng`
     - The `ThreadRng` (attempts to) reseed while called from the custom logger (this happens every 64 kB of generated data)
     - Trace-level logging is enabled or warn-level logging is enabled and the random source (the `getrandom` crate) is unable to provide a new seed
     
     `TryRng` (previously `RngCore`) methods for `ThreadRng` use `unsafe` code to cast `*mut BlockRng<ReseedingCore>` to `&mut BlockRng<ReseedingCore>`. When all the above conditions are met this results in an aliased mutable reference, violating the Stacked Borrows rules. Miri is able to detect this violation in sample code. Since construction of [aliased mutable references is Undefined Behaviour](https://doc.rust-lang.org/stable/nomicon/references.html), the behaviour of optimized builds is hard to predict.
   ├ Announcement: https://github.com/rust-random/rand/pull/1763
   ├ Solution: Upgrade to >=0.10.1 OR <0.10.0, >=0.9.3 OR <0.9.0, >=0.8.6 (try `cargo update -p rand`)
   ├ rand v0.8.5
     └── libdd-library-config v2.0.0

advisories FAILED, bans ok, sources ok

Updated: 2026-07-03 00:59:10 UTC | Commit: d3773a9 | dependency-check job results

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2887d0b569

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-29T15:48:20Z

+        let payload = unsafe { std::slice::from_raw_parts(payload_ptr, payload_size as usize) };
+        let context = ProcessContext::decode(payload)?;


Synchronize before decoding the publisher's payload

When read() runs concurrently with another thread calling publish()/update(), the timestamp check does not keep the old payload allocation alive: after the first published_at load, ProcessContextHandle::update can assign self.payload = payload, dropping the Vec that payload_ptr still references, and this code then builds a slice and lets prost read from a dangling pointer. The final timestamp comparison happens only after the unsafe read, so it cannot prevent a crash/UB; the same-process reader needs to coordinate with the publisher or otherwise copy from memory that cannot be freed during decode.

Useful? React with 👍 / 👎.

yeah... This file seems to want to update some sort of seqlock, but there are several problems with it, one of these is this reclamation problem; a seqlock can protects only static region. Also a seqlock depends on a counter being odd to detect in-progress changes.

Likely the solution involves in-process readers using the write lock.

lso a seqlock depends on a counter being odd to detect in-progress changes.

I think the odd thing is because of multiple writers. If you have one writer only, and multiple reader, a monotonic value should be enough to detect torn reads (which I believe is the model for the process context).

one of these is this reclamation problem; a seqlock can protects only static region

I think I don't understand what you mean by that?

@yannham

I think the odd thing is because of multiple writers.

It's not, it's to detect that a modification is progress, so the read may be give torn/inconsistent data. Assume there is no reordering:

update data

++seq

and the reader does

read seq

read data

re-read seq to confirm it's the same

then the reader may read partially updated data and see the same seq.

But anyway, I was wrong about this, because the writer uses 0 to signal the update in progress.

one of these is this reclamation problem; a seqlock can protects only static region

What I mean is that the reader may fetch a pointer to the payload, and by the time it reads it, it's been freed already. This is a problem if the memory region has been unmapped (segfault).

So I think the wrong publishing protocol was chosen for this. What the seqlock is protecting is a single pointer, so you could just update the pointer atomically with release-acquire (more or less, it also has a size, but that's not an obstacle because amd64/arm64 support 16-byte atomics, or the size could be moved to the beginning of the pointed to region). This doesn't eliminate the reclamation problem -- you'd still need something else to prevent in progress readers from segfaulting, but it does show that the seqlock buys nothing here.

Ah, I see, thanks for the clarification for the odd counter part. I didn't remember that correctly.

I guess in the process context case the information is just split across two different fields.

What I mean is that the reader may fetch a pointer to the payload, and by the time it reads it, it's been freed already. This is a problem if the memory region has been unmapped (segfault).

@ivoanjo I don't remember all the details from the spec, but I think the model here is to map a page at init time and then never unmap it (beyond in-place update)? Of course you can't enforce that from the reader side, you have to rely on the writer to be compliant.

But it seems that process_vm_readv() returns an error, and doesn't cause SEGFAULT, for invalid/unmapped addresses (not clear from the doc, but so says my clanker). So I won't be worried about unmapping the context page.

However, if the payload is being modified, I think you're right. I don't see how you can avoid:

reader see a consistent context and read it locally

writer starts and completes an update by re-allocating the payload, causing the previous one to be dropped if heap-allocated, or overwritten if stored in the mapping.

reader starts to dereference the payload ptr

This case might be even worse because you won't get EINVAL, since the memory will most likely still be mapped (even in the heap), just potential gibberish 🤔

I guess one fix would be to restrict the reader to always use the same mapping for the payload, and include in the bracket as part of the "writing". The pointer would effectively be mostly useless, and the payload would be a static part (beside the length) of the whole context.

@yannham

I think the model here is to map a page at init time and then never unmap it (beyond in-place update)?

This is the case for the header. The problem is the payload.

But it seems that process_vm_readv() returns an error, and doesn't cause SEGFAULT, for invalid/unmapped addresses

Yes, out-of-process reads don't have this problem. A more difficult question, with process_vm_readv doing plain memcpy page-by-page and having the payload and header in different pages is whether they participate correctly in the protocol, which something that needs to be seen architecture-by-architecture, but probably it's not a big obstacle.

However, if the payload is being modified, I think you're right. I don't see how you can avoid:

The way this is usually avoided for in-process readers is to use specific reclamation strategies like RCU, hazard pointers, refcounting, etc... or use read-write locks.

I think that reading gibberish, even in-process, is not that bad, because it can be detected by the seqlock protocol (assuming you follow the pointer and read all your data before re-checking the sequence number). It's just the if the memory has been unmapped that it becomes a problem.

We've already discussed in the channel, but looping back on this -- my suggestion is to either have an out-of-band mechanism for in-process reads synchronization OR use a process_vm_readv or equivalent reader.

chatgpt-codex-connector · 2026-06-29T15:48:20Z

+        name.starts_with("/memfd:OTEL_CTX")
+            || name.starts_with("[anon_shmem:OTEL_CTX]")
+            || name.starts_with("[anon:OTEL_CTX]")


Match only the exact OTEL_CTX mapping name

Because these checks use starts_with, a process that also has a mapping such as /memfd:OTEL_CTX_BACKUP or [anon:OTEL_CTX_old] can be selected before the real /memfd:OTEL_CTX entry, since /proc/self/maps is ordered by address rather than by mapping name. In that case read() will return an invalid-signature error (or read the wrong context) even though a valid OTEL context mapping exists later in the maps file; the name match should be exact, with only the kernel's separate (deleted) token handled as metadata.

Useful? React with 👍 / 👎.

Why would there be mappings with names like "/memfd:OTEL_CTX_BACKUP" or "[anon:OTEL_CTX_old]"? Yes, sure, it's technically true but... what are we supposed to do about this?

Wasn't there a suggestion on adding several of these mappings for PHP to account for it processing several service/env? Ultimately we didn't do it, but I think that's indeed the idea of the spec, that you could have several.

This implements literally what the spec says: https://github.com/open-telemetry/opentelemetry-specification/blob/14137d0022f98bb59392f82255a70f3d9c2a7d4c/oteps/profiles/4719-process-ctx.md?plain=1#L164

Locate mapping: Parse /proc/<pid>/maps and search for entries with name starting with [anon_shmem:OTEL_CTX], [anon:OTEL_CTX] or /memfd:OTEL_CTX

though I think it should say [anon_shmem:OTEL_CTX and [anon:OTEL_CTX. Or maybe it just wanted to allow the (deleted) afterwards

Yes, it's for the (deleted) stuff. It's not for multiple mappings to find.

What @bwoebi said -- the process context is expected to be a singleton for a process, and anything that's not a singleton for a process should not go in there.

E.g. if a process is 1:1 to a service, then service name goes in there (for instance, this is the plan for dd-trace-rb); if a process is somehow multi-tenant and supports multiple services (e.g. dd-trace-php), then it shouldn't go in there.

It's not supported by the spec publishing several OTEL_CTX memory locations -- readers are only looking for one so they'll either error or ignore the others.

datadog-prod-us1-3 · 2026-06-29T16:00:20Z

Tests

✨ Fix all issues with BitsAI

⚠️ Warnings

🚦 6 Pipeline jobs failed

Lint | clippy #windows-latest msrv

Lint | clippy #windows-latest nightly

Lint | clippy #windows-latest stable

View all 6 failed jobs.

ℹ️ Info

No other issues found (see more)

🧪 All tests passed
❄️ No new flaky tests detected

🎯 Code Coverage (details)
• Patch Coverage: 80.50%
• Overall Coverage: 74.42% (-0.03%)

Useful? React with 👍 / 👎

_{This comment will be updated automatically if new data arrives.

🔗 Commit SHA: 99e3aaa | Docs | Datadog PR Page | Give us feedback!}

dd-octo-sts · 2026-06-29T16:27:57Z

Artifact Size Benchmark Report

aarch64-alpine-linux-musl

Artifact	Baseline	Commit	Change
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.so	7.88 MB	7.88 MB	0% (0 B) 👌
/aarch64-alpine-linux-musl/lib/libdatadog_profiling.a	85.88 MB	85.87 MB	--.01% (-13.34 KB) 💪

aarch64-unknown-linux-gnu

Artifact	Baseline	Commit	Change
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.so	10.61 MB	10.61 MB	--.01% (-1.21 KB) 💪
/aarch64-unknown-linux-gnu/lib/libdatadog_profiling.a	97.08 MB	97.06 MB	--.01% (-14.97 KB) 💪

libdatadog-x64-windows

Artifact	Baseline	Commit	Change
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.dll	25.45 MB	25.45 MB	0% (0 B) 👌
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.lib	88.04 KB	88.04 KB	0% (0 B) 👌
/libdatadog-x64-windows/debug/dynamic/datadog_profiling_ffi.pdb	184.55 MB	184.55 MB	+0% (+8.00 KB) 👌
/libdatadog-x64-windows/debug/static/datadog_profiling_ffi.lib	945.18 MB	945.18 MB	0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.dll	8.32 MB	8.32 MB	0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.lib	88.04 KB	88.04 KB	0% (0 B) 👌
/libdatadog-x64-windows/release/dynamic/datadog_profiling_ffi.pdb	24.61 MB	24.60 MB	--.03% (-8.00 KB) 💪
/libdatadog-x64-windows/release/static/datadog_profiling_ffi.lib	49.02 MB	49.02 MB	0% (0 B) 👌

libdatadog-x86-windows

Artifact	Baseline	Commit	Change
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.dll	22.05 MB	22.05 MB	0% (0 B) 👌
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.lib	89.42 KB	89.42 KB	0% (0 B) 👌
/libdatadog-x86-windows/debug/dynamic/datadog_profiling_ffi.pdb	188.58 MB	188.59 MB	+0% (+8.00 KB) 👌
/libdatadog-x86-windows/debug/static/datadog_profiling_ffi.lib	934.19 MB	934.19 MB	0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.dll	6.43 MB	6.43 MB	0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.lib	89.42 KB	89.42 KB	0% (0 B) 👌
/libdatadog-x86-windows/release/dynamic/datadog_profiling_ffi.pdb	26.42 MB	26.42 MB	0% (0 B) 👌
/libdatadog-x86-windows/release/static/datadog_profiling_ffi.lib	46.63 MB	46.63 MB	0% (0 B) 👌

x86_64-alpine-linux-musl

Artifact	Baseline	Commit	Change
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.a	76.56 MB	76.55 MB	--.01% (-9.21 KB) 💪
/x86_64-alpine-linux-musl/lib/libdatadog_profiling.so	8.77 MB	8.77 MB	0% (0 B) 👌

x86_64-unknown-linux-gnu

Artifact	Baseline	Commit	Change
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.a	92.07 MB	92.06 MB	--.01% (-9.57 KB) 💪
/x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so	10.69 MB	10.69 MB	-0% (-576 B) 👌

…:Result -> io::Result

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

morrisonlevi · 2026-06-30T22:45:39Z

                fence(Ordering::SeqCst);
-                AtomicU64::from_ptr(addr_of_mut!((*header).monotonic_published_at_ns))
+                (*header)
+                    .monotonic_published_at_ns
                    .store(published_at_ns, Ordering::Relaxed);


I think the fence is unnecessary if I change the store to use Ordering::SeqCst, yeah? In the previous state, it did a SeqCst fence with a Relaxed store, I didn't change it, I just changed to use the AtomicU64 natively instead of through a ptr.

I've read a bunch of material which helped me understand some other cases better but... not precisely this one. If anyone can educate me here, I'd appreciate understanding how they'd be different and which one is desired!

Yes, and seqcst is even too strong, it only needs release. If you're interested, you can read this paper on synchronization for the seqlock algorithm.

But there is a problem that makes reasoning purely about the C++ memory model here insufficient, and that is that we need to support out-of-process readers with process_vm_readv. This requires understanding both a) what happens architecturally on the writer and b) how process_vm_readv (which I think does just memcpy page by page) and access_remote_vm work exactly.

In any case, we only need to support amd64 and arm64, so just a Release store should be enough.

My suggestion is -- let's go with the strongest barrier we have. Why? Because the process context is not performance-sensitive, and the cost of getting this detail wrong is high.

So even though we need barriers here for correctness, I don't think we should be trying to think about them in terms of performance. If we had a lock that would use a much stronger atomic and nobody would be mad if we grab a log once an hour so that's the mental model to go for here ;)

-- my 0.02c :D

I think the fence is unnecessary if I change the store to use Ordering::SeqCst, yeah? In the previous state, it did a SeqCst fence with a Relaxed store, I didn't change it, I just changed to use the AtomicU64 natively instead of through a ptr.

Yes, I believe SeqCst store is more constraining than a SeqCst fence + a relaxed store. The difference is rather technical, but it's that in the latter case, the store isn't part of the total sequential consistent order. For all other orderings, a fence with ordering O followed (resp. preceded by) a relaxed store (resp. a relaxed load) is strictly equivalent to a store (resp. a load) with ordering O. The main usage for fences is that you can only use one for several atomic operations at once, or that you can use them conditionally. Here there's no strong motivation other than following closely the spec.

Yes, and seqcst is even too strong, it only needs release. If you're interested, you can read this paper on synchronization for the seqlock algorithm.

Yup. We used SeqCst mostly because:

the process context publication isn't performance-critical (happens once or not very often)

since the whole seqlock-like thing is fishy in Rust, we upgraded to the strongest ordering.

But there is a problem that makes reasoning purely about the C++ memory model here insufficient, and that is that we need to support out-of-process readers with process_vm_readv. This requires understanding both a) what happens architecturally on the writer and b) how process_vm_readv (which I think does just memcpy page by page) and access_remote_vm work exactly.

Yes, I personally have no idea what is the memory semantics of process_vm_readv, and in any case it can't be considered by the C++ memory model, which doesn't model OS-level operations like these. So we're a bit in best-effort places.

Though, process_vm_readv can't have stronger requirements than direct memory accesses, I believe. So if we're direct-memory-access-safe, I don't see how we wouldn't be syscall-read-memory-safe.

Yeah I think the way to reason about this is about constraining what the compiler does in such a way that it can't be anything else than what we want (even though the spec may not technically promise it...)

yannham · 2026-07-01T14:55:13Z

Out of curiosity, what's the motivation for implementing a reader on the libdatadog side? A seqlock-like reader is currently impossible to implement in Rust without UB, at least in theory (see RFC). People still do it, it seems to compile fine as of today, but it's still fragile looking ahead.

cataphract · 2026-07-01T16:18:50Z

@yannham

Out of curiosity, what's the motivation for implementing a reader on the libdatadog side? A seqlock-like reader is currently impossible to implement in Rust without UB, at least in theory (see RFC). People still do it, it seems to compile fine as of today, but it's still fragile looking ahead.

In this case, we could just make the pointer and the size atomic and access them with relaxed memory order. This would remove the technicality of the race being UB. But even without it, I don't think it's much of a problem. I'm pretty sure the exact same code would be generated.

yannham · 2026-07-01T16:24:29Z

My original question still stands (as it's not written in the PR description - to be clear I haven't look at the PR code yet in details): what is the motivation for having a reader in libdatadog?

cataphract · 2026-07-01T16:39:31Z

My original question still stands (as it's not written in the PR description - to be clear I haven't look at the PR code yet in details): what is the motivation for having a reader in libdatadog?

I could be implemented somewhere else. I guess the same reason the writer is implemented in libdatadog -- to do it uniformly across tracers. And with the writer already implemented in it, and the reader being sort of implemented already for the tests, everything stays more cohesive.

That said, I have no great objection to this being moved to dd-trace-php

bwoebi · 2026-07-01T23:01:02Z

More precisely: You need a reader in (PHP) profiling to support using the profiler with another otel compatible tracer (other than our tracer) running in the same process and being able to access that tracers context data.

libdatadog is the right place for this - it's not necessarily a pure PHP concern.

morrisonlevi · 2026-07-02T00:46:12Z

In practice we're allocating a whole OS page to put this header in there, and the header is like 32 bytes. That leaves roughly 4,064 bytes (because we're talking about Linux here, and nearly everything is running 4096 pages or more on Linux) to put a serialized payload into. Why use a separate allocation for the payload? Nobody should use more than ~4k bytes for this purpose, methinks? And this utilizes otherwise wasted space. If the spec is flexible at this stage, instead of storing a ptr you can store a length for the usable portion of the page, which can vary e.g. 16 KiB on macOS if we ever port there.

I think someone may have hinted at doing this in the discussion already, and the spec hints at an optimization, but does anyone (@ivoanjo?) have a problem mandating that? Because otherwise you can't really do a safe read, if the ptr has a separate lifetime. I suppose it could point to some other safe place like a static section of memory, but that doesn't really seem too useful compared to just using the memory page. Torn reads can be handled, but a separate lifetime complicates making that torn read safe for both threads and out-of-process readers.

yannham · 2026-07-02T09:42:29Z

@morrisonlevi Regarding the allocation living elsewhere and not in the header, it's a pure matter of doing the simplest thing for the first implementation. The plan has indeed been to put it in the header as a next step. No other technical reason than TODO. I don't know why this is not mandated in the spec, or why you'd want a separate allocation. I'll leave that part to Ivo 🙂

For the correctness part, we've casually discussed with @scottgerring and @nsavoire yesterday. My current understanding is that having a separate allocation is OK because:

if you try to read a page that has been unmapped before the call to process_vm_readv, you'll get EINVAL
if you try to read a page that is being unmapped in the middle of a process_vm_readv, presumably you'll either get EINVAL or the returned number of bytes read will be lower than what you expect
if you try to read an allocation that has been deallocated but the page is still mapped, you'll successfully read gibberish

But most importantly, in any of those cases, you don't get a crash. From there, y you'll also be able to know because you check the published_at after fully reading the payload, and if the writer started a modification/dropped the payload, you should see either 0 or a different value than the first one.

cataphract · 2026-07-02T09:58:49Z

@yannham so the plan is to use process_vm_readv even for in-process reads? I checked the linux source and actually this seems doable because the kernel special cases a process reading itself and doesn't require the ptrace capability. The only problems I could find were CONFIG_CROSS_MEMORY_ATTACH=n or something like seccomp blocking it.

yannham · 2026-07-02T11:52:20Z

@yannham so the plan is to use process_vm_readv even for in-process reads? I checked the linux source and actually this seems doable because the kernel special cases a process reading itself and doesn't require the ptrace capability. The only problems I could find were CONFIG_CROSS_MEMORY_ATTACH=n or something like seccomp blocking it.

I think the spec was mainly written with an out-of-process, privileged reader in mind, typically an eBPF profiler. If you really need to read from the same process, I would be inclined to go with process_vm_readv if possible indeed, or really any syscall or Unix API that let you read your own memory without crashes/interrupts. For example ChatGPT proposes to create a pipe, write to it ssize_t n = write(pipe_fd, addr, len); and then read back safely from the pipe. write will convert converts faults or unmapped memory to an error return value. It's a bit hacky, but might work as well?

cataphract · 2026-07-02T12:49:13Z

For example ChatGPT proposes to create a pipe, write to it ssize_t n = write(pipe_fd, addr, len); and then read back safely from the pipe. write will convert converts faults or unmapped memory to an error return value. It's a bit hacky, but might work as well?

I don't think that would help. Yes, you could tell whether the memory was mapped at the point of the write call, but between then and the time of your actual read it could have become unmapped

yannham · 2026-07-02T12:54:29Z

I don't think that would help. Yes, you could tell whether the memory was mapped at the point of the write call, but between then and the time of your actual read it could have become unmapped

I would expect that if you write to a file pipe, the memory is copied somewhere, or double mapped, or whatever. In any case, however the kernel handles this, the important point is that neither write nor read are APIs that will SIGSEGV because your source buffer is invalid or becomes invalid during the operation. It will always return an error instead. Which is the property we're looking for, I believe: either those error values or the bracket check of monotonic_published_at will let us discard gibberish if something like that did happen.

In the memfd path, the header becomes immediately discoverable. Therefore, we cannot do an non-atomic write to monotonic_published_at_ns with value 0, as this could race with a reader that did an acquire read on monotonic_published_at_ns. This write is also unnecessary because the mapping comes zero-initialized, and we'd be writing a zero (so it would also _likely_ be harmless anyway, even if violating the memory model). Note that there can be no race in signature/version. These are only read if the published_at is nonzero, and the release-acquire relationship guarantees that their final non-zero values are visible if published_at is nonzero.

cataphract · 2026-07-02T13:16:54Z

I've pushed a commit that uses process_vm_readv to read the payload, among other changes. Please take a look

yannham · 2026-07-02T14:49:23Z

 //! context](https://github.com/open-telemetry/opentelemetry-specification/pull/4719)
+//! specification.


Suggested change

//! context](https://github.com/open-telemetry/opentelemetry-specification/pull/4719)

//! specification.

//! context specification](https://github.com/open-telemetry/opentelemetry-specification/pull/4719).

yannham · 2026-07-02T15:02:47Z

+//!         seq1 = seq.load(m_o_relaxed);
+//!     } while (seq0 & 1 || seq0 != seq1);
+//!     ...
+//! }


I think this is a bit too low-level for a module-level documentation. I would remove the C-like code from here, at best give a short description, or just mention seqlock.

To me, the most useful thing about documentation like this is the intuition about why the algorithm is correct / what invariants it upholds. The implementation (even in pseudocode) should fall out of the invariants, not the other way around.

yannham · 2026-07-02T15:02:52Z

+//! instead of even numbers.
+//! We also forbid concurrent writers, and leave the reader retries to the
+//! discretion of the caller.
+//! We ignore the corner case where time returns 0.


I would keep the part about reading from syscall here, especially given the discussion on this PR around reading from the same process.

yannham · 2026-07-02T15:06:38Z

        /// Initial publication of the process context. Creates an appropriate memory mapping.
-        fn publish(payload: Vec<u8>) -> anyhow::Result<Self> {
+        fn publish(payload: Vec<u8>) -> io::Result<Self> {
+            let payload_size = payload


Nit: add a type annotation on payload_size to know when we read this line what conversion we're doing, and in which case it would fail.

yannham · 2026-07-02T15:09:45Z

-                fence(Ordering::SeqCst);
-                AtomicU64::from_ptr(addr_of_mut!((*header).monotonic_published_at_ns))
-                    .store(published_at_ns, Ordering::Relaxed);
+                ptr::addr_of_mut!((*header).signature).write(*SIGNATURE);


Should we start by writing 0 to monotonic_published_at? Usually the memory is allocated zeroed, but I'm not sure we can rely on that in every environment possible. I don't really remember all the ins and outs of the naming part but we already use MAPPING_NAME in MemMapping::new(), so I wonder if in some cases the context could already be discoverable at this point.

yes, on the mmap side, MAP_ANONYMOUS has this guarantee:

MAP_ANONYMOUS The mapping is not backed by any file; its contents are initialized to zero.

And on memfd_create + ftruncate, the ftruncate linux man page says:

If the file previously was larger than this size, the extra data is lost. If the file previously was shorter, it is extended, and the extended part reads as null bytes ('\0').

Fair enough 👍 maybe adding a small comment saying that would be a nice plus for future code readers

yannham · 2026-07-02T15:26:20Z


    // The returned size is guaranteed to be larger or equal to the size of `MappingHeader`.
-    fn mapping_size() -> usize {
+    const fn mapping_size() -> usize {


A previous version was using a non-const implementation (using libc to get the page size typically). If not especially needed I would avoid making the const guarantee. I trust rustc to be smart enough to inline and constant-propagate the same way anyway, if it's for performance reason.

yannham · 2026-07-02T15:27:43Z

+    /// Locates the OTEL_CTX mapping at construction. Call [`read`](Self::read) repeatedly to fetch
+    /// updated context data without re-parsing `/proc/self/maps`, as long as the process has not
+    /// forked. After a `fork()`, reads fail and a new reader must be constructed.
+    pub struct ProcessContextSelfReader {


Should the reader side go in a submodule? And even maybe the writer side as well? I feel like the use-case is different from the publisher, which is what SDK/tracers will ever need, and this source file starts to grow quite a bit.

yannham · 2026-07-02T15:31:00Z

+
+            let published_at = header.monotonic_published_at_ns.load(Ordering::Acquire);
+            if published_at == 0 {
+                return Err(io::Error::new(


You are going to be the consumer of this API, but don't you want to be able to detect this error? This sounds like a non fatal one, as opposed to the fork issue, or other IO errors. One possibility would be to have a retry-loop version, or just return more structured errors.

yannham · 2026-07-02T15:40:36Z

+    /// # Safety
+    ///
+    /// This function may only be called if it can be guaranteed that there are no in-process
+    /// readers, or at least that they will not be used after the call.


This should probably be mirrored in the FFI version's documentation.

yannham · 2026-07-02T15:45:13Z

+                    .store(0, Ordering::Relaxed);
+            }
+            fence(Ordering::Release);


Sounds equivalent and a bit simpler? Note that unfortunately I don't think this guarantees anything like the payload will be dropped after the store. Stuff can move up before a release operation (fence/store). The latter just guarantees that whatever happens-before the release operation, will happens-before a matching acquire store that will load 0. It makes no guarantee about what happens after the release operation. You'd need acquire, but that doesn't exist for stores. Maybe an acquire-swap would do.

Suggested change

.store(0, Ordering::Relaxed);

}

fence(Ordering::Release);

.store(0, Ordering::Release);

}

The change here is a bit of a corner case, but what we want here is that these two stores are not reordered, that we store 0 before doing the drop. The atomic release prevents reordering with memory operations that are sequenced before it, not after. So for an atomic release to work, it would have to be on the drop, which is not possible (the whole thing is also of undefined anyway, because the drop is not doing any sort of atomic operation)

The atomic release prevents reordering with memory operations that are sequenced before it, not after.

Yes, and that is my point. Here the drop happens after the release store. So you don't get any guarantee regarding the respective order of the drop and the release store from a reader that acquire-load the value 0. As far as the C++20 memory model is concerned, the drop could look like it happens before the store from the reader POV. Release doesn't prevent that re-ordering.

If the reader loads the value 0 it will not even attempt to load payload. The risk is that it doesn't read the value 0 and then attempt to read a dropped value. Note that this can still happen (it's a variation on the torn read case), it's just less likely with this change.

Or are you saying that it's a "best-effort-although-technically-incorrect" approach, where you reason as if the drop was performing atomic operations somehow?

So, yes, you're generally right that this doesn't provide any formal guarantee of correctness. For the synchronization to be complete, you'd need to read-acquire on the dropped value. This is more an attempt of having of the compiler generate a StoreStore barrier on the writer side so that it's less likely that another cpu sees value nonzero and then a dropped payload

I see. If it's a safety net, what do you think about using a stronger SeqCst barrier? On ARM for example I think it's always compiled to a DMB ISH, while release fences might be compiled to something slightly weaker, such as DMB ISHLD; DMB ISHST. If it's a best-effort/out of the memory model thing and not performance sensitive... let's be as strong as we can ? 🤷

Was going to say the same

You'd need acquire, but that doesn't exist for stores

But SeqCst does

danielsn

As a broader comment, this whole file seems large. I wonder if it would benefit from being split into modules

danielsn · 2026-07-02T19:40:13Z

+//!         seq1 = seq.load(m_o_relaxed);
+//!     } while (seq0 & 1 || seq0 != seq1);
+//!     ...
+//! }


To me, the most useful thing about documentation like this is the intuition about why the algorithm is correct / what invariants it upholds. The implementation (even in pseudocode) should fall out of the invariants, not the other way around.

danielsn · 2026-07-02T19:40:21Z

+//! instead of even numbers.
+//! We also forbid concurrent writers, and leave the reader retries to the
+//! discretion of the caller.
+//! We ignore the corner case where time returns 0.


danielsn · 2026-07-02T19:40:30Z

+//!
+//! Although we instead use 0 to signal the writer is progress and a timestamp
+//! instead of even numbers.
+//! We also forbid concurrent writers, and leave the reader retries to the


danielsn · 2026-07-02T19:40:45Z

+//!     seq.store(seq0 + 2, m_o_release);
+//! }
+//!
+//! Although we instead use 0 to signal the writer is progress and a timestamp


Instead of magic constants, can we use defined enums?

danielsn · 2026-07-02T19:41:11Z

+//! }
+//!
+//! Although we instead use 0 to signal the writer is progress and a timestamp
+//! instead of even numbers.


Why did the original algorithm constrain to even numbers? Do timestamps uphold the same invariant?

danielsn · 2026-07-02T19:50:59Z

+                    .store(0, Ordering::Relaxed);
+            }
+            fence(Ordering::Release);


Was going to say the same

You'd need acquire, but that doesn't exist for stores

But SeqCst does

danielsn · 2026-07-02T19:52:39Z

+    ///
+    /// # Safety
+    ///
+    /// This function may only be called if it can be guaranteed that there are no in-process


Is there any way to check this, even if just on debug builds?

danielsn · 2026-07-02T19:53:25Z

+            .collect()
+    }
+
+    fn find_attr<'a>(attrs: &'a [KeyValue], key: &str) -> Option<&'a AnyValue> {


Are there few enough attrs that a linear scan is fine?

danielsn · 2026-07-02T19:55:43Z

+        fn is_named_otel_mapping(line: &str) -> bool {
+            let trimmed = line.trim_end();
+
+            // The name of the mapping is the 6th column. The separator changes (both ' ' and '\t')


Is this format documented anywhere, or is this reverse engineering?

danielsn · 2026-07-02T19:57:20Z

+
+            // pairs with the first release fence on update() to ensure that, if we read data
+            // updated after the initial published time, we at least see the published
+            // time being set to 0 in the next load of the published time (or we could


Is it allowed to go back from a published time to 0?

monotonic_published_at_ns being zero is reserved to indicate the context is being mutated and is not yet ready for reading.

yannham

I wouldn't feel comfortable reviewing a PR that touches to many moving parts of the process context. We somehow arrived to an agreement on the initial version of the reader and minor publish changes. Would that be possible to revert to that state, address the last review comment one way or another, and keep other changes as separate PRs with proper motivation?

ivoanjo

I'll +1 on splitting this PR. Also, if we're making the reader directly read from the output of the writer, we can move it to a separate file as at that point they're very independent.

cataphract requested a review from a team as a code owner June 29, 2026 15:41

cataphract requested review from mabdinur and removed request for a team June 29, 2026 15:41

feat(library-config): otel process context reader

cfe279d

Add `read()`, `threadlocal_attribute_key_map()`, and `read_threadlocal_attribute_key_map()` to `otel_process_ctx::linux`, along with the `find_otel_mapping()` / `is_named_otel_mapping()` / `parse_mapping_start()`.

cataphract force-pushed the levi/otel-thread-context branch from 2887d0b to cfe279d Compare June 29, 2026 15:42

chatgpt-codex-connector Bot reviewed Jun 29, 2026

View reviewed changes

cataphract mentioned this pull request Jun 29, 2026

feat: OTel context sharing DataDog/dd-trace-php#3970

Open

2 tasks

morrisonlevi and others added 3 commits June 30, 2026 13:19

Merge branch 'main' into levi/otel-thread-context

550cb9c

refactor(library-config)!: removed packed from MappingHeader, anyhow:…

adc1fa2

…:Result -> io::Result

test(library-config): compile-time layout assertions for MappingHeader

67cc62f

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

morrisonlevi reviewed Jun 30, 2026

View reviewed changes

fix(library-config): import std::io in tests

d363e03

morrisonlevi force-pushed the levi/otel-thread-context branch from 19c2470 to d363e03 Compare June 30, 2026 22:49

Merge branch 'main' into levi/otel-thread-context

89c0c57

Merge branch 'main' into levi/otel-thread-context

9589dd3

cataphract added 5 commits July 2, 2026 14:14

fix(library-config): make OTel process context reads race-safe

31b32fe

ProcessContextSelfReader to avoid rediscovering the mapping every read

f9f9cb3

make unpublish() unsafe

7b89ee6

reorder drop of payload and unpublishing of header

a432165

cargo +nightly fmt

052fb84

yannham reviewed Jul 2, 2026

View reviewed changes

danielsn reviewed Jul 2, 2026

View reviewed changes

morrisonlevi requested a review from a team as a code owner July 3, 2026 00:54

refactor: realign after meeting with Bob and Ivo

99e3aaa

morrisonlevi force-pushed the levi/otel-thread-context branch from 3379921 to 99e3aaa Compare July 3, 2026 00:55

yannham requested changes Jul 3, 2026

View reviewed changes

ivoanjo reviewed Jul 3, 2026

View reviewed changes

		let payload = unsafe { std::slice::from_raw_parts(payload_ptr, payload_size as usize) };
		let context = ProcessContext::decode(payload)?;

		//! context](https://github.com/open-telemetry/opentelemetry-specification/pull/4719)
		//! specification.

	//! context](https://github.com/open-telemetry/opentelemetry-specification/pull/4719)
	//! specification.
	//! context specification](https://github.com/open-telemetry/opentelemetry-specification/pull/4719).

Uh oh!

Conversation

cataphract commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Motivation

Additional Notes

How to test the change?

Uh oh!

github-actions Bot commented Jun 29, 2026

Clippy Allow Annotation Report

Summary by Rule

Annotation Counts by File

Annotation Stats by Crate

About This Report

Uh oh!

github-actions Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📚 Documentation Check Results

📦 libdd-library-config - 267 warning(s)

Uh oh!

github-actions Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔒 Cargo Deny Results

📦 libdd-library-config - 1 error(s)

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

yannham Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cataphract Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cataphract Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

datadog-prod-us1-3 Bot commented Jun 29, 2026 • edited by datadog-official Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ Warnings

ℹ️ Info

Uh oh!

dd-octo-sts Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Artifact Size Benchmark Report

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

cataphract commented Jun 29, 2026 •

edited

Loading

github-actions Bot commented Jun 29, 2026 •

edited

Loading

📦 `libdd-library-config` - 267 warning(s)

github-actions Bot commented Jun 29, 2026 •

edited

Loading

📦 `libdd-library-config` - 1 error(s)

yannham Jul 1, 2026 •

edited

Loading

cataphract Jul 1, 2026 •

edited

Loading

cataphract Jul 1, 2026 •

edited

Loading

datadog-prod-us1-3 Bot commented Jun 29, 2026 •

edited by datadog-official Bot

Loading

dd-octo-sts Bot commented Jun 29, 2026 •

edited

Loading

yannham Jul 1, 2026 •

edited

Loading

yannham Jul 1, 2026 •

edited

Loading

yannham commented Jul 1, 2026 •

edited

Loading

cataphract commented Jul 1, 2026 •

edited

Loading

bwoebi commented Jul 1, 2026 •

edited

Loading

morrisonlevi commented Jul 2, 2026 •

edited

Loading

cataphract commented Jul 2, 2026 •

edited

Loading