Make provenance and evidence traceability first-class for proto (canonical protobuf definitions for cross-service contracts)

## Summary

Carry source, decision, and output provenance through the main workflow so downstream agents can audit and cite it.

This issue was generated from an org-wide EvalOps mining pass on 2026-05-10 07:57 UTC. It combines live GitHub repo signals with a per-repo arXiv search. Treat the research links as grounding for a concrete implementation, not as a request for a literature review.

## Repo Evidence

- Repository description: Canonical protobuf definitions for EvalOps cross-service contracts — 14 packages covering identity, metering, governance, approvals, entities, memory, prompts, skills, events, and more
- Tree signals: 0 docs files, 6 workflows, 31 proto files, 37 test-like files.
- `README.md:15` includes latent-spec language: - Generate Go, TypeScript, and Python packages from one set of definitions. - Catch breaking contract changes at compile time, not in production. - Keep wire encoding efficient with protobuf binary format.
- `README.md:16` includes latent-spec language: - Catch breaking contract changes at compile time, not in production. - Keep wire encoding efficient with protobuf binary format. - Support Connect-RPC service definitions for typed cross-service calls.
- `README.md:41` includes latent-spec language: | `entities/v1` | Canonical entity resolution, search, links, correlation graph | entities, pipeline, parker, connectors, ensemble | | `governance/v1` | Safety evaluation, PII detection, retention, legal holds | governance, approvals, objectives, ensemble, chat | | `notifications/v1` | Notification delivery, preference
- `README.md:48` includes latent-spec language: | `workflows/v1` | Multi-agent workflow orchestration, DAG execution, compensation | objectives, ensemble, maestro, approvals, governance | | `prompts/v1` | Prompt versioning, deployment tracking, eval linkage, resolution | prompts, llm-gateway, fermata, maestro, ensemble | | `registry/v1` | Agent presence, capability 
- `README.md:53` includes latent-spec language: ### Contract Ownership
- `README.md:229` includes latent-spec language: `proto` also publishes the generated TypeScript contract surface from the repo root as `@evalops/proto`.

## Research Grounding

Repo axes: research, evaluation, tooling, security

Search keywords: proto, chat, ensemble, approvals, objectives, registry, governance, maestro, entities, gen, pipeline, prompts

- [arXiv:2506.19773v2](https://arxiv.org/abs/2506.19773v2) Automatic Prompt Optimization for Knowledge Graph Construction: Insights from an Empirical Study (Nandana Mihindukulasooriya, Niharika S. D'Souza, Faisal Chowdhury, Horst Samulowitz), 2025.
- [arXiv:2503.11118v1](https://arxiv.org/abs/2503.11118v1) UMB@PerAnsSumm 2025: Enhancing Perspective-Aware Summarization with Prompt Optimization and Supervised Fine-Tuning (Kristin Qi, Youxiang Zhu, Xiaohui Liang), 2025.
- [arXiv:2507.03620v1](https://arxiv.org/abs/2507.03620v1) Is It Time To Treat Prompts As Code? A Multi-Use Case Study For Prompt Optimization Using DSPy (Francisca Lemos, Victor Alves, Filipa Ferraz), 2025.
- [arXiv:2412.15298v1](https://arxiv.org/abs/2412.15298v1) A Comparative Study of DSPy Teleprompter Algorithms for Aligning Large Language Models Evaluation Metrics to Human Evaluation (Bhaskarjit Sarmah, Kriti Dutta, Anna Grigoryan, Sachin Tiwari, Stefano Pasquali, Dhagash Mehta), 2024.
- [arXiv:2604.04869v1](https://arxiv.org/abs/2604.04869v1) Optimizing LLM Prompt Engineering with DSPy Based Declarative Learning (Shiek Ruksana, Sailesh Kiran Kurra, Thipparthi Sanjay Baradwaj), 2026.
- [arXiv:2507.14241v3](https://arxiv.org/abs/2507.14241v3) Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models (Rithesh Murthy, Ming Zhu, Liangwei Yang, Jielin Qiu, Juntao Tan, Shelby Heinecke), 2025.
- [arXiv:2310.03714v1](https://arxiv.org/abs/2310.03714v1) DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines (Omar Khattab, Arnav Singhvi, Paridhi Maheshwari, Zhiyuan Zhang, Keshav Santhanam, Sri Vardhamanan), 2023.
- [arXiv:2508.04660v1](https://arxiv.org/abs/2508.04660v1) Multi-module GRPO: Composing Policy Gradients and Prompt Optimization for Language Model Programs (Noah Ziems, Dilara Soylu, Lakshya A Agrawal, Isaac Miller, Liheng Lai, Chen Qian), 2025.
- [arXiv:2602.00997v1](https://arxiv.org/abs/2602.00997v1) Error Taxonomy-Guided Prompt Optimization (Mayank Singh, Vikas Yadav, Eduardo Blanco), 2026.
- [arXiv:2602.13757v2](https://arxiv.org/abs/2602.13757v2) Assessing the Case for Africa-Centric AI Safety Evaluations (Gathoni Ireri, Cecil Abungu, Jean Cheptumo, Sienka Dounia, Mark Gitau, Stephanie Kasaon), 2026.

## What To Build

- Add stable identifiers for source records, derived decisions, and emitted outputs.
- Thread those identifiers through logs/events/API responses without leaking secrets.
- Provide a query or debug surface that reconstructs the chain for one completed workflow.
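The three items above can be sketched as a small in-memory model. This is a minimal illustration, not the repo's design: `stable_id`, `ProvenanceRecord`, and `ProvenanceLog` are hypothetical names, and the real contract would presumably be expressed as protobuf messages in one of the existing packages. Identifiers here are assumed to be content hashes, and only identifiers and kinds are stored, never payloads, so a log line built from a record carries no payload data. Whether a truncated hash is acceptable for sensitive, low-entropy payloads is a separate design decision.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass, field


def stable_id(kind: str, payload: dict) -> str:
    """Derive a deterministic identifier from the canonical JSON of a payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{kind}_{digest[:16]}"


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record: an identifier, its kind, and the ids it was derived from."""
    record_id: str
    kind: str  # assumed vocabulary: "source", "decision", or "output"
    parent_ids: tuple[str, ...] = ()


@dataclass
class ProvenanceLog:
    """Hypothetical store; keeps identifiers and links only, never payloads."""
    records: dict[str, ProvenanceRecord] = field(default_factory=dict)

    def add(self, kind: str, payload: dict, parents: list[str] | None = None) -> str:
        parent_ids = tuple(parents or ())
        # Hash the parents too, so the same payload under different inputs gets a new id.
        record_id = stable_id(kind, {"payload": payload, "parents": sorted(parent_ids)})
        self.records[record_id] = ProvenanceRecord(record_id, kind, parent_ids)
        return record_id

    def chain(self, record_id: str) -> list[str]:
        """Reconstruct the chain for one record, ancestors first."""
        ordered: list[str] = []
        seen: set[str] = set()

        def visit(rid: str) -> None:
            if rid in seen:
                return
            seen.add(rid)
            for parent in self.records[rid].parent_ids:
                visit(parent)
            ordered.append(rid)

        visit(record_id)
        return ordered


log = ProvenanceLog()
src = log.add("source", {"path": "README.md", "line": 15})
dec = log.add("decision", {"rule": "example-rule"}, parents=[src])
out = log.add("output", {"summary": "example output"}, parents=[dec])
print(log.chain(out))  # source id, then decision id, then output id
```

Hashing the parent ids together with the payload keeps identifiers stable across reruns with identical inputs while still distinguishing the same decision made from different sources.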

## Acceptance Criteria

- [ ] A short design note names the repo-specific workflow, threat or correctness model, and the research assumptions being adopted.
- [ ] A runnable check, fixture, or verifier exercises the new contract in CI or an equivalent local command documented in the repo.
- [ ] The implementation emits or stores enough evidence for a downstream agent/operator to cite inputs, decisions, and outputs.
- [ ] At least one negative/degraded-mode case is covered so failures are observable rather than silently accepted.
- [ ] Documentation links the new behavior to the relevant EvalOps platform primitive or explicitly records why this repo remains standalone.
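The runnable-check and negative-case criteria could be exercised by a verifier along these lines. It is a sketch under assumptions: `verify_chain` and the record shape (`kind` plus `parents`) are hypothetical, chosen only to show a degraded chain being reported rather than silently accepted.

```python
from __future__ import annotations


def verify_chain(records: dict[str, dict], output_id: str) -> list[str]:
    """Walk from an output back to its sources and return any problems found.

    `records` maps an id to {"kind": str, "parents": list[str]} (assumed shape).
    An empty result means every referenced record exists and every
    non-source record names at least one parent.
    """
    problems: list[str] = []
    seen: set[str] = set()
    stack = [output_id]
    while stack:
        rid = stack.pop()
        if rid in seen:
            continue  # also guards against cycles
        seen.add(rid)
        record = records.get(rid)
        if record is None:
            problems.append(f"missing record: {rid}")
            continue
        parents = record.get("parents", [])
        if record["kind"] != "source" and not parents:
            problems.append(f"{record['kind']} without parents: {rid}")
        stack.extend(parents)
    return problems


complete = {
    "src_1": {"kind": "source", "parents": []},
    "dec_1": {"kind": "decision", "parents": ["src_1"]},
    "out_1": {"kind": "output", "parents": ["dec_1"]},
}
# Degraded case: the source record was never stored.
dangling = {k: v for k, v in complete.items() if k != "src_1"}

print(verify_chain(complete, "out_1"))
print(verify_chain(dangling, "out_1"))
```

A CI job could fail whenever the returned list is non-empty, which makes the degraded case visible as a test failure instead of a partial chain.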

## Notes

- Generated issue 2/5 for `evalops/proto` by `evalops_org_miner.py`.
- Before implementation, confirm the sampled latent-spec snippets still match `main`; this issue intentionally cites exact file paths/lines where the mining pass saw them.


- Tracked as issue #96.
