[WIP] Fix minimum-permissions and custom-tag documentation gaps #37965

Draft
kevinzenghu wants to merge 1 commit into master from kevinzenghu/fix-permissions-and-tag-docs

@kevinzenghu

Summary

Sourced from mining ~5 months of Slack support/sales archives for recurring patterns (see linked analysis): "minimum permissions" confusion (17 threads) and "custom tags not propagating" (recurring since Nov 2025). Each fix is grounded in crawler/processor code, not just the reported symptom.

Fixes

  1. Glue IAM policy missing glue:GetTags — confirmed via libs/glue/client.go (crawler calls GetTags) and a dedicated crawler warning path for exactly this access-denied case.
  2. Databricks Jobs Monitoring Workspace Admin requirement — added a Prerequisites section; this was previously buried in an Advanced Configuration subsection and never contrasted with Quality Monitoring (which doesn't need it).
  3. dbt Cloud silent token-scope failures — added Troubleshooting, grounded in DbtCloudHealthWorker's existing authorization-error checks.
  4. Airflow DAG-tag auto-propagation — documented that this already works with zero config (confirmed via GetAirflowTags()), previously undocumented.
  5. Kubernetes missing -Ddd.tags — added, matching the pattern already documented for EMR/Dataproc.
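
For fix 1, a minimal sketch of the added permission as an IAM policy statement. Only the `glue:GetTags` action comes from this PR. The `Sid`, the wildcard `Resource`, and the empty placeholder for the existing policy are assumptions for illustration, and the actions the existing policy already grants are not reproduced here.

```python
import json

# Hypothetical statement: only the glue:GetTags action is confirmed by this PR.
# The Sid and the wildcard Resource are illustrative assumptions.
get_tags_statement = {
    "Sid": "AllowGlueGetTags",
    "Effect": "Allow",
    "Action": ["glue:GetTags"],
    "Resource": "*",
}


def add_statement(policy: dict, statement: dict) -> dict:
    """Return a copy of the policy with one more statement appended."""
    merged = dict(policy)
    merged["Statement"] = list(policy.get("Statement", [])) + [statement]
    return merged


# Placeholder for the existing policy document; its real statements are not shown.
existing_policy = {"Version": "2012-10-17", "Statement": []}
updated_policy = add_statement(existing_policy, get_tags_statement)
print(json.dumps(updated_policy, indent=2))
```

Scoping `Resource` to specific Glue ARNs instead of `*` may be preferable, depending on the account's conventions.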

Found but explicitly not fixed here

  • The Databricks custom-tag auto-capture docs were already fixed on 2026-06-01 (commit bcef0674fd), so the "no auto-propagation" answer reps are giving in Slack is now stale. This is a team-communication gap, not a doc gap. Flagging for whoever owns support enablement, not fixing here.
  • BigQuery Drive/Sheets external tables needing extra credentials — confirmed unhandled and undocumented, but I couldn't determine the exact required role/scope from code. Needs the BigQuery crawler owner's input before writing content.

Status

Work in progress — not ready for review yet. Draft, no reviewers requested.

AI assistance

Found and fixed by Claude Code via a targeted research pass grounded in dd-source code for each specific gap — flagging per the AI-assistance disclosure in CONTRIBUTING.md.

Sourced from mining ~5 months of Slack support/sales archives for
recurring patterns: "minimum permissions" confusion (17 distinct
threads) and "custom tags not propagating" (recurring since Nov 2025,
same unresolved answer every time). Each fix below is grounded in
crawler code, not just the reported symptom.

## Permissions
- glue.md: IAM policy was missing `glue:GetTags` — confirmed via
  libs/glue/client.go (the crawler calls GetTags) and the crawler's
  dedicated warning path (glue_crawler.go) for exactly this
  access-denied case. Jobs still sync without it, just untagged, which
  makes the gap easy to miss — added the policy action plus a note
  explaining the symptom.
- jobs_monitoring/databricks/_index.md: added a Prerequisites section
  stating Jobs Monitoring requires Workspace Admin for the recommended
  install path (unlike Quality Monitoring, which doesn't) — this was
  previously only mentioned deep in an "Advanced Configuration >
  Permissions" subsection, never up front or contrasted with QM.
- dbt.md: added a Troubleshooting section for silent webhook/connection
  failures caused by a token's permission scope being lowered after
  setup — grounded in DbtCloudHealthWorker's authorization-error paths
  and VerifyAccess's 403 check (shared/libs/dbtcloud/client.go), which
  already test for exactly this but weren't surfaced in docs.
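
A rough triage sketch for the dbt.md troubleshooting case, assuming the status code comes from any authenticated dbt Cloud API call made with the configured token. The function name and labels are hypothetical, and this is not the logic in DbtCloudHealthWorker or VerifyAccess; it only illustrates why a 403 points at a lowered token scope rather than a bad token.

```python
from http import HTTPStatus


def classify_token_response(status: int) -> str:
    """Map an HTTP status from a dbt Cloud API probe to a triage label.

    Hypothetical helper for illustration only.
    """
    if status == HTTPStatus.UNAUTHORIZED:  # 401: token missing, invalid, or revoked
        return "token rejected"
    if status == HTTPStatus.FORBIDDEN:  # 403: token valid but lacks permission
        return "insufficient scope"
    if 200 <= status < 300:
        return "ok"
    return "unexpected"


for code in (200, 401, 403, 500):
    print(code, classify_token_response(code))
```

A setup that worked at install time and later starts returning 403 is consistent with the token's permission scope having been lowered after setup.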

## Custom tags
- airflow.md: added a "Custom tags" section documenting that DAG-level
  `tags=[...]` auto-propagate to Jobs Monitoring with zero configuration.
  Confirmed via lineage-processor's GetAirflowTags() and the OpenLineage
  DAG facet parsing; the behavior was previously undocumented.
- kubernetes.md: added `-Ddd.tags` JVM-option documentation, matching
  the DD_TAGS pattern already documented for EMR and Dataproc but
  missing here.
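
A small sketch of assembling the `-Ddd.tags` value for the kubernetes.md change, assuming the comma-separated `key:value` format used by `DD_TAGS`. The helper name and the example tags are hypothetical, and how the option reaches the JVM depends on the deployment and is not shown.

```python
def dd_tags_option(tags: dict[str, str]) -> str:
    """Format tags as a -Ddd.tags JVM option (comma-separated key:value pairs).

    Hypothetical helper; the key:value,key:value format is assumed to match DD_TAGS.
    """
    for key, value in tags.items():
        # A comma inside a key or value would be read as a tag separator.
        if "," in key or "," in value:
            raise ValueError(f"tag {key!r}:{value!r} must not contain a comma")
    pairs = ",".join(f"{key}:{value}" for key, value in tags.items())
    return f"-Ddd.tags={pairs}"


# Example tags are placeholders.
print(dd_tags_option({"team": "data-eng", "cost_center": "1234"}))
```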

## Explicitly not changed (flagging instead of guessing)
- Databricks Quality/Jobs Monitoring custom-tag auto-capture: the docs
  were already fixed for this on 2026-06-01 (commit bcef067). Native
  cluster tags now auto-propagate, except Azure resource-group tags. It
  is the "no auto-propagation" answer reps keep giving in Slack that is
  now stale, not the docs. This is a team-communication gap, not
  something for this PR to fix. Worth telling the support/sales team
  the answer changed.
- BigQuery external tables backed by Google Sheets/Drive needing extra
  credentials/scope: confirmed via code that this isn't handled by
  existing error-handling paths (unlike the two other BigQuery
  external-table failure modes, which are) and isn't documented — but I
  couldn't confirm the exact required role/scope from code alone. Needs
  input from the BigQuery crawler owner before writing content; not
  guessing at Drive API scope names.

AI assistance: found and fixed by Claude Code via a targeted research
pass grounded in dd-source crawler/processor code for each specific
Slack-reported gap — flagging per the AI-assistance disclosure in
CONTRIBUTING.md.
@datadog-prod-us1-6

Pipelines

⚠️ Warnings

🚦 1 Pipeline job failed

DataDog/documentation | build_preview

Commit SHA: 2aead55
