Add databricks-serverless-storage-check skill #82
Open
GabbysCode wants to merge 4 commits into
Collaborator
We will reach out directly about possible consolidation of this PR, and one of our Field Eng maintainers will add a review once we feel it's ready.
added 3 commits
June 15, 2026 14:38
…disk handoffs
Adds a new skill `databricks-serverless-storage-check` that ships an executable preflight scanner for the antipattern where parent/child tasks share state through /local_disk0, /tmp, or trustedTemp paths -- the failure seen in serverless jobs that fail with `INTERNAL_ERROR: [Errno 13] Permission denied` on local-disk paths.
The scanner (scripts/preflight.py, stdlib-only, AST + regex) supports five input modes (--notebook, --dir, --job-yaml, --job-id, --run-id) and 7 detection rules (FANOUT001-006 plus ENV001, which routes env-sync errors to support escalation). All 7 self-tests pass.
Complementary to databricks-serverless-migration (single-notebook migration). Added a one-line cross-reference from that skill's data-access table pointing here for multi-task fan-out concerns.
Includes the required agents/openai.yaml (hand-authored) and SKILL_METADATA entry in scripts/skills.py; manifest regenerated and `python3 scripts/skills.py validate` passes.
Signed-off-by: GABRIELLE DOMPREH <Gabby.dompreh@databricks.com>
…ruth.yaml
Adds generation_session_id and sources to all four test cases so stf lint passes cleanly (0 errors, 0 warnings).
Co-authored-by: Isaac
…s-serverless-migration
Moves skills/databricks-serverless-storage-check/ →
skills/databricks-serverless-migration/databricks-serverless-storage-check/
to physically reflect the parent/sub-skill hierarchy.
- Update scripts/skills.py iter_skill_dirs() to yield one level of nested
skill directories (sub-skills alongside top-level skills)
- Fix relative SKILL.md links: ../databricks-serverless-migration/ → ../
and ../databricks-{dabs,jobs,core}/ → ../../databricks-{dabs,jobs,core}/
- Fix databricks-serverless-migration/SKILL.md links:
../databricks-serverless-storage-check/ → databricks-serverless-storage-check/
- Regenerate manifest.json (skill count unchanged: 9)
- Confirmed: python3 scripts/skills.py validate → Everything is up to date
- Confirmed: stf lint → 0 errors, 0 warnings
Co-authored-by: Isaac
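For readers following the `iter_skill_dirs()` change described in the commit above, here is a minimal sketch of what "yield one level of nested skill directories" could look like. It is not the code in `scripts/skills.py`: the `skills_root` parameter and the rule that a directory counts as a skill only when it contains a `SKILL.md` file are assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterator


def iter_skill_dirs(skills_root: Path) -> Iterator[Path]:
    """Yield each top-level skill directory, followed by its direct sub-skills.

    Assumption (hypothetical): a directory is a skill if it contains SKILL.md.
    """
    for top in sorted(p for p in skills_root.iterdir() if p.is_dir()):
        if not (top / "SKILL.md").is_file():
            continue  # not a skill, so its children are not inspected either
        yield top
        # Descend exactly one level; deeper nesting is deliberately ignored.
        for child in sorted(p for p in top.iterdir() if p.is_dir()):
            if (child / "SKILL.md").is_file():
                yield child
```

Under these assumptions, sibling directories such as `references/` or `scripts/` are skipped because they carry no `SKILL.md`, which is one way the manifest count could stay stable while sub-skills become discoverable.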
9fcbae3 to
a44f32f
Compare
Hi @GabbysCode - can you restructure to match the other skills? For example, your markdown files should be references for
jacksandom
reviewed
Jun 17, 2026
Should be able to remove this and assets as it is captured in the main skill.
jacksandom
suggested changes
Jun 17, 2026
…t skill structure
- Merge pattern-catalog.md + remediation-guide.md into a single skills/databricks-serverless-migration/references/serverless-storage-check.md, matching how sibling reference docs are structured in the parent skill
- Remove databricks-serverless-storage-check/references/ (now consolidated above)
- Remove databricks-serverless-storage-check/agents/ and assets/ (captured at the parent skill level)
- Update SKILL.md links from local references/ to ../references/serverless-storage-check.md
- Regenerate manifest.json
Co-authored-by: Isaac
Summary
Adds `databricks-serverless-storage-check` as a nested sub-skill under `skills/databricks-serverless-migration/`, physically reflecting the parent/sub-skill hierarchy. The skill ships an executable preflight scanner that detects the antipattern where serverless tasks share state through `/local_disk0`, `/tmp`, or `trustedTemp` paths. This is the failure mode behind `INTERNAL_ERROR: [Errno 13] Permission denied: '/local_disk0/.../trustedTemp.../...'`, where a parent task writes to local disk and a child task on a different node cannot read it.

Complementary to `databricks-serverless-migration` (which covers single-notebook migration and correctly recommends `/local_disk0/tmp` for intra-task scratch). This sub-skill covers the cross-task case.

New location
Contents
- `skills/databricks-serverless-migration/databricks-serverless-storage-check/SKILL.md`
- `.../agents/openai.yaml`
- `.../scripts/preflight.py`: `--json` flag, exit codes 0/1/2
- `.../scripts/test_preflight.py`
- `.../references/pattern-catalog.md`
- `.../references/remediation-guide.md`: `taskValues` / pipeline-downstream handoffs
- `.../eval/ground_truth.yaml`: `stf lint` clean (0 errors, 0 warnings)
- `.../eval/thinking_instructions.md`
- `.../eval/output_instructions.md`
- `.../eval/manifest.yaml`

Supporting changes
- `scripts/skills.py`: `iter_skill_dirs()` extended to yield one level of nested skill directories, so the manifest generator and validator discover sub-skills alongside top-level skills
- `skills/databricks-serverless-migration/SKILL.md`: Sub-skills table added; relative links updated to `databricks-serverless-storage-check/SKILL.md`
- `manifest.json`: Regenerated (skill count unchanged: 9)

Detection rules
- `FANOUT001`: `dbutils.notebook.run`, `taskValues.set`, or job-task parameter
- `FANOUT002`: `/local_disk0` or `/tmp` via widget, parameter, or `taskValues.get`
- `FANOUT003`
- `FANOUT004`: `pipeline_task` immediately downstream of a `notebook_task` that wrote to local temp
- `FANOUT005`: `dbutils.fs.cp` local-to-local inside a notebook invoked by a multi-task job (heuristic)
- `FANOUT006`: `/local_disk0/spark-*/trustedTemp/...` anywhere in source
- `ENV001`: `--run-id` mode only: routes `ENVIRONMENT_SETUP_ERROR.PYTHON_NOTEBOOK_ENVIRONMENT` to support escalation

Validation
- `python3 scripts/skills.py validate`: `Everything is up to date.`
- `python3 .../scripts/test_preflight.py`: 7/7 passing
- `stf lint .../databricks-serverless-storage-check`: 0 errors, 0 warnings

Checklist
- `python3 scripts/skills.py validate` passes
- `scripts/skills.py iter_skill_dirs()` updated for nested skill discovery
- `SKILL_METADATA` entry present in `scripts/skills.py`
- `agents/openai.yaml` hand-authored
- `SKILL.md` body under 250 lines (149 lines)
- `trustedTemp`, `local_disk0`, `permission denied`, `fan-out`, `cross-task`
- `stf lint` passes (0 errors, 0 warnings)
- `ground_truth.yaml` (4 cases), `thinking_instructions.md`, `output_instructions.md`, `manifest.yaml`
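To make the "AST + regex" approach mentioned for the scanner more concrete, the sketch below flags string literals that point at node-local storage, roughly in the spirit of the `FANOUT006` rule. It is an illustration only, not the contents of `scripts/preflight.py`: the function name `find_local_disk_literals`, the `LOCAL_PATH_RE` pattern, and the sample notebook source are all hypothetical.

```python
import ast
import re

# Assumed pattern (hypothetical): literals rooted at /local_disk0 or /tmp,
# or containing a trustedTemp component anywhere in the path.
LOCAL_PATH_RE = re.compile(r"^(?:/local_disk0|/tmp)(?:/|$)|trustedTemp")


def find_local_disk_literals(source: str) -> list[tuple[int, str]]:
    """Return (line number, literal) for every string constant matching the pattern."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Constant) and isinstance(node.value, str):
            if LOCAL_PATH_RE.search(node.value):
                hits.append((node.lineno, node.value))
    return sorted(hits)  # ast.walk is breadth-first, so sort by line number


# Hypothetical notebook source used only to exercise the function.
sample = '''
out = "/local_disk0/handoff/result.parquet"
remote = "s3://bucket/handoff/result.parquet"
scratch = "/tmp/stage"
'''

for lineno, literal in find_local_disk_literals(sample):
    print(f"line {lineno}: {literal}")
```

Parsing with `ast` rather than scanning raw text means commented-out paths are ignored, while the regex keeps the path rule in one place. A literal match alone does not show that the path crosses a task boundary, so a real cross-task rule would need the job graph as well.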