ARPAHLS · rosspeili · Jun 10, 2026
diff --git a/.github/workflows/codeql.yml b/.github/workflows/codeql.yml
@@ -29,7 +29,7 @@ jobs:
 
     # Initializes the CodeQL tools for scanning.
     - name: Initialize CodeQL
-      uses: github/codeql-action/init@v2
+      uses: github/codeql-action/init@v3
       with:
         languages: ${{ matrix.language }}
         # If you wish to specify custom queries, you can do so here or in a config file.
@@ -41,7 +41,7 @@ jobs:
     # Autobuild attempts to build any compiled languages  (C/C++, C#, Go, or Java).
     # If this step fails, then you should remove it and run the build manually (see below)
     - name: Autobuild
-      uses: github/codeql-action/autobuild@v2
+      uses: github/codeql-action/autobuild@v3
 
     # ℹ️ Command-line programs to run using the OS shell.
     # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
@@ -54,6 +54,6 @@ jobs:
     #     ./location_of_script_within_repo/buildscript.sh
 
     - name: Perform CodeQL Analysis
-      uses: github/codeql-action/analyze@v2
+      uses: github/codeql-action/analyze@v3
       with:
         category: "/language:${{matrix.language}}"
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -8,6 +8,9 @@ Contributors add user-facing entries under `[Unreleased]` in the same PR. Mainta
 
 ## [Unreleased]
 
+### Changed
+- **Documentation**: [TESTING.md](docs/TESTING.md), [CONTRIBUTING.md](CONTRIBUTING.md), [ai_native_workflow.md](docs/contributing/ai_native_workflow.md), and README architecture tree document the bundle / framework / maintainer / example testing model. Pytest collects `tests/` and `skills/` only (`examples/` ignored).
+
 ## [0.3.5] - 2026-06-05
 
 ### Added

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -28,7 +28,7 @@ Pick the path that matches your issue. Only the **skill** row requires the full
 
 | Type | What you change | Typical issue label | Before coding | Verify locally |
 | :--- | :--- | :--- | :--- | :--- |
-| **New or updated skill** | `skills/<category>/<name>/`, `docs/skills/`, templates | Skill proposal, enhancement | Skill proposal or approved issue | `pytest` for skill + `tests/test_skill_issuer.py` |
+| **New or updated skill** | `skills/<category>/<name>/`, `docs/skills/`, templates | Skill proposal, enhancement | Skill proposal or approved issue | Bundle test + `pytest tests/test_skill_issuer.py` (see [TESTING.md](docs/TESTING.md)) |
 | **Documentation** | `docs/`, `README.md`, `CONTRIBUTING.md` | Documentation, good first issue | Doc issue or typo/fix issue | Links valid; tone consistent |
 | **Core framework** | `skillware/core/`, `tests/` | Framework feature | Framework feature issue | `pytest tests/`; update usage docs if API changes |
 | **Bug fix** | Paths named in issue | Bug report | Reproduction or failing test | Targeted test + full `pytest tests/` when touching shared code |
@@ -76,10 +76,10 @@ git checkout -b feat/issue-<number>-short-description
 ### 4. Install dependencies
 
 ```bash
-pip install -e .[dev]
+pip install -e ".[dev,all]"
 ```
 
-See [TESTING.md](docs/TESTING.md) for formatting, linting, and pytest usage.
+See [TESTING.md](docs/TESTING.md) for the bundle / framework / maintainer / example model and pytest usage.
 
 ### 5. Implement and verify
 
@@ -110,10 +110,27 @@ Follow the [Agent Code of Conduct](CODE_OF_CONDUCT.md): deterministic skill outp
 
 ### Tests and CI
 
-- Add or update tests when behavior changes.
-- **GitHub Actions** installs `pip install -e ".[dev,all]"`, runs `python -m black --check .`, then `flake8 .`, then **`pytest tests/`** only. Do not add per-skill pip lines or test paths to `.github/workflows/ci.yml`.
-- Run `python -m black --check .`, `python -m flake8 .`, and `pytest tests/` locally before opening a PR (same scope as CI).
-- For skill work, also run `pytest skills/<category>/<skill_name>/test_skill.py` locally and install any packages from that skill's `manifest.yaml` `requirements`.
+- Add or update tests in the correct layer when behavior changes (see [TESTING.md](docs/TESTING.md)).
+- **Skill bundle test** — `skills/<category>/<name>/test_skill.py` (required for new skills; ships in the wheel; run locally before skill PRs).
+- **Framework test** — `tests/test_*.py` at the root of `tests/` (loader, CLI, issuer rules).
+- **Maintainer skill test** — optional `tests/skills/<category>/test_<name>.py` for extra loader or edge-case coverage.
+- **Usage examples** — `examples/*.py` are not tests and are not run in CI.
+- **GitHub Actions** installs `pip install -e ".[dev,all]"`, runs `python -m black --check .`, then `flake8 .`, then **`pytest tests/`** (framework + maintainer tests). Do not add per-skill pip lines or test paths to `.github/workflows/ci.yml`.
+- Run locally before opening a PR:
+
+  ```bash
+  python -m black --check .
+  python -m flake8 .
+  python -m pytest tests/
+  ```
+
+  For skill work, also run:
+
+  ```bash
+  python -m pytest skills/<category>/<skill_name>/test_skill.py
+  ```
+
+  Install packages from that skill's `manifest.yaml` `requirements` when they are not covered by `[all]`.
 - Wait for GitHub Actions CI to pass before requesting review.
 
 ### Pull request template
@@ -231,10 +248,13 @@ The primary guide for the host LLM.
 - Describes UI presentation (`name`, `description`, `icon`, `ui_schema`, and similar).
 - When present, include an `issuer` object that matches `manifest.yaml` (`name` and `email` at minimum; copy `github` and `org` when used).
 
-### 5. `test_skill.py` (validation)
+### 5. `test_skill.py` (bundle test)
 
-- Unit tests for schema compliance and deterministic execution paths.
+- **Required** for every new registry skill (template: `templates/python_skill/test_skill.py`).
+- Unit tests for schema compliance and deterministic execution paths (offline; mock externals).
+- Ships inside the skill bundle via `pip install skillware`.
 - Run: `pytest skills/<category>/<skill_name>/test_skill.py`
+- Optional extra depth for maintainers: `tests/skills/<category>/test_<skill_name>.py` — see [TESTING.md](docs/TESTING.md).
 
 ### Packaging (PyPI and `pip install`)
 

diff --git a/README.md b/README.md
@@ -55,15 +55,15 @@ documentation. Runnable provider scripts are indexed in
 ```text
 Skillware/
 ├── docs/                       # Introduction, testing, skill catalog, usage guides (docs/usage/)
-├── examples/                   # Provider reference scripts (Gemini, Claude, OpenAI, Ollama, ...)
+├── examples/                   # Provider reference scripts — usage demos, not pytest (see examples/README.md)
 ├── skills/                     # Skill Registry
 │   └── category/               # Domain boundaries (e.g., finance)
 │       └── skill_name/         # The Skill bundle
 │           ├── manifest.yaml   # Definition, schema, and constitution
 │           ├── skill.py        # Executable Python logic
 │           ├── instructions.md # Cognitive map for the LLM
 │           ├── card.json       # Optional UI presentation metadata
-│           └── test_skill.py   # Unit tests and schema validation
+│           └── test_skill.py   # Bundle test (required for new skills; see docs/TESTING.md)
 ├── skillware/                  # Core Framework Package
 │   ├── cli.py                  # Command-line interface
 │   └── core/
@@ -72,7 +72,9 @@ Skillware/
 │       └── loader.py           # Universal Skill Loader and Model Adapter
 ├── templates/                  # Boilerplate templates for new skills
 │   └── python_skill/           # Standard template with required files
-└── tests/                      # Automated test suite
+└── tests/                      # Clone-repo tests (framework + optional maintainer skill tests)
+    ├── test_*.py               # Framework tests (loader, CLI, issuer, …)
+    └── skills/                 # Optional maintainer skill tests (edge cases)
 ```
 
 ## Quick Start

diff --git a/docs/TESTING.md b/docs/TESTING.md
@@ -2,9 +2,11 @@
 
 Skillware maintains high standards for code quality and reliability. Before submitting a Pull Request, please ensure your code passes all linting and testing checks.
 
+Tests fall into four layers: **bundle**, **framework**, **maintainer**, and **example**. Use that vocabulary consistently in docs and PRs.
+
 ## Quick Setup
 
-Install framework tests, lint tools, and optional SDK extras in one go (matches GitHub Actions CI):
+Install lint tools, pytest, and optional skill runtime deps in one go (matches GitHub Actions CI):
 
 ```bash
 pip install -e ".[dev,all]"
@@ -16,17 +18,66 @@ Or use the dev pointer file:
 pip install -r requirements.txt
 ```
 
+## Four test layers
+
+| Layer | Location | Shipped in pip wheel? | CI on PR? |
+| :--- | :--- | :---: | :---: |
+| **Skill bundle test** | `skills/<category>/<skill_name>/test_skill.py` | Yes | No — run locally for skill PRs |
+| **Framework test** | `tests/test_*.py` (not under `tests/skills/`) | No (clone only) | Yes |
+| **Maintainer skill test** | `tests/skills/<category>/test_<name>.py` | No (clone only) | Yes when present |
+| **Usage example** | `examples/*.py` | No | No — not pytest |
+
+### Skill bundle test
+
+- Lives **inside the skill bundle**; ships with `pip install skillware`.
+- **Required** for every new registry skill (see `templates/python_skill/test_skill.py`).
+- Offline and mockable: manifest consistency, validation, deterministic `execute()` paths — no live network.
+- Run locally: `pytest skills/<category>/<skill_name>/test_skill.py` or `pytest skills/`.
+- Install packages from the skill's `manifest.yaml` `requirements` when they are not already satisfied by `[all]`.
+
+### Framework test
+
+- Core engine health: loader, CLI, issuer rules, version policy.
+- Lives at the **root of `tests/`** only (`tests/test_loader.py`, `tests/test_cli.py`, …).
+- Clone-repo only; runs in CI via `pytest tests/` together with maintainer tests below.
+
+### Maintainer skill test
+
+- **Optional** extra depth for skill maintainers: loader wiring, heavy mocks, edge cases.
+- Not required for every skill; when present, runs in CI as part of `pytest tests/`.
+- Example: `tests/skills/compliance/test_tos_evaluator.py`.
+
+### Usage example
+
+- Runnable provider demos under `examples/` — **not tests**.
+- Never collected by pytest; never run in CI. May need real API keys.
+- See [examples/README.md](../examples/README.md).
+
+## Which tests go where?
+
+| You are testing… | Put it here | Example in this repo |
+| :--- | :--- | :--- |
+| Manifest + execute contract for one skill | Bundle test | `skills/compliance/tos_evaluator/test_skill.py` |
+| Loader path + mocked externals (optional depth) | Maintainer test | `tests/skills/compliance/test_tos_evaluator.py` |
+| Loader, CLI, registry issuer rules | Framework test | `tests/test_loader.py`, `tests/test_skill_issuer.py` |
+| End-to-end provider demo script | Usage example | `examples/gemini_tos_evaluator.py` |
+
+**Rule of thumb:** if it ships with the skill and must pass before merge → **bundle test** (run locally). If it is extra regression depth for clone-repo work → **maintainer test** (optional). If it proves provider integration → **example**, not pytest.
+
 ## 1. Code Formatting (Black)
 
 We use **Black** as our uncompromising code formatter. It ensures that all code looks the same, regardless of who wrote it, eliminating discussions about style.
 
 ### Installation
+
 ```bash
 pip install black
 ```
 
 ### Usage
+
 Run Black on the entire repository to automatically fix formatting issues:
+
 ```bash
 python -m black .
 ```
@@ -38,12 +89,15 @@ Run `python -m black --check .` to verify formatting without writing files. GitH
 We use **Flake8** to catch logic errors, unused imports, and other code quality issues that Black does not handle.
 
 ### Installation
+
 ```bash
 pip install flake8
 ```
 
 ### Usage
+
 Run Flake8 from the root of the repository:
+
 ```bash
 python -m flake8 .
 ```
@@ -52,44 +106,63 @@ python -m flake8 .
 
 ## 3. Unit Tests (Pytest)
 
-We use **pytest** for unit testing. All new features and bug fixes must be accompanied by relevant tests.
+We use **pytest** for automated tests. All new features and bug fixes must be accompanied by relevant tests in the correct layer (see above).
 
 ### Installation
+
 ```bash
 pip install pytest
 ```
 
-### Usage
+### CI (GitHub Actions)
 
-**CI and framework tests** — GitHub Actions runs only the `tests/` tree:
+GitHub Actions installs `pip install -e ".[dev,all]"`, then runs:
 
 ```bash
+python -m black --check .
+python -m flake8 .
 python -m pytest tests/
 ```
 
-This covers the loader, CLI, issuer rules, and integration tests under `tests/skills/`. New skill PRs do not need edits to `.github/workflows/ci.yml` when they add co-located skill tests.
+That covers **framework tests** and **maintainer skill tests** under `tests/`. It does not run `examples/` or skill bundle tests. Do not add per-skill pip lines or test paths to `.github/workflows/ci.yml`.
+
+Skill deps belong in each skill's `manifest.yaml` `requirements`; mirror them in `pyproject.toml` optional extras when contributors need a one-shot install via `[all]`.
 
-### Testing individual skills (local / pre-PR)
+### Local commands
 
-Every skill ships with a `test_skill.py` boilerplate. Run it **locally** before opening a skill PR (not in CI):
+Match CI, and run bundle tests when you touch skills:
+
+```bash
+python -m pytest tests/
+python -m pytest skills/
+```
+
+Single skill bundle test:
 
 ```bash
 python -m pytest skills/<category>/<skill_name>/test_skill.py
 ```
 
-Install any packages listed in the skill's `manifest.yaml` `requirements` before running co-located tests (for example `pip install web3` for DeFi skills). Skill deps are declared in the manifest, not added to CI per skill.
+Optional maintainer depth only:
+
+```bash
+python -m pytest tests/skills/<category>/test_<skill_name>.py
+```
+
+When run without path arguments, pytest collects from `tests/` and `skills/` only; `examples/` is always ignored. See `[tool.pytest.ini_options]` in `pyproject.toml`.
+
+### Writing tests
 
-### Writing Tests
-- **Framework tests**: Place core and cross-skill integration tests in `tests/` (including `tests/skills/` when appropriate).
-- **Skill bundle tests**: Place skill-specific logic in `skills/<category>/<name>/test_skill.py` and run locally.
-- Use `conftest.py` for shared fixtures (e.g., mocking LLM clients).
+- **Bundle test:** `skills/<category>/<name>/test_skill.py` — required for new skills; copy from `templates/python_skill/test_skill.py`.
+- **Maintainer test:** `tests/skills/<category>/test_<name>.py` — optional; use shared fixtures in `tests/conftest.py` when helpful.
+- **Framework test:** `tests/test_*.py` at the root of `tests/` — for loader, CLI, issuer, and cross-cutting rules.
 
 ## Pre-Commit Checklist
 
-Before pushing your code, run the following commands to ensure your changes are ready for review:
+Before pushing your code, run the following commands:
 
-1. `skillware list` (Verify install and path resolution are working)
-2. `python -m black --check .` (Verify formatting; use `python -m black .` to fix)
-3. `python -m flake8 .` (Check quality)
-4. `python -m pytest tests/` (Verify framework functionality — same scope as CI)
-5. `python -m pytest skills/<category>/<skill_name>/test_skill.py` when your PR touches that skill (local only)
+1. `skillware list` (verify install and path resolution)
+2. `python -m black --check .` (verify formatting; use `python -m black .` to fix)
+3. `python -m flake8 .` (check quality)
+4. `python -m pytest tests/` (framework + maintainer tests — same scope as CI)
+5. `python -m pytest skills/<category>/<skill_name>/test_skill.py` when your PR adds or changes a skill bundle test (or `pytest skills/` for broad skill changes)
diff --git a/docs/contributing/ai_native_workflow.md b/docs/contributing/ai_native_workflow.md
@@ -51,7 +51,7 @@ git remote add upstream https://github.com/ARPAHLS/skillware.git
 git fetch upstream
 git checkout main
 git pull upstream main
-pip install -e .[dev]
+pip install -e ".[dev,all]"
 git checkout -b feat/issue-<number>-short-description
 ```
 
@@ -160,7 +160,7 @@ Run a **pre-PR audit** on yourself:
 1. Map every acceptance criterion in the issue to a file or test in your diff.
 2. Complete the [verification checklist](#verification-checklists-by-contribution-type) for your contribution type.
 3. If the change is user-visible, confirm [CHANGELOG.md](../../CHANGELOG.md) has entries under `[Unreleased]` (same rule as [CONTRIBUTING.md](../../CONTRIBUTING.md)).
-4. Run `flake8` and `pytest`; report actual command output to your operator—do not claim success without evidence.
+4. Run `flake8` and `pytest tests/`; for skill work also run the relevant `pytest skills/.../test_skill.py`. Report actual command output to your operator—do not claim success without evidence.
 5. Draft PR template answers: check only boxes that apply; fill the skill section only if `skills/` changed.
 
 If anything fails, return to Stage 4, fix, and audit again.
@@ -260,7 +260,7 @@ Complete the checklist that matches your issue during Stage 5.
 - [ ] `skill.py`: deterministic, JSON-serializable returns, safe error handling
 - [ ] `instructions.md`: when to use, how to interpret output, limitations
 - [ ] `card.json`: `issuer` matches manifest
-- [ ] `test_skill.py` passes
+- [ ] `test_skill.py` (bundle test) passes — `pytest skills/<category>/<skill_name>/test_skill.py`
 - [ ] `docs/skills/<skill_name>.md` and catalog row in `docs/skills/README.md`
 - [ ] **Usage Examples** on the catalog page (all five providers per [skill usage template](../usage/skill_usage_template.md)); link to `docs/usage/` and list skill `env_vars` without duplicating [api_keys.md](../usage/api_keys.md)
 - [ ] `pytest tests/test_skill_issuer.py` passes

diff --git a/examples/README.md b/examples/README.md
@@ -1,5 +1,7 @@
 # Skillware Examples Index
 
+> **These are usage examples, not tests.** Runnable provider demos live here; automated tests live in `skills/**/test_skill.py` (bundle) and `tests/` (framework and optional maintainer depth). See [TESTING.md](../docs/TESTING.md).
+
 Runnable examples in this directory show how to load Skillware skills, adapt
 them for a provider, execute local skill logic, and return tool results to an
 agent loop. Provider setup details live in the usage guides:

diff --git a/pyproject.toml b/pyproject.toml
@@ -70,6 +70,10 @@ include = ["skillware*", "skills*"]
 [tool.setuptools.package-data]
 skills = ["**/*"]
 
+[tool.pytest.ini_options]
+testpaths = ["tests", "skills"]
+addopts = "--ignore=examples"
+
 [tool.black]
 line-length = 88
 target-version = ["py310"]
diff --git a/templates/python_skill/README.md b/templates/python_skill/README.md
@@ -10,7 +10,7 @@ Starter bundle under `skills/<category>/<skill_name>/`. Copy this template from
 4. **`skill.py`**: Implement deterministic logic; no LLM-generated code in the skill body.
 5. **`instructions.md`**: Tell the agent when and how to use the tool.
 6. **`card.json`**: Mirror `issuer` from the manifest; customize UI fields.
-7. **`test_skill.py`**: Add tests; run `pytest skills/<category>/<skill_name>/test_skill.py`.
+7. **`test_skill.py`**: Bundle test (required); offline, mock externals; run `pytest skills/<category>/<skill_name>/test_skill.py`. See [TESTING.md](../../docs/TESTING.md).
 8. **`docs/skills/<skill_name>.md`**: Catalog page with **ID**, **Issuer**, and **Usage Examples** (all providers; see `docs/usage/skill_usage_template.md`).
 9. **`docs/skills/README.md`**: Add a row to the skill library table.