📊 Executive Summary
The gh-aw-firewall repository is already a highly mature agentic automation environment — with 27 agentic workflows and 17 traditional CI workflows (44 total). It implements many Pelis Agent Factory patterns well, particularly in security monitoring, token analytics, smoke testing, and documentation maintenance. The primary gaps are a missing Issue Triage Agent, a Firewall Escape Test Agent referenced in code but absent, a Workflow Health Manager to address growing workflow failures/no-ops, and a handful of continuous code-quality agents.
🎓 Patterns Learned from Pelis Agent Factory
Key Patterns Observed
| Pattern | Description | Applied in this Repo? |
| --- | --- | --- |
| Triage Agent | Auto-label/comment on new issues | ❌ Missing |
| CI Doctor | Investigate failed CI, create actionable issues | ✅ ci-doctor.md |
| Secret Scanner | Daily/scheduled scan for exposed credentials | ✅ secret-digger-* (3 engines) |
| Firewall Validator | Test network security rules | ⚠️ security-review.md references an escape test agent that doesn't exist |
| Static Analysis Report | Daily zizmor/poutine/actionlint discussions | ❌ Missing |
| Malicious Code Scan | Scan recent commits for suspicious patterns | ❌ Missing |
| Issue Monster | Dispatch issues to coding agents | ✅ issue-monster.md |
| Breaking Change Checker | Flag backward-incompatible changes | ❌ Missing |
| Code Simplifier | Continuous code quality improvement via daily PRs | ❌ Missing |
| Workflow Health Manager | Meta-agent monitoring all other agents | ❌ Missing |
| Token Analytics | Track cost and optimize LLM usage | ✅ 4-workflow ecosystem |
| Cross-repo Dispatcher | Route issues between repos | ✅ firewall-issue-dispatcher.md |
| Pre-fetched Steps | Pre-compute context before agent runs | ✅ security-guard, smoke-chroot |
| skip-if-match | Prevent duplicate runs | ✅ used in multiple workflows |
| Cache Memory | Persistent state across runs | ✅ issue-duplication-detector, ci-doctor |
| Release Note Automation | Auto-enrich release notes | ✅ update-release-notes.md |
How This Repo Compares
Relative to Pelis patterns, this repository exceeds most repositories in security automation (3 secret diggers, daily security review, dependency monitoring, security guard on PRs) and has strong foundations for token analytics, smoke testing, and cross-repo coordination. The main gaps align with the triage/issue management and continuous code quality categories.
📋 Current Agentic Workflow Inventory
build-test, ci-cd-gaps-assessment, ci-doctor, claude-token-optimizer, claude-token-usage-analyzer, cli-flag-consistency-checker, copilot-token-optimizer, copilot-token-usage-analyzer, dependency-security-monitor, doc-maintainer, firewall-issue-dispatcher, issue-duplication-detector, issue-monster, pelis-agent-factory-advisor, plan, secret-digger-claude, secret-digger-codex, secret-digger-copilot, security-guard, security-review, smoke-chroot, smoke-claude, smoke-codex, smoke-copilot, smoke-services, test-coverage-improver, update-release-notes
🚀 Actionable Recommendations
P0 — Implement Immediately
P0.1: Issue Triage Agent
What: Auto-label new issues and leave a comment explaining the label decision and initial guidance.
Why: The repo has 100+ open issues but no automatic labeling. The issue-monster dispatches issues but doesn't categorize them. New contributors have no automated feedback. This is the most common "hello world" workflow in Pelis and addresses an immediate gap.
How: Create .github/workflows/issue-triage.md triggered on issues: types: [opened, reopened].
Effort: Low (30 mins, standard template)
Example:
```markdown
---
timeout-minutes: 5
on:
  issues:
    types: [opened, reopened]
permissions:
  issues: read
tools:
  github:
    toolsets: [issues, labels]
    min-integrity: none
safe-outputs:
  add-labels:
    allowed: [bug, feature, enhancement, documentation, question, help-wanted, security, performance, iptables, squid, docker, api-proxy]
  add-comment:
    max: 1
---

# Issue Triage Agent

Analyze issue #${{ github.event.issue.number }} in ${{ github.repository }}.
Research the codebase to understand the context, apply one or more appropriate labels,
and comment explaining the classification and how the issue relates to AWF's architecture
(Squid proxy, agent container, iptables, API proxy sidecar, or CLI).
```
P0.2: Firewall Escape Test Agent
What: A dedicated agentic workflow that attempts to escape the AWF firewall using known and novel techniques, reporting findings in a discussion.
Why: The security-review.md already references "Firewall Escape Test Agent" as a workflow to read — yet this workflow doesn't exist. For a firewall tool, this is a critical validation gap. The secret-digger workflows do related work inside the agent, but there's no dedicated workflow that systematically tests firewall escapes from the outside.
How: Create .github/workflows/firewall-escape-test.md that runs daily. It should test: DNS tunneling attempts, iptables bypass attempts, proxy circumvention, SSRF via allowed domains, and IPv6 gaps. It should use AWF to run itself (dogfooding).
Effort: Medium (2-3 hours, domain-specific security knowledge needed)
Example:
```markdown
---
name: Firewall Escape Test Agent
description: Systematically tests AWF firewall for escape vulnerabilities
on:
  schedule: daily
  workflow_dispatch:
permissions:
  contents: read
  actions: read
  discussions: read
tools:
  bash: true
  agentic-workflows:
safe-outputs:
  create-discussion:
    title-prefix: "[Firewall Escape Test] "
    category: "general"
timeout-minutes: 30
---

# Firewall Escape Test Agent

Test the AWF firewall for known and novel escape techniques.
Run tests using `awf --build-local` against a controlled target...
```
P1 — Plan for Near-Term
P1.1: Workflow Health Manager (Meta-Agent)
What: A meta-agent that monitors the health of ALL agentic workflows, checks for patterns of failure/no-op, and creates issues or PRs to fix them.
Why: Currently, multiple workflows are silently failing or producing no-ops (issues #1673, #1668, #1630, #1669, #1674). There's no automated oversight of the workflow collection itself. The Pelis factory uses this to achieve self-healing — it contributed 40 issues and 34 PRs in the gh-aw repo.
How: Adapt workflow-health-manager.md from githubnext/agentics for this repo. Schedule it daily, check the last 7 days of runs for each agentic workflow, and identify failure patterns.
Effort: Medium
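Following the frontmatter conventions of the P0 examples above, such a meta-agent could be sketched roughly like this (the `create-issue` safe-output and the `actions`/`issues` toolset names are assumptions modeled on the fields shown earlier, not verified gh-aw syntax):

```yaml
---
name: Workflow Health Manager
description: Meta-agent that audits recent agentic workflow runs for failures and no-ops
on:
  schedule: daily
  workflow_dispatch:
permissions:
  contents: read
  actions: read
  issues: read
tools:
  github:
    toolsets: [actions, issues]   # assumed toolset names
safe-outputs:
  create-issue:                   # assumed field, modeled on create-discussion above
    title-prefix: "[Workflow Health] "
    max: 3
timeout-minutes: 20
---
```

The prompt body would then ask the agent to list the last 7 days of runs per workflow and open at most a few issues for recurring failure patterns.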
P1.2: Static Analysis Report
What: Daily discussion with findings from zizmor, poutine, and actionlint applied to the .lock.yml files.
Why: The repo already compiles agentic workflows to lock files — running static analysis on those lock files daily would catch security regressions (zizmor), supply chain risks (poutine), and syntax issues (actionlint) proactively. The Pelis factory created 57 such discussions.
How: Use the agenticworkflows-compile tool with --zizmor --poutine --actionlint in a scheduled workflow.
Effort: Low (the tools are already available in agenticworkflows-compile)
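A possible shape for this scheduled workflow, reusing the `create-discussion` safe-output shown in the P0.2 example (scheduling and timeout values are placeholders):

```yaml
---
name: Static Analysis Report
description: Daily zizmor/poutine/actionlint findings on the compiled .lock.yml files
on:
  schedule: daily
permissions:
  contents: read
  discussions: read
tools:
  bash: true
safe-outputs:
  create-discussion:
    title-prefix: "[Static Analysis] "
    category: "general"
timeout-minutes: 15
---

# Static Analysis Report

Run the agenticworkflows-compile tool with `--zizmor --poutine --actionlint`
and summarize any new findings in a discussion, grouped by severity.
```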
P1.3: Daily Malicious Code Scan
What: Daily scan of recent code changes for suspicious patterns (backdoors, credential exfiltration, obfuscated code, supply chain attacks).
Why: AWF has containers with entrypoints and iptables rules — a single malicious change could silently weaken the firewall. This is especially important given the security-sensitive nature of the codebase.
How: Adapt from github/gh-aw's daily-malicious-code-scan.md. Focus on: container entrypoints, iptables rules, Squid config generation, and dependency injections.
Effort: Low
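A minimal frontmatter sketch for this scan, assuming the same conventions as the examples above (the `create-issue` safe-output and the `repos` toolset name are assumptions):

```yaml
---
name: Daily Malicious Code Scan
description: Scans recent commits for backdoors, exfiltration, and obfuscated code
on:
  schedule: daily
permissions:
  contents: read
tools:
  bash: true
  github:
    toolsets: [repos]            # assumed toolset name
safe-outputs:
  create-issue:                  # assumed safe-output field
    title-prefix: "[Malicious Code Scan] "
    max: 1
timeout-minutes: 20
---
```

The prompt would direct attention to the areas named above: container entrypoints, iptables rules, Squid config generation, and dependency changes.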
P1.4: Breaking Change Checker
What: A PR-triggered workflow that detects backward-incompatible changes to CLI flags, container APIs, or TypeScript public interfaces.
Why: AWF is used as a library and CLI by other workflows. Breaking changes to --allow-domains, container environment variables, or the Docker Compose API can silently break downstream consumers. Issues like #1578 (deprecate pkg binary) show this is a real concern.
How: Trigger on PRs, analyze diff for: removed/renamed CLI flags, changed environment variable names, modified container image APIs, and breaking TypeScript type changes.
Effort: Medium
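As a PR-triggered workflow, this could look roughly as follows (the `pull_request` trigger shape and `pull_requests` toolset name are assumptions; the `add-comment` safe-output mirrors the P0.1 example):

```yaml
---
name: Breaking Change Checker
on:
  pull_request:
    types: [opened, synchronize]   # assumed trigger shape
permissions:
  contents: read
  pull-requests: read
tools:
  github:
    toolsets: [pull_requests]      # assumed toolset name
safe-outputs:
  add-comment:
    max: 1
timeout-minutes: 10
---

# Breaking Change Checker

Diff this PR against the base branch and flag removed/renamed CLI flags,
changed environment variable names, modified container image APIs,
and breaking TypeScript type changes.
```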
P1.5: Code Simplifier (Continuous Code Quality)
What: Daily agent that analyzes recently-modified code and proposes simplifications via PR.
Why: The codebase has grown complex (e.g., src/docker-manager.ts, containers/agent/entrypoint.sh). In the Pelis factory, the Code Simplifier had an 83% PR merge rate. For a security tool, keeping code simple is also a security property — complex code is harder to audit.
How: Adapt github/gh-aw's code-simplifier.md. Focus on TypeScript files, with extra attention to security-critical paths. Use skip-if-match to avoid duplicate PRs.
Effort: Low (standard template, good reference implementation exists)
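A rough frontmatter sketch; note that the `create-pull-request` safe-output and the placement of the `skip-if-match` guard are assumptions here, so the reference implementation in github/gh-aw should be consulted for the actual field names:

```yaml
---
name: Code Simplifier
on:
  schedule: daily
permissions:
  contents: read
tools:
  bash: true
safe-outputs:
  create-pull-request:            # assumed safe-output field
    title-prefix: "[Simplify] "
    skip-if-match: "[Simplify]"   # assumed placement of the skip-if-match guard
timeout-minutes: 30
---
```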
P2 — Consider for Roadmap
P2.1: Schema Consistency Checker
What: Weekly detection of drift between TypeScript interfaces (types.ts), JSON schemas, test fixtures, and documentation.
Why: AWF has src/types.ts with WrapperConfig and other interfaces that need to stay in sync with docs/environment.md, CLI flags in src/cli.ts, and test fixtures in tests/fixtures/. The Pelis factory's Schema Consistency Checker created 55 analysis discussions.
Effort: Medium
P2.2: Changeset / Automated Version Bumper
What: When PRs are merged, automatically determine version bump (patch/minor/major) based on commit messages and create a changeset PR.
Why: update-release-notes.md enriches existing release notes but doesn't automate the version bumping process. The Pelis factory's Changeset workflow achieved 78% PR merge rate (22/28).
Effort: Medium
P2.3: Weekly Issue Summary
What: Weekly digest categorizing all open issues by component (Squid, iptables, agent-container, API proxy, CLI) with aging analysis.
Why: With 100+ open issues across multiple components, maintainers lack a quick way to see the current state. A weekly digest keeps the team informed and surfaces aging issues.
Effort: Low
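A sketch of the digest workflow, assuming gh-aw accepts a weekly schedule value analogous to the daily schedules shown above:

```yaml
---
name: Weekly Issue Summary
on:
  schedule: weekly               # assumed value; the earlier examples only show "daily"
permissions:
  issues: read
  discussions: read
tools:
  github:
    toolsets: [issues]           # assumed toolset name
safe-outputs:
  create-discussion:
    title-prefix: "[Weekly Issue Summary] "
    category: "general"
timeout-minutes: 15
---
```

The prompt body would group open issues by component (Squid, iptables, agent-container, API proxy, CLI) and highlight issues older than a chosen aging threshold.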
P2.4: PR Nitpick Reviewer
What: Lightweight code review agent that checks PRs for common AWF-specific issues (missing tests for security paths, incomplete documentation updates, pattern violations).
Why: The security-guard already covers security-weakening changes; a complementary nitpick reviewer would catch quality issues. The Pelis factory's CLI Consistency Checker achieved 78% merge rate (80/102 PRs improved).
Effort: Low
P3 — Future Ideas
P3.1: Container Image Security Scanner
What: Weekly scan of the Docker images used (ubuntu/squid, ubuntu:22.04, node) for CVEs using Trivy or Grype, creating issues for HIGH/CRITICAL findings.
Why: The dependency-security-monitor covers npm dependencies but not the base container images. Container image vulnerabilities are equally important for a firewall tool.
Effort: Low (Trivy is already available in many GitHub Actions)
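One possible shape for this scanner, assuming the same frontmatter conventions as the examples above (the `create-issue` safe-output and the weekly schedule value are assumptions; the Trivy invocation in the body is standard Trivy CLI usage):

```yaml
---
name: Container Image Security Scanner
on:
  schedule: weekly               # assumed scheduling value
permissions:
  contents: read
tools:
  bash: true
safe-outputs:
  create-issue:                  # assumed safe-output field
    title-prefix: "[Image CVE] "
    max: 3
timeout-minutes: 20
---

# Container Image Security Scanner

For each base image in use (ubuntu/squid, ubuntu:22.04, node), run
`trivy image --severity HIGH,CRITICAL <image>` and open one issue per
image that has new HIGH/CRITICAL findings.
```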
P3.2: Performance Regression Detector
What: Track AWF startup time and memory usage across releases, alert when metrics degrade.
Why: performance-monitor.yml exists as a traditional CI workflow but isn't agentic. An agentic version could provide richer analysis and historical context.
Effort: Low
P3.3: Contribution Guidelines Checker
What: On PR opened, verify the PR follows contribution guidelines (tests added, docs updated, commit format).
Why: Helps new contributors submit better PRs and reduces maintainer review burden.
Effort: Low
📈 Maturity Assessment
Target Level: 5/5 — achievable with the P0/P1 items above
🔄 Comparison with Best Practices
What This Repository Does Well
- Pre-computed context — security-guard pre-fetches PR diffs before the agent runs, reducing token waste
- Cross-repo coordination — firewall-issue-dispatcher bridges gh-aw and gh-aw-firewall with a PAT
- Token analytics ecosystem — 4 interconnected workflows (analyzer → optimizer) for both Claude and Copilot
- Smoke testing all engines — validates the product itself as an agentic runtime on 4 engine variants
- Cache-memory patterns — properly used for stateful workflows (issue dedup, CI doctor)
- Domain-specific CI doctor — customized with AWF-specific failure patterns (Pool overlaps, awf-net orphans)
What Could Improve
- Workflow Health — several workflows are currently failing or producing no-ops (5+ known issues), suggesting the collection needs a meta-monitoring layer
- Issue Governance — despite having issue-monster for dispatch, there's no entry-point triage to label and categorize issues, making the backlog harder to navigate
- Continuous Code Quality — no agents propose daily code improvements (simplification, refactoring, dead code removal)
- Firewall Self-Validation — a tool that secures other agents should have dedicated escape-testing workflows
Unique Domain Opportunities
Given AWF is both a security tool and a runtime for AI agents, it has a unique opportunity to dogfood its own technology for security validation:
- Run escape tests through AWF to validate them
- Use AWF to sandbox the malicious code scan agent itself
- Run CVE impact analysis with the firewall restricting the analysis agent's network access
📝 Notes for Future Runs
Stored in /tmp/gh-aw/cache-memory/notes.txt for persistence.
Repository has 27 agentic workflows as of 2026-04-05