Skip to content

Latest commit

Β 

History

History
519 lines (395 loc) Β· 20.1 KB

File metadata and controls

519 lines (395 loc) Β· 20.1 KB
description Collects and reports on firewall log events to monitor network security and access patterns
true
schedule workflow_dispatch
cron
daily
permissions
contents actions issues pull-requests discussions security-events
read
read
read
read
read
read
tracker-id daily-firewall-report
timeout-minutes 45
safe-outputs
upload-artifact
default-retention-days max-retention-days
30
30
tools
agentic-workflows github bash edit
toolsets
all
*
imports
uses with
shared/daily-audit-discussion.md
title-prefix
[daily-firewall-report]
shared/reporting.md
shared/trending-charts-simple.md
shared/observability-otlp.md

{{#runtime-import? .github/shared-instructions.md}}

Daily Firewall Logs Collector and Reporter

Collect and analyze firewall logs from all agentic workflows that use the firewall feature.

πŸ“Š Trend Charts Requirement

IMPORTANT: Generate exactly 2 trend charts that showcase firewall activity patterns over time.

Chart Generation Process

Phase 1: Data Collection

Collect data for the past 30 days (or available data) from firewall audit logs:

  1. Firewall Request Data:

    • Count of allowed requests per day
    • Count of blocked requests per day
    • Total requests per day
  2. Top Blocked Domains Data:

    • Frequency of top 10 blocked domains over the period
    • Trends in blocking patterns by domain category

Phase 2: Data Preparation

  1. Create CSV files in /tmp/gh-aw/python/data/ with the collected data:

    • firewall_requests.csv - Daily allowed/blocked request counts
    • blocked_domains.csv - Top blocked domains with frequencies
  2. Each CSV should have a date column and metric columns with appropriate headers

Phase 3: Chart Generation

Generate exactly 2 high-quality trend charts:

Chart 1: Firewall Request Trends

  • Stacked area chart or multi-line chart showing:
    • Allowed requests (area/line, green)
    • Blocked requests (area/line, red)
    • Total requests trend line
  • X-axis: Date (last 30 days)
  • Y-axis: Request count
  • Save as: /tmp/gh-aw/python/charts/firewall_requests_trends.png

Chart 2: Top Blocked Domains Frequency

  • Horizontal bar chart showing:
    • Top 10-15 most frequently blocked domains
    • Total block count for each domain
    • Color-coded by domain category if applicable
  • X-axis: Block count
  • Y-axis: Domain names
  • Save as: /tmp/gh-aw/python/charts/blocked_domains_frequency.png

Chart Quality Requirements:

  • DPI: 300 minimum
  • Figure size: 12x7 inches for better readability
  • Use seaborn styling with a professional color palette
  • Include grid lines for easier reading
  • Clear, large labels and legend
  • Title with context (e.g., "Firewall Activity - Last 30 Days")
  • Annotations for significant spikes or patterns

Phase 4: Upload Charts

  1. Stage both charts into the upload directory:
    cp /tmp/gh-aw/python/charts/firewall_trends.png /tmp/gh-aw/safeoutputs/upload-artifacts/
    cp /tmp/gh-aw/python/charts/blocked_domains.png /tmp/gh-aw/safeoutputs/upload-artifacts/
  2. Call the upload_artifact safe-output tool for each chart with retention_days: 30
  3. Record the returned aw_* IDs

Phase 5: Embed Charts in Discussion

Include the charts in your firewall report with this structure:

### πŸ“ˆ Firewall Activity Trends

### Request Patterns

πŸ“Ž **Chart: Firewall Request Trends** β€” artifact `<aw_ID_1>` available in the [workflow run artifacts](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})

[Brief 2-3 sentence analysis of firewall activity trends, noting increases in blocked traffic or changes in patterns]

### Top Blocked Domains

πŸ“Ž **Chart: Blocked Domains Frequency** β€” artifact `<aw_ID_2>` available in the [workflow run artifacts](https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }})

[Brief 2-3 sentence analysis of frequently blocked domains, identifying potential security concerns or overly restrictive rules]

Python Implementation Notes

  • Use pandas for data manipulation and date handling
  • Use matplotlib.pyplot and seaborn for visualization
  • Set appropriate date formatters for x-axis labels
  • Use plt.xticks(rotation=45) for readable date labels
  • Apply plt.tight_layout() before saving
  • Handle cases where data might be sparse or missing

Error Handling

If insufficient data is available (less than 7 days):

  • Generate the charts with available data
  • Add a note in the analysis mentioning the limited data range
  • Consider using a bar chart instead of line chart for very sparse data


Objective

Generate a comprehensive daily report of all rejected domains across all agentic workflows that use the firewall feature. This helps identify:

  • Which domains are being blocked
  • Patterns in blocked traffic
  • Potential issues with network permissions
  • Security insights from blocked requests

Instructions

MCP Servers are Pre-loaded

IMPORTANT: The MCP servers configured in this workflow (including gh-aw with tools like logs and audit) are automatically loaded and available at agent startup. You do NOT need to:

  • Use the inspector tool to discover MCP servers
  • Run any external tools to check available MCP servers
  • Verify or list MCP servers before using them

Simply call the MCP tools directly as described in the steps below. If you want to know what tools are available, you can list them using your built-in tool listing capability.

Step 0: Fresh Analysis - No Caching

ALWAYS PERFORM FRESH ANALYSIS: This report must always use fresh data from the audit tool.

DO NOT:

  • Skip analysis based on cached results
  • Reuse aggregated statistics from previous runs
  • Check for or use any cached run IDs, counts, or domain lists

ALWAYS:

  • Collect all workflow runs fresh using the logs tool
  • Fetch complete firewall data from the audit tool for each run
  • Compute all statistics fresh (blocked counts, allowed counts, domain lists)

This ensures accurate, up-to-date reporting for every run of this workflow.

Step 1: Collect Recent Firewall-Enabled Workflow Runs

Use the logs tool from the agentic-workflows MCP server to efficiently collect workflow runs that have firewall enabled (see workflow_runs_analyzed in scratchpad/metrics-glossary.md - Scope: Last 7 days):

Using the logs tool: Call the logs tool with the following parameters:

  • firewall: true (boolean - to filter only runs with firewall enabled)
  • start_date: "-7d" (to get runs from the past 7 days)
  • count: 100 (to get up to 100 matching runs)

The tool will:

  1. Filter runs based on the steps.firewall field in aw_info.json (e.g., "squid" when enabled)
  2. Return only runs where firewall was enabled
  3. Limit to runs from the past 7 days
  4. Return up to 100 matching runs

Tool call example:

{
  "firewall": true,
  "start_date": "-7d",
  "count": 100
}

Step 1.5: Early Exit if No Data

IMPORTANT: If Step 1 returns zero workflow runs (no firewall-enabled workflows ran in the past 7 days):

  1. Do NOT create a discussion or report
  2. Exit early with a brief log message: "No firewall-enabled workflow runs found in the past 7 days. Exiting without creating a report."
  3. Stop processing - do not proceed to Step 2 or any subsequent steps

This prevents creating empty or meaningless reports when there's no data to analyze.

Step 2: Analyze Firewall Logs from Collected Runs

For each run collected in Step 1:

  1. Use the audit tool from the agentic-workflows MCP server to get detailed firewall information
  2. Store the run ID, workflow name, and timestamp for tracking

Using the audit tool: Call the audit tool with the run_id parameter for each run from Step 1.

Tool call example:

{
  "run_id": 12345678
}

The audit tool returns structured firewall analysis data including:

  • Total requests, allowed requests, blocked requests
  • Lists of allowed and blocked domains
  • Request statistics per domain
  • Policy rule attribution (when policy-manifest.json and audit.jsonl artifacts are present)

Example of extracting firewall data from audit result:

// From the audit tool result, access:
result.firewall_analysis.blocked_domains  // Array of blocked domain names
result.firewall_analysis.allowed_domains  // Array of allowed domain names
result.firewall_analysis.total_requests   // Total number of network requests
result.firewall_analysis.blocked_requests  // Number of blocked requests

// Policy rule attribution (enriched data β€” may be null if artifacts are absent):
result.policy_analysis.policy_summary     // e.g., "12 rules, SSL Bump disabled, DLP disabled"
result.policy_analysis.rule_hits          // Array of {rule: {id, action, description, ...}, hits: N}
result.policy_analysis.denied_requests    // Array of {ts, host, status, rule_id, action, reason}
result.policy_analysis.total_requests     // Total enriched request count
result.policy_analysis.allowed_count      // Allowed requests (rule-attributed)
result.policy_analysis.denied_count       // Denied requests (rule-attributed)
result.policy_analysis.unique_domains     // Unique domain count

Important: Do NOT manually download and parse firewall log files. Always use the audit tool which provides structured firewall analysis data.

Step 3: Parse and Analyze Firewall Logs

Use the JSON output from the audit tool to extract firewall information.

Basic firewall analysis β€” The firewall_analysis field in the audit JSON contains:

  • total_requests - Total number of network requests
  • allowed_requests - Count of allowed requests
  • blocked_requests - Count of blocked requests
  • allowed_domains - Array of unique allowed domains
  • blocked_domains - Array of unique blocked domains
  • requests_by_domain - Object mapping domains to request statistics (allowed/blocked counts)

Policy rule attribution β€” The policy_analysis field (when present) contains enriched data that attributes each request to a specific firewall policy rule:

  • policy_summary - Human-readable summary (e.g., "12 rules, SSL Bump disabled, DLP disabled")
  • rule_hits - Array of objects: {rule: {id, order, action, aclName, protocol, domains, description}, hits: N} β€” how many requests each rule handled
  • denied_requests - Array of objects: {ts, host, status, rule_id, action, reason} β€” every denied request with its matching rule and reason
  • total_requests - Total enriched request count
  • allowed_count - Allowed requests count (rule-attributed)
  • denied_count - Denied requests count (rule-attributed)
  • unique_domains - Unique domain count

Note: policy_analysis is only present when the workflow run produced policy-manifest.json and audit.jsonl artifacts. If absent, fall back to firewall_analysis for basic domain-count data.

Example jq filter for aggregating blocked domains:

# Get only blocked domains across multiple runs
gh aw audit <run-id> --json | jq -r '.firewall_analysis.blocked_domains[]? // empty'

# Get blocked domain statistics with counts
gh aw audit <run-id> --json | jq -r '
  .firewall_analysis.requests_by_domain // {} | 
  to_entries[] | 
  select(.value.blocked > 0) | 
  "\(.key): \(.value.blocked) blocked, \(.value.allowed) allowed"
'

# Get policy rule hit counts (when policy_analysis is available)
gh aw audit <run-id> --json | jq -r '
  .policy_analysis.rule_hits // [] |
  .[] | select(.hits > 0) |
  "\(.rule.id) (\(.rule.action)): \(.hits) hits"
'

# Get denied requests with rule attribution
gh aw audit <run-id> --json | jq -r '
  .policy_analysis.denied_requests // [] |
  .[] | "\(.host) β†’ \(.rule_id): \(.reason)"
'

For each workflow run with firewall data (see standardized metric names in scratchpad/metrics-glossary.md):

  1. Extract the firewall analysis from the audit JSON output
  2. Track the following metrics per workflow:
    • Total requests (firewall_requests_total)
    • Allowed requests count (firewall_requests_allowed)
    • Blocked requests count (firewall_requests_blocked)
    • List of unique blocked domains (firewall_domains_blocked)
    • Domain-level statistics (from requests_by_domain)
  3. If policy_analysis is present, also track:
    • Policy rule hit counts (which rules are handling traffic)
    • Denied requests with rule attribution (which rule denied each request and why)
    • Policy summary (rules count, SSL Bump/DLP status)

Step 4: Aggregate Results

Combine data from all workflows (using standardized metric names):

  1. Create a master list of all blocked domains across all workflows
  2. Track how many times each domain was blocked
  3. Track which workflows blocked which domains
  4. Calculate overall statistics:
    • Total workflows analyzed (workflow_runs_analyzed - Scope: Last 7 days)
    • Total runs analyzed
    • Total blocked domains (firewall_domains_blocked) - unique count
    • Total blocked requests (firewall_requests_blocked)

Policy rule attribution aggregation (when policy_analysis data is available): 5. Aggregate policy rule hit counts across all runs:

  • Build a cross-run rule hit table: rule ID β†’ total hits across all runs
  • Identify the most active allow rules and deny rules
  1. Aggregate denied requests with rule attribution:
    • Collect all denied requests across runs with their matching rule and reason
    • Group by rule ID to show which deny rules are doing the most work
    • Group by domain to show which domains trigger which deny rules
  2. Track policy configuration across runs:
    • Note any runs with SSL Bump or DLP enabled
    • Note any differences in policy rule counts between runs

Step 5: Generate Report

Create a comprehensive markdown report following the formatting guidelines above. Structure your report as follows:

Section 1: Executive Summary (Always Visible)

A brief 1-2 paragraph overview including:

  • Date of report (today's date)
  • Total workflows analyzed (workflow_runs_analyzed)
  • Total runs analyzed
  • Overall firewall activity snapshot (key highlights, trends, concerns)

Section 2: Key Metrics (Always Visible)

Present the core statistics:

  • Total network requests monitored (firewall_requests_total)
    • βœ… Allowed (firewall_requests_allowed): Count of successful requests
    • 🚫 Blocked (firewall_requests_blocked): Count of blocked requests
  • Block rate: Percentage of blocked requests (blocked / total * 100)
  • Total unique blocked domains (firewall_domains_blocked)

Terminology Note:

  • Allowed requests = Requests that successfully reached their destination
  • Blocked requests = Requests that were prevented by the firewall
  • A 0% block rate with listed blocked domains indicates domains that would be blocked if accessed, but weren't actually accessed during this period

Section 3: Top Blocked Domains (Always Visible)

A table showing the most frequently blocked domains:

  • Domain name
  • Number of times blocked
  • Workflows that blocked it
  • Domain category (Development Services, Social Media, Analytics/Tracking, CDN, Other)

Sort by frequency (most blocked first), show top 20.

Section 4: Policy Rule Attribution (Always Visible β€” when data available)

Include this section when policy_analysis data was available for at least one run.

This section provides rule-level insights that go beyond simple domain counts, showing which policy rules are handling traffic and why specific requests were denied.

4a. Policy Configuration

Show the policy summary from the most recent run:

  • Number of rules, SSL Bump status, DLP status
  • Example: "πŸ“‹ Policy: 12 rules, SSL Bump disabled, DLP disabled"

4b. Policy Rule Hit Table

Show aggregated rule hit counts across all analyzed runs:

| Rule | Action | Description | Total Hits |
|------|--------|-------------|------------|
| allow-github | 🟒 allow | Allow GitHub domains | 523 |
| allow-npm | 🟒 allow | Allow npm registry | 187 |
| deny-blocked-plain | πŸ”΄ deny | Deny all other HTTP/HTTPS | 12 |
| deny-default | πŸ”΄ deny | Default deny | 3 |
  • Sort by hits (descending)
  • Include all rules that had at least 1 hit
  • Use 🟒 for allow rules and πŸ”΄ for deny rules in the Action column

4c. Denied Requests with Rule Attribution

Show denied requests grouped by rule, with domain details:

| Domain | Deny Rule | Reason | Occurrences |
|--------|-----------|--------|-------------|
| evil.com:443 | deny-blocked-plain | Domain not in allowlist | 5 |
| tracker.io:443 | deny-blocked-plain | Domain not in allowlist | 3 |
| unknown.host:80 | deny-default | Default deny | 1 |
  • Group by domain + rule combination
  • Sort by occurrences (descending)
  • Show top 30 entries; wrap the full list in <details> if more than 30

4d. Rule Effectiveness Summary

Provide a brief analysis:

  • Which deny rules are doing the most work (catching the most unauthorized traffic)
  • Which allow rules handle the most traffic (busiest legitimate pathways)
  • Any rules with zero hits that could be removed or indicate unused policy entries
  • Any (implicit-deny) attributions that indicate gaps in the policy (traffic denied without matching any explicit rule)

Section 5: Detailed Request Patterns (In <details> Tags)

IMPORTANT: Wrap this entire section in a collapsible <details> block:

<details>
<summary>View Detailed Request Patterns by Workflow</summary>

For each workflow that had blocked domains, provide a detailed breakdown:

#### Workflow: [workflow-name] (X runs analyzed)

| Domain | Blocked Count | Allowed Count | Block Rate | Category |
|--------|---------------|---------------|------------|----------|
| example.com | 15 | 5 | 75% | Social Media |
| api.example.org | 10 | 0 | 100% | Development |

- Total blocked requests: [count]
- Total unique blocked domains: [count]
- Most frequently blocked domain: [domain]

[Repeat for all workflows with blocked domains]

</details>

Section 6: Complete Blocked Domains List (In <details> Tags)

IMPORTANT: Wrap this entire section in a collapsible <details> block:

<details>
<summary>View Complete Blocked Domains List</summary>

An alphabetically sorted list of all unique blocked domains:

| Domain | Total Blocks | First Seen | Workflows |
|--------|--------------|------------|-----------|
| [domain] | [count] | [date] | [workflow-list] |
| ... | ... | ... | ... |

</details>

Section 7: Security Recommendations (Always Visible)

Based on the analysis, provide actionable insights:

  • Domains that appear to be legitimate services that should be allowlisted
  • Potential security concerns (e.g., suspicious domains)
  • Suggestions for network permission improvements
  • Workflows that might need their network permissions updated
  • Policy rule suggestions (e.g., rules with zero hits that could be removed, domains that should be added to allow rules)

Step 6: Create Discussion

Create a new GitHub discussion with:

  • Title: "Daily Firewall Report - [Today's Date]"
  • Category: audits
  • Body: The complete markdown report following the formatting guidelines and structure defined in Step 5

Ensure the discussion body:

  • Uses h3 (###) for main section headers
  • Uses h4 (####) for subsection headers
  • Wraps detailed data (per-workflow breakdowns, complete domain list) in <details> tags
  • Keeps critical information visible (summary, key metrics, top domains, recommendations)

Notes

  • Early exit: If no firewall-enabled workflow runs are found in the past 7 days, exit early without creating a report (see Step 1.5)
  • Include timestamps and run URLs for traceability
  • Use tables and formatting for better readability
  • Add emojis to make the report more engaging (πŸ”₯ for firewall, 🚫 for blocked, βœ… for allowed)

Expected Output

A GitHub discussion in the "audits" category containing a comprehensive daily firewall analysis report.

Important: If no action is needed after completing your analysis, you MUST call the noop safe-output tool with a brief explanation. Failing to call any safe-output tool is the most common cause of safe-output workflow failures.

{"noop": {"message": "No action needed: [brief explanation of what was analyzed and why]"}}