feat: add Keenable web search and extract tools #8802

Open
IlyaGusev wants to merge 3 commits into AstrBotDevs:master from IlyaGusev:feat/keenable-web-search

IlyaGusev commented Jun 15, 2026

Closes #8801

Adds Keenable (https://keenable.ai) as a web search provider so the LLM can search the web and fetch page content through Keenable's API. This gives users another first-class search backend alongside Tavily / BoCha / Brave / Firecrawl / Baidu, following the exact same built-in tool pattern.

Keenable builds its own independent web index (not a Google/Bing reseller) with retrieval primitives aimed at AI agents. API usage is currently free; users just need an API key from keenable.ai, which is sent in the X-API-Key header.

Modifications

Two tools, gated on provider_settings.websearch_provider == "keenable":

| Tool | Endpoint | Method | Auth |
| --- | --- | --- | --- |
| web_search_keenable | https://api.keenable.ai/v1/search | POST | X-API-Key |
| keenable_extract_web_page | https://api.keenable.ai/v1/fetch | GET | X-API-Key |
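
As a rough illustration of the table above, the two requests could be described as follows. This is a minimal sketch, not the PR's code: `RequestSpec`, `build_search_request`, and `build_fetch_request` are hypothetical names, and the payload field `query` and the query parameter `url` are assumptions, since the request schema is not spelled out in this description. Only the endpoints, methods, and the `X-API-Key` header come from the table.

```python
from dataclasses import dataclass
from typing import Any, Optional

KEENABLE_BASE_URL = "https://api.keenable.ai/v1"


@dataclass
class RequestSpec:
    """Plain description of an HTTP request, independent of any client library."""

    method: str
    url: str
    headers: dict
    json: Optional[dict] = None    # request body, used for POST
    params: Optional[dict] = None  # query string, used for GET


def build_search_request(api_key: str, query: str) -> RequestSpec:
    # ASSUMPTION: the body field name "query" is illustrative only.
    body: dict[str, Any] = {"query": query}
    return RequestSpec(
        method="POST",
        url=f"{KEENABLE_BASE_URL}/search",
        headers={"X-API-Key": api_key},
        json=body,
    )


def build_fetch_request(api_key: str, page_url: str) -> RequestSpec:
    # ASSUMPTION: the query parameter name "url" is illustrative only.
    return RequestSpec(
        method="GET",
        url=f"{KEENABLE_BASE_URL}/fetch",
        headers={"X-API-Key": api_key},
        params={"url": page_url},
    )
```

Keeping the request description separate from the HTTP client makes the method, URL, and auth header easy to assert on in unit tests without any network access.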

Core files modified:

  • astrbot/core/tools/web_search_tools.py: _keenable_search / _keenable_fetch, KeenableWebSearchTool, KeenableExtractWebPageTool, key rotator, tool-name list, and legacy str→list key migration.
  • astrbot/core/astr_main_agent.py: import + keenable branch in _apply_web_search_tools.
  • astrbot/core/config/default.py: default websearch_keenable_key: [], keenable provider option, and gated config-schema entry.
  • dashboard/src/i18n/locales/{en-US,ru-RU,zh-CN}/features/config-metadata.json: key label/hint.
  • Tests: tests/unit/test_web_search_tools.py, tests/unit/test_func_tool_manager.py, tests/unit/test_astr_main_agent.py.
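
The legacy str→list key migration mentioned in the list above could look roughly like the sketch below. The function name `migrate_keenable_key` and its exact normalization rules (stripping whitespace, dropping empty entries) are assumptions for illustration; only the config key `websearch_keenable_key` and the str→list direction come from this PR.

```python
def migrate_keenable_key(provider_settings: dict) -> dict:
    """Normalize websearch_keenable_key to a list of non-empty strings, in place.

    Hypothetical helper: the PR's actual migration code may differ.
    """
    value = provider_settings.get("websearch_keenable_key")
    if isinstance(value, str):
        # Legacy shape: a single key stored as a plain string.
        keys = [value.strip()] if value.strip() else []
    elif isinstance(value, list):
        # Current shape: keep only non-empty string entries.
        keys = [k for k in value if isinstance(k, str) and k.strip()]
    else:
        # Missing key or unexpected type: fall back to the default empty list.
        keys = []
    provider_settings["websearch_keenable_key"] = keys
    return provider_settings
```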

API keys live in provider_settings.websearch_keenable_key (list) with the same round-robin _KeyRotator as the other providers. Keenable's response maps 1:1 onto the existing SearchResult abstraction (snippet falls back to description); fetch reuses the same URL/Content output shape as the Tavily/Firecrawl extract tools.
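
For readers unfamiliar with the rotation scheme, round-robin key selection can be sketched as below. `RoundRobinKeys` is a hypothetical stand-in, not the project's `_KeyRotator`, whose real interface is not shown in this PR; the error message is likewise an assumption.

```python
class RoundRobinKeys:
    """Hand out API keys in turn so load is spread across all configured keys."""

    def __init__(self, keys: list):
        # Ignore empty entries so a blank config value never reaches the API.
        self._keys = [k for k in keys if k]
        self._index = 0

    def next_key(self) -> str:
        if not self._keys:
            # ASSUMPTION: the real tools return a friendly error instead of raising.
            raise ValueError("no Keenable API key configured")
        key = self._keys[self._index % len(self._keys)]
        self._index += 1
        return key


rotator = RoundRobinKeys(["key-a", "key-b"])
print([rotator.next_key() for _ in range(3)])  # ['key-a', 'key-b', 'key-a']
```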

  • This is NOT a breaking change.

Screenshots or Test Results

Unit tests + lint:

$ uv run ruff format --check .
473 files already formatted
$ uv run ruff check .
All checks passed!

$ uv run pytest tests/unit
603 passed

Manual end-to-end against the live Keenable API (actual tool code, not mocks):

  • KeenableWebSearchTool.call(query, site) → live POST /v1/search returned a valid JSON payload of 10 mapped results ✅
  • KeenableExtractWebPageTool.call(url, max_chars) → live GET /v1/fetch returned URL: …\nContent: <markdown>
  • Missing key → friendly error, no API call ✅
  • Bad key → live 401 propagated as Keenable web search failed: …
  • Zero results → a message stating that the search returned no results ✅
  • Verified live response fields match the parser: search title/url/description/snippet/acquired_at; fetch url/title/content/description/author.

New unit tests cover: result mapping, X-API-Key header + description fallback, GET-based fetch, no-content handling, HTTP-error propagation (search & fetch), legacy-config migration, builtin-tool registration, and the dispatch path injecting both tools.
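
To give a feel for how such tests can run without network access, here is a minimal sketch of a fake session that records the request and returns a canned response. All names (`FakeResponse`, `FakeSession`, `search_urls`) are hypothetical and do not match the helpers in tests/unit/test_web_search_tools.py; the request body field `query` is also an assumption.

```python
import asyncio


class FakeResponse:
    """Canned response usable as `async with session.post(...) as resp`."""

    def __init__(self, status: int, payload: dict):
        self.status = status
        self._payload = payload

    async def json(self) -> dict:
        return self._payload

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions raised inside the block


class FakeSession:
    """Records every POST call and always returns the same response."""

    def __init__(self, response: FakeResponse):
        self._response = response
        self.calls = []

    def post(self, url, **kwargs):
        self.calls.append(("POST", url, kwargs))
        return self._response


async def search_urls(session, api_key: str, query: str) -> list:
    # Hypothetical function under test; it mirrors the shape of a search call.
    async with session.post(
        "https://api.keenable.ai/v1/search",
        headers={"X-API-Key": api_key},
        json={"query": query},  # ASSUMPTION: body field name
    ) as resp:
        if resp.status != 200:
            raise RuntimeError(f"Keenable web search failed: HTTP {resp.status}")
        data = await resp.json()
    return [i["url"] for i in (data.get("results") or []) if i and i.get("url")]


session = FakeSession(FakeResponse(200, {"results": [{"url": "https://example.com"}]}))
urls = asyncio.run(search_urls(session, "test-key", "astrbot"))
assert urls == ["https://example.com"]
assert session.calls[0][2]["headers"] == {"X-API-Key": "test-key"}
```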


Checklist

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.

    Tracked in [Feature] Add Keenable as a built-in web search provider #8801.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    No new dependencies are introduced: only stdlib + already-present aiohttp are used.

  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Add Keenable as a configurable web search provider with built-in search and page-extraction tools integrated into the agent and configuration system.

New Features:

  • Introduce Keenable-based web search tool for querying web content via the Keenable Search API.
  • Add Keenable-based web page extraction tool for fetching and returning page content from Keenable-indexed URLs.
  • Expose Keenable as a selectable websearch provider with configurable API key list in provider settings and dashboard metadata.

Enhancements:

  • Normalize legacy web search configuration to support list-based Keenable API keys alongside existing providers.
  • Register Keenable search and extract tools as built-in tools and wire them into the main agent web search tool selection flow.

Tests:

  • Extend unit test coverage for Keenable search and fetch behavior, error handling, legacy config migration, builtin-tool registration, and agent tool injection.

Add Keenable as a web search provider, following the existing builtin
tool pattern (Tavily/Firecrawl). Exposes two tools gated on
`websearch_provider == "keenable"`:

- `web_search_keenable`  -> POST https://api.keenable.ai/v1/search
- `keenable_extract_web_page` -> GET https://api.keenable.ai/v1/fetch

Auth via `X-API-Key` header with key rotation through
`provider_settings.websearch_keenable_key` (list). Wires the config
schema/default, dispatch in `_apply_web_search_tools`, i18n metadata
(en-US/ru-RU/zh-CN), and unit tests for mapping, headers, GET fetch,
and HTTP error handling.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dosubot (bot) added labels on Jun 15, 2026: size:L (this PR changes 100-499 lines, ignoring generated files) and area:provider (the bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner).

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've left some high level feedback:

  • The Keenable test helpers reuse _FakeFirecrawlResponse and are named _FakeKeenableSession, which is a bit confusing; consider renaming or extracting generic HTTP fakes so provider-specific tests don’t depend on Firecrawl naming.
  • Both _keenable_search and _keenable_fetch duplicate the same error-wrapping pattern as other providers; you might want to extract a small shared helper for HTTP error handling to reduce repetition and keep future provider additions consistent.
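
One possible shape for the shared error-handling helper suggested in the second point is sketched below. `WebSearchError` and `raise_for_provider_status` are hypothetical names and do not exist in the codebase as far as this PR shows; the message format simply follows the "Keenable web search failed: …" text quoted in the PR description.

```python
class WebSearchError(Exception):
    """Raised when a web search provider returns a non-success HTTP status."""


def raise_for_provider_status(provider: str, action: str, status: int, body: str) -> None:
    """Raise a uniformly formatted error for any non-2xx response.

    provider: display name, e.g. "Keenable"
    action:   what was attempted, e.g. "web search"
    """
    if 200 <= status < 300:
        return
    # Truncate the body so a large error page does not flood the tool output.
    reason = body.strip()[:200] or "no response body"
    raise WebSearchError(f"{provider} {action} failed: HTTP {status}: {reason}")


try:
    raise_for_provider_status("Keenable", "web search", 401, "invalid api key")
except WebSearchError as err:
    print(err)  # Keenable web search failed: HTTP 401: invalid api key
```

Each provider function would then call the helper once after reading the response, instead of repeating its own status check and message formatting.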

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for Keenable as a web search and webpage extraction provider, adding the KeenableWebSearchTool and KeenableExtractWebPageTool along with their corresponding configurations, localizations, and unit tests. The feedback identifies a potential TypeError in _keenable_search if the API returns a null value for the results key, and suggests a safer fallback pattern to handle null or missing values robustly.

Comment on lines +402 to +410
    return [
        SearchResult(
            title=item.get("title", ""),
            url=item.get("url", ""),
            snippet=item.get("snippet") or item.get("description") or "",
        )
        for item in data.get("results", [])
        if item.get("url")
    ]

medium

Using data.get("results", []) can lead to a TypeError: 'NoneType' object is not iterable if the API returns {"results": null} (which parses to None in Python). Since dict.get() only returns the default value when the key is absent, it will return None if the key is present but has a null value.

Using data.get("results") or [] is a safer and more robust pattern that handles both missing keys and null/None values gracefully. Additionally, we should check if item is not None before calling item.get("url") to prevent potential AttributeErrors.

Suggested change
      return [
          SearchResult(
              title=item.get("title", ""),
              url=item.get("url", ""),
              snippet=item.get("snippet") or item.get("description") or "",
          )
-         for item in data.get("results", [])
-         if item.get("url")
+         for item in (data.get("results") or [])
+         if item and item.get("url")
      ]
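
The difference between the two patterns in this review comment can be checked in isolation. The snippet below is a standalone demonstration using plain dicts; `valid_items` is a hypothetical helper name, not a function from the PR.

```python
def valid_items(data: dict) -> list:
    """Return result items that are non-null and carry a URL."""
    return [
        item
        for item in (data.get("results") or [])  # handles missing key and null value
        if item and item.get("url")              # skips null items and items without a URL
    ]


null_payload = {"results": None}  # what {"results": null} parses to

# dict.get() only uses its default when the key is absent, so this is still None:
print(null_payload.get("results", []))  # None

# The `or []` fallback covers both the missing-key and the null-value case:
print(valid_items(null_payload))  # []
print(valid_items({}))            # []
print(valid_items({"results": [None, {"title": "no url"}, {"url": "https://example.com"}]}))
# [{'url': 'https://example.com'}]
```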

IlyaGusev and others added 2 commits June 15, 2026 12:17
Use `data.get("results") or []` so a `{"results": null}` API response
yields an empty list instead of raising TypeError on iteration, and skip
null items before reading `url`. Addresses Gemini review feedback on AstrBotDevs#8802.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Identify AstrBot as the calling application on Keenable search and fetch
requests via the X-Keenable-Title header.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>