fix: deduplicate send_message_t fix: deduplicate send_message_to_user calls to prevent duplicate messages by Rrttttttt · Pull Request #8782 · AstrBotDevs/AstrBot

Rrttttttt · 2026-06-14T12:51:54Z

问题
某些 LLM（已知触发模型：mimo）在同一响应中同时返回 completion_text 和 send_message_to_user
工具调用，或在连续多次响应中重复调用 send_message_to_user，导致用户收到重复消息。
复现

使用 mimo 模型
在群聊中 @机器人
机器人偶尔会发送两条完全相同的消息
修复方案
tool_loop_agent_runner.py（根因修复）
当 LLM 响应中包含 send_message_to_user 工具调用时，抑制 completion_text / result_chain 的 yield，防止
RespondStage 重复发送相同内容。
message_tools.py（指纹去重）
在 send_message_to_user 工具内部，计算消息指纹（md5），30 秒内相同内容直接跳过，防止跨响应的重复调用。
测试
日志中看到 检测到 send_message_to_user 工具调用，抑制 completion_text 说明根因修复生效
日志中看到 [send_message_to_user] 检测到重复发送，已跳过 说明指纹去重生效

Summary by Sourcery

Prevent duplicate user-visible messages when LLMs return both text and send_message_to_user tool calls or repeatedly invoke the tool across responses.

Bug Fixes:

Suppress completion_text/result_chain output when a send_message_to_user tool call is present in the LLM response to avoid double-sending the same content.
Deduplicate send_message_to_user calls within a short time window using a message fingerprint to skip repeated sends across responses.

Enhancements:

Add logging around suppression and deduplication paths to aid in diagnosing duplicate message behavior.

…ages Some LLMs (notably mimo) may return both completion_text and send_message_to_user tool calls in the same response, or call send_message_to_user multiple times across consecutive responses. This causes the user to receive duplicate messages. Two-layer fix: 1. Suppress completion_text/result_chain yield when send_message_to_user is in the tool calls list, preventing RespondStage from sending the same content that the tool will send directly. 2. Add fingerprint-based dedup in send_message_to_user tool itself, skipping identical sends within a 30-second window. Co-Authored-By: Claude <noreply@anthropic.com>

sourcery-ai

Hey - I've found 1 issue, and left some high level feedback:

The global _recent_sends dict in message_tools.py is mutated without any synchronization, which could cause race conditions or inconsistent deduplication behavior under async or multi-threaded/multi-worker execution; consider scoping this per-run or adding a concurrency-safe structure.
The deduplication key currently uses only session and serialized messages; if metadata (e.g., channel, user, or message type) is relevant, you may want to incorporate it into the fingerprint to avoid unintentionally skipping distinct messages within the same session.
In tool_loop_agent_runner.step, _has_send_message_tool is computed from llm_resp.tools_call_name using substring membership; if tools_call_name can be a list or a non-string type, it may be safer to normalize it (e.g., to a list of strings) and check for exact matches to avoid false positives.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The global `_recent_sends` dict in `message_tools.py` is mutated without any synchronization, which could cause race conditions or inconsistent deduplication behavior under async or multi-threaded/multi-worker execution; consider scoping this per-run or adding a concurrency-safe structure.
- The deduplication key currently uses only `session` and serialized `messages`; if metadata (e.g., channel, user, or message type) is relevant, you may want to incorporate it into the fingerprint to avoid unintentionally skipping distinct messages within the same session.
- In `tool_loop_agent_runner.step`, `_has_send_message_tool` is computed from `llm_resp.tools_call_name` using substring membership; if `tools_call_name` can be a list or a non-string type, it may be safer to normalize it (e.g., to a list of strings) and check for exact matches to avoid false positives.

## Individual Comments

### Comment 1
<location path="astrbot/core/tools/message_tools.py" line_range="343" />
<code_context>
+                f"[send_message_to_user] 检测到重复发送，已跳过。"
+                f" session={session}, fingerprint={fingerprint[:8]}"
+            )
+            return f"Message skipped (duplicate), session={target_session}"
+        _recent_sends[fingerprint] = now
+
</code_context>
<issue_to_address>
**suggestion:** Clarify which session identifier is reported in the duplicate message response.

The duplicate path logs `session={session}` but returns `session={target_session}`, while the dedup key only uses `session`. If `session` and `target_session` can differ (e.g., broadcast/aliasing), this mismatch could confuse operators or downstream consumers. Please either standardize on one identifier (likely `target_session`) in both the log and return value, or clearly label both (e.g., `session=..., target_session=...`).

Suggested implementation:

```python
        if fingerprint in _recent_sends:
            logger.info(
                f"[send_message_to_user] 检测到重复发送，已跳过。"
                f" session={session}, target_session={target_session}, fingerprint={fingerprint[:8]}"
            )

```

```python
            return (
                "Message skipped (duplicate), "
                f"session={session}, target_session={target_session}"
            )

```

These edits assume that `target_session` is already defined in the same scope as the duplicate-check block. If it's not, you should either:
1. Pass `target_session` into this function (if conceptually distinct from `session`), or
2. Set `target_session = session` earlier in the function to keep the behavior consistent.

You may also want to update any unit tests or log parsers that rely on the exact log or return string format.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

gemini-code-assist

Code Review

This pull request introduces deduplication logic to prevent duplicate messages from being sent when an LLM (such as the mimo model) simultaneously returns completion text and calls the send_message_to_user tool, or calls the tool repeatedly within a short timeframe. It suppresses completion_text and result_chain yields when the tool is called, and implements a 30-second time-window MD5 fingerprint deduplication mechanism in message_tools.py. The review feedback suggests a robustness improvement: using the canonical target_session instead of the raw session string to calculate the deduplication fingerprint.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

gemini-code-assist · 2026-06-14T12:52:56Z

+        fingerprint = hashlib.md5(
+            (str(session) + json.dumps(messages, ensure_ascii=False, sort_keys=True)).encode()
+        ).hexdigest()


Using str(session) to calculate the fingerprint can lead to duplicate messages not being properly deduplicated if the LLM calls the tool using different session formats (e.g., a raw session ID like "12345" vs. a fully qualified session string like "qq:group:12345"). Since target_session is the fully resolved MessageSession object, using str(target_session) ensures that the fingerprint is calculated using the canonical session representation, making the deduplication much more robust.

Suggested change

fingerprint = hashlib.md5(

(str(session) + json.dumps(messages, ensure_ascii=False, sort_keys=True)).encode()

).hexdigest()

fingerprint = hashlib.md5(

(str(target_session) + json.dumps(messages, ensure_ascii=False, sort_keys=True)).encode()

).hexdigest()

- Use asyncio.Lock for concurrency safety - Use target_session (canonical) instead of raw session in fingerprint - Add sender_id and platform_id to dedup key for finer granularity Co-Authored-By: Claude <noreply@anthropic.com>

zouyonghe

Blocking concern: this suppresses / whenever the model returns any tool call. can target a different session (admins are allowed to do that), so a model response that sends a notification elsewhere and also explains the action to the current user would now silently drop the current-user response. The same problem can happen when the tool payload differs from ; this is not necessarily a duplicate.

Please narrow the suppression to proven duplicates only, for example by comparing the tool target/message payload against the current response, or keep the runner behavior unchanged and rely on a scoped duplicate guard inside . The dedup guard itself should also be scoped to an agent run/tool-call context if possible, because a process-global 30s fingerprint can skip legitimate repeated sends of identical content.

zouyonghe

Blocking concern: this suppresses completion_text / result_chain whenever the model returns any send_message_to_user tool call. send_message_to_user can target a different session (admins are allowed to do that), so a model response that sends a notification elsewhere and also explains the action to the current user would now silently drop the current-user response. The same problem can happen when the tool payload differs from completion_text; this is not necessarily a duplicate.

Please narrow the suppression to proven duplicates only, for example by comparing the tool target/message payload against the current response, or keep the runner behavior unchanged and rely on a scoped duplicate guard inside send_message_to_user. The dedup guard itself should also be scoped to an agent run/tool-call context if possible, because a process-global 30s fingerprint can skip legitimate repeated sends of identical content.

Address review feedback: 1. Runner: Only suppress completion_text when its content matches the send_message_to_user payload exactly. Previously suppressed whenever the tool was present, which broke legitimate scenarios (e.g., admin sending notification to another session + explaining to current user). 2. Tool: Replace process-global 30s dedup dict with event-scoped fingerprint set. Each event tracks its own sent fingerprints via event.get_extra/set_extra, automatically cleaned up when the event is garbage collected. No false positives across different events. Co-Authored-By: Claude <noreply@anthropic.com>

Co-Authored-By: Claude <noreply@anthropic.com>

1. Check result_chain in addition to completion_text 2. Use set(existing) for explicit copy instead of relying on get_extra reference behavior 3. Normalize text before comparison (strip + merge whitespace) 4. Handle multiple send_message_to_user calls in same response 5. Log comparison details for debugging Co-Authored-By: Claude <noreply@anthropic.com>

The test mock used SimpleNamespace which doesn't have get_extra/set_extra methods. Replaced with _MockEvent class that supports the event extras API used by the dedup logic. Co-Authored-By: Claude <noreply@anthropic.com>

dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. area:core The bug / feature is about astrbot's core, backend area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels Jun 14, 2026

sourcery-ai Bot reviewed Jun 14, 2026

View reviewed changes

Comment thread astrbot/core/tools/message_tools.py

gemini-code-assist Bot reviewed Jun 14, 2026

View reviewed changes

fix: address review feedback for dedup logic

a4b112d

- Use asyncio.Lock for concurrency safety - Use target_session (canonical) instead of raw session in fingerprint - Add sender_id and platform_id to dedup key for finer granularity Co-Authored-By: Claude <noreply@anthropic.com>

zouyonghe requested changes Jun 15, 2026

View reviewed changes

zouyonghe mentioned this pull request Jun 15, 2026

[Bug]gemini循环调用工具 #8773

Open

2 tasks

Rrttttttt and others added 3 commits June 15, 2026 12:51

fix: line length formatting for ruff

ba6df8f

Co-Authored-By: Claude <noreply@anthropic.com>

dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Jun 15, 2026

fix: add get_extra/set_extra to test mock

d2c9478

The test mock used SimpleNamespace which doesn't have get_extra/set_extra methods. Replaced with _MockEvent class that supports the event extras API used by the dedup logic. Co-Authored-By: Claude <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: deduplicate send_message_t fix: deduplicate send_message_to_user calls to prevent duplicate messages#8782

fix: deduplicate send_message_t fix: deduplicate send_message_to_user calls to prevent duplicate messages#8782
Rrttttttt wants to merge 6 commits into
AstrBotDevs:masterfrom
Rrttttttt:fix/send-message-dedup

Rrttttttt commented Jun 14, 2026 •

edited by sourcery-ai Bot

Loading

Uh oh!

sourcery-ai Bot left a comment

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot Jun 14, 2026

Uh oh!

zouyonghe left a comment

Uh oh!

zouyonghe left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Rrttttttt commented Jun 14, 2026 • edited by sourcery-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by Sourcery

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot Jun 14, 2026

Choose a reason for hiding this comment

Uh oh!

zouyonghe left a comment

Choose a reason for hiding this comment

Uh oh!

zouyonghe left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Rrttttttt commented Jun 14, 2026 •

edited by sourcery-ai Bot

Loading