plexus.cli.procedure.test_tool_explanation_enforcement_simple module

Simple BDD Tests for Tool Explanation Enforcement System

This module tests the critical behavior where the system forces the assistant to explain tool results before allowing further tool calls by temporarily removing tool access.

Key Behavior: 1. Tool executed → Result + Reminder injected → Tools REMOVED for next iteration 2. Assistant MUST provide explanation (no tools available) 3. Tools restored after explanation provided

class plexus.cli.procedure.test_tool_explanation_enforcement_simple.TestToolExplanationEnforcement

Bases: object

Test suite for the tool explanation enforcement system that prevents tool call chaining without explanation by temporarily removing tools.

These tests focus on the core logic patterns without complex integration.

test_complete_enforcement_cycle_logic()

Integration test for the complete tool explanation enforcement cycle logic:

Given: Normal operation with tools available When: Tool is executed Then: Reminder is injected and tools are removed for next response When: Explanation is provided Then: Tools are restored for subsequent responses

test_edge_case_no_tool_calls_in_response(): Given the assistant provides a text response without tool calls When that response is processed Then the force_explanation_next flag should not be set And normal flow should continue

test_enforcement_prevents_tool_chaining_pattern()

Integration test that demonstrates the prevention of tool chaining.

Before: Tool → Tool → Tool (bad) After: Tool → Explanation → Tool → Explanation (good)

test_explanation_flag_removes_tools_from_llm(): Given the force_explanation_next flag is True When the LLM is invoked for the next response Then tools should be temporarily removed (use coding_assistant_llm instead of llm_with_tools) And the flag should be cleared after use

test_logging_behavior_during_enforcement(): Given the tool explanation enforcement is active When tools are temporarily removed Then specific logging messages should be generated for debugging

test_multiple_tool_calls_enforcement(): Given the AI attempts to make multiple tool calls in a single response When the response is processed Then only the first tool call should be executed And the force_explanation_next flag should still be set correctly

test_reminder_message_injection_behavior(): Given a tool has been executed When the tool result is added to conversation history Then a reminder message should be injected immediately after the tool result And the reminder should contain specific required text

test_tool_execution_sets_explanation_flag(): Given a tool has been executed successfully When the tool result and reminder are added to conversation Then the force_explanation_next flag should be set to True

test_tools_restored_after_explanation(): Given an explanation has been provided (force_explanation_next was cleared) When the LLM is invoked in the subsequent round Then tools should be restored (use llm_with_tools normally)