plexus.cli.procedure.test_sop_agent_coin_flip_scenario module
BDD Test Suite: SOP Agent Coin Flip Scenario
This module provides comprehensive testing of the StandardOperatingProcedureAgent using a simple, controllable scenario: conducting a coin flip procedure.
The scenario tests all core SOP agent features:
1. Worker agent tool access management
2. Tool explanation enforcement
3. Stop tool functionality
4. Manager agent coaching
5. Procedure completion detection
6. Chat recording integration
Test Scenario:
1. Worker calls coin_flip tool 3 times
2. Worker calls data_logging tool to record each result
3. Worker calls accuracy_calculator tool to compute results
4. Worker calls stop_procedure tool to finish
5. Manager provides coaching guidance throughout
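The order of tool calls in the scenario can be sketched with plain Python stand-ins. This is an illustration only: it does not use the real Plexus tools, the interleaving of flips and logging is an assumption, and `run_coin_flip_scenario` and `heads_share` are hypothetical names. What accuracy_calculator actually computes is not stated here, so the stand-in simply reports the share of heads.

```python
import random
from typing import Dict, List


def run_coin_flip_scenario(seed: int = 0) -> Dict[str, object]:
    """Simulate the tool-call order of the scenario with plain Python stand-ins."""
    rng = random.Random(seed)   # seeded so the run is repeatable
    tool_calls: List[str] = []  # names of tools in the order they were "called"
    logged: List[str] = []      # stand-in for what data_logging would store

    for _ in range(3):
        result = rng.choice(["heads", "tails"])  # stand-in for coin_flip
        tool_calls.append("coin_flip")
        logged.append(result)                    # stand-in for data_logging
        tool_calls.append("data_logging")

    # Stand-in for accuracy_calculator: the real tool's formula is not
    # documented here, so this simply reports the share of heads.
    heads_share = logged.count("heads") / len(logged)
    tool_calls.append("accuracy_calculator")

    tool_calls.append("stop_procedure")  # the worker ends the procedure
    return {"tool_calls": tool_calls, "logged": logged, "heads_share": heads_share}


outcome = run_coin_flip_scenario()
```

The returned `tool_calls` list ends with accuracy_calculator followed by stop_procedure, mirroring steps 3 and 4 of the scenario.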
- class plexus.cli.procedure.test_sop_agent_coin_flip_scenario.CoinFlipProcedureDefinition
Bases: ProcedureDefinition
Simple procedure definition for coin flip procedure.
This demonstrates how to create a custom procedure with:
  - Specific tool subset for worker agent
  - Custom prompts for the task
  - Simple completion criteria
- __init__()
Initialize with coin flip procedure tools.
- get_allowed_tools() → List[str]
Get allowed tools for worker agent.
- get_completion_summary(state_data: Dict[str, Any]) → str
Get completion summary.
- get_sop_guidance_prompt(context: Dict[str, Any], state_data: Dict[str, Any]) → str
Get SOP manager guidance prompt.
- get_system_prompt(context: Dict[str, Any]) → str
Get worker agent system prompt for coin flip procedure.
- get_user_prompt(context: Dict[str, Any]) → str
Get initial user prompt for coin flip procedure.
- should_continue(state_data: Dict[str, Any]) → bool
Determine if procedure should continue.
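A minimal sketch of what a procedure definition with this method set might look like. It is a standalone class, not a subclass of the real ProcedureDefinition, and the class name, the state keys `stop_requested` and `round`, the `MAX_ROUNDS` limit, and all prompt wording are hypothetical. Only the method names and the four tool names come from this page.

```python
from typing import Any, Dict, List


class SketchCoinFlipProcedure:
    """Illustrative stand-in; does not subclass the real ProcedureDefinition."""

    MAX_ROUNDS = 20  # hypothetical safety limit on conversation rounds

    def __init__(self) -> None:
        # Tool subset taken from the test scenario described in this module.
        self.allowed_tools = [
            "coin_flip",
            "data_logging",
            "accuracy_calculator",
            "stop_procedure",
        ]

    def get_allowed_tools(self) -> List[str]:
        return list(self.allowed_tools)  # copy so callers cannot mutate it

    def get_system_prompt(self, context: Dict[str, Any]) -> str:
        return "You run a coin flip procedure using only the tools provided."

    def get_user_prompt(self, context: Dict[str, Any]) -> str:
        return "Flip the coin three times, log each result, then summarize."

    def get_sop_guidance_prompt(
        self, context: Dict[str, Any], state_data: Dict[str, Any]
    ) -> str:
        return f"Round {state_data.get('round', 0)}: what is your next step?"

    def should_continue(self, state_data: Dict[str, Any]) -> bool:
        # Stop when the worker asked to stop, or when the safety limit is hit.
        if state_data.get("stop_requested", False):
            return False
        return state_data.get("round", 0) < self.MAX_ROUNDS

    def get_completion_summary(self, state_data: Dict[str, Any]) -> str:
        return f"Procedure ended after {state_data.get('round', 0)} rounds."


procedure = SketchCoinFlipProcedure()
```

Checking the stop flag before the round limit means an explicit stop request always wins, which matches the "stop requests and safety limits" behavior the tests on this page describe.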
- class plexus.cli.procedure.test_sop_agent_coin_flip_scenario.MockCoinFlipChatRecorder
Bases: ChatRecorder
Mock chat recorder for testing.
- __init__()
- async end_session(status: str, name: str = None) → bool
End recording session.
- async record_message(role: str, content: str, message_type: str) → str | None
Record a message.
- async record_system_message(content: str) → str | None
Record a system message.
- async start_session(context: Dict[str, Any]) → str | None
Start recording session.
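An in-memory recorder with the same async method names could look roughly like the sketch below. It is standalone and does not subclass the real ChatRecorder. The class name, the ID formats, the `"SYSTEM"` and `"MESSAGE"` values, and the rule that messages are rejected before a session starts are all assumptions made for illustration.

```python
import asyncio
from typing import Any, Dict, List, Optional


class SketchChatRecorder:
    """Illustrative in-memory recorder; not the real ChatRecorder subclass."""

    def __init__(self) -> None:
        self.session_id: Optional[str] = None
        self.messages: List[Dict[str, str]] = []
        self.ended = False

    async def start_session(self, context: Dict[str, Any]) -> Optional[str]:
        self.session_id = "session-1"  # hypothetical fixed ID for tests
        return self.session_id

    async def record_message(
        self, role: str, content: str, message_type: str
    ) -> Optional[str]:
        if self.session_id is None:
            return None  # assumed behavior: nothing is recorded without a session
        self.messages.append(
            {"role": role, "content": content, "message_type": message_type}
        )
        return f"msg-{len(self.messages)}"

    async def record_system_message(self, content: str) -> Optional[str]:
        return await self.record_message("SYSTEM", content, "MESSAGE")

    async def end_session(self, status: str, name: Optional[str] = None) -> bool:
        if self.session_id is None:
            return False
        self.ended = True
        return True


async def demo() -> SketchChatRecorder:
    recorder = SketchChatRecorder()
    await recorder.start_session({"procedure": "coin_flip"})
    await recorder.record_system_message("Procedure started.")
    await recorder.record_message("ASSISTANT", "Flipping the coin.", "MESSAGE")
    await recorder.end_session("COMPLETED")
    return recorder


recorder = asyncio.run(demo())
```

Keeping the messages in a plain list lets a test assert on exactly what was recorded without any external service.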
- class plexus.cli.procedure.test_sop_agent_coin_flip_scenario.MockCoinFlipFlowManager
Bases: FlowManager
Mock flow manager for coin flip procedure.
- __init__()
- get_completion_summary() → str
Get completion summary.
- get_next_guidance() → str | None
Get guidance for next step.
- should_continue() → bool
Check if flow should continue.
- update_state(new_data: Dict[str, Any]) → Dict[str, Any]
Update and return current state.
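The four methods above suggest a small state holder that the manager consults between worker turns. The sketch below shows one way that could work. It is not the real FlowManager API: the class name, the state keys `flips` and `stop_requested`, and the guidance wording are hypothetical.

```python
from typing import Any, Dict, Optional


class SketchFlowManager:
    """Illustrative state holder; does not subclass the real FlowManager."""

    def __init__(self) -> None:
        # Hypothetical state keys chosen for this sketch.
        self.state: Dict[str, Any] = {"flips": 0, "stop_requested": False}

    def update_state(self, new_data: Dict[str, Any]) -> Dict[str, Any]:
        self.state.update(new_data)
        return dict(self.state)  # return a copy of the merged state

    def should_continue(self) -> bool:
        return not self.state["stop_requested"]

    def get_next_guidance(self) -> Optional[str]:
        if self.state["stop_requested"]:
            return None  # nothing left to coach once the worker has stopped
        if self.state["flips"] < 3:
            return f"You have flipped {self.state['flips']} of 3 coins. What is next?"
        return "All flips are done. What remains before you stop?"

    def get_completion_summary(self) -> str:
        return f"Completed {self.state['flips']} flips."


flow = SketchFlowManager()
first_hint = flow.get_next_guidance()
flow.update_state({"flips": 3})
second_hint = flow.get_next_guidance()
```

Phrasing the guidance as questions follows the "contextual coaching questions" that the guidance test on this page expects.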
- class plexus.cli.procedure.test_sop_agent_coin_flip_scenario.TestSOPAgentCoinFlipScenario
Bases: object
Comprehensive BDD test suite for SOP agent using coin flip scenario.
This tests all core SOP agent functionality:
  - Tool access management
  - Tool explanation enforcement
  - Manager coaching
  - Stop functionality
  - Chat recording
  - Procedure completion
- coin_flip_procedure_definition()
Create coin flip procedure definition.
- test_coin_flip_scenario_story_complete_workflow()
Story Test: Complete coin flip procedure workflow
This test tells the complete story of using an SOP agent to accomplish the coin flip task, demonstrating all the key features working together.
- test_given_coin_flip_procedure_when_checking_continuation_then_respects_stop_and_safety_limits(coin_flip_procedure_definition)
Given a coin flip procedure definition
When checking if procedure should continue
Then it should respect stop requests and safety limits
- test_given_coin_flip_procedure_when_generating_sop_guidance_then_provides_contextual_coaching(coin_flip_procedure_definition)
Given a coin flip procedure definition
When generating SOP guidance at different stages
Then it should provide contextual coaching questions
- test_given_coin_flip_procedure_when_getting_prompts_then_provides_task_specific_guidance(coin_flip_procedure_definition)
Given a coin flip procedure definition
When getting system and user prompts
Then it should provide task-specific guidance for the coin flip experiment
- test_given_coin_flip_procedure_when_initialized_then_has_correct_tools(coin_flip_procedure_definition)
Given a coin flip procedure definition
When the procedure is initialized
Then it should have the correct subset of tools available
- test_given_coin_flip_scenario_when_checking_stop_tool_functionality_then_stops_correctly(coin_flip_procedure_definition)
Given a coin flip scenario
When the stop tool is used
Then the procedure should stop correctly with proper reason tracking
- test_given_multiple_coin_flip_procedures_when_comparing_configurations_then_demonstrates_customization()
Given multiple coin flip procedure configurations
When comparing their setup
Then it should demonstrate how the base SOP agent can be customized for different tasks