plexus.test_evaluation_metrics module
Focused tests for Evaluation.py prediction processing and metrics computation.
Tests the most critical business logic:

- Metrics calculation accuracy
- Label standardization
- Confusion matrix building
- Distribution calculations
- Edge cases and error handling
- class plexus.test_evaluation_metrics.MockScorecard(name='test_scorecard')
Bases: object

Mock scorecard for testing
- __init__(name='test_scorecard')
- get_accumulated_costs()
- score_names()
- class plexus.test_evaluation_metrics.TestConfusionMatrixBuilding
Bases: object

Test confusion matrix construction logic
- test_binary_confusion_matrix_structure(mock_evaluation)
Test binary confusion matrix has correct structure
- test_multiclass_confusion_matrix(mock_evaluation)
Test confusion matrix for multiclass classification
- test_single_class_confusion_matrix(mock_evaluation)
Test confusion matrix when only one class is present
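The matrix construction these tests exercise can be sketched as follows; the `build_confusion_matrix` helper and its nested-dict output shape are illustrative assumptions, not the actual API in Evaluation.py:

```python
def build_confusion_matrix(pairs):
    """Build matrix[actual][predicted] -> count from (predicted, actual) pairs.

    Hypothetical sketch: rows and columns cover every label seen in either
    position, so binary, multiclass, and single-class inputs all work.
    """
    labels = sorted({label for pair in pairs for label in pair})
    matrix = {actual: {predicted: 0 for predicted in labels} for actual in labels}
    for predicted, actual in pairs:
        matrix[actual][predicted] += 1
    return matrix

# Binary case: two correct "yes" predictions, one "no" misclassified as "yes"
matrix = build_confusion_matrix([("yes", "yes"), ("yes", "yes"), ("yes", "no")])
# matrix["yes"]["yes"] -> 2, matrix["no"]["yes"] -> 1, matrix["no"]["no"] -> 0
```

A single-class input such as `[("yes", "yes")]` simply yields a 1x1 matrix, which is the edge case `test_single_class_confusion_matrix` covers.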
- class plexus.test_evaluation_metrics.TestDistributionCalculations
Bases: object

Test predicted and actual label distribution calculations
- test_actual_distribution_accuracy(mock_evaluation)
Test actual label distribution is calculated correctly
- test_distribution_with_standardized_labels(mock_evaluation)
Test distribution calculation with label standardization
- test_predicted_distribution_accuracy(mock_evaluation)
Test predicted label distribution is calculated correctly
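A minimal sketch of the distribution calculations under test, assuming results reduce to (predicted, actual) pairs; the helper name and count-dict output are illustrative, not the module's real interface:

```python
from collections import Counter

def label_distributions(pairs):
    """Return (predicted, actual) label distributions as plain count dicts."""
    predicted = Counter(p for p, _ in pairs)
    actual = Counter(a for _, a in pairs)
    return dict(predicted), dict(actual)

predicted, actual = label_distributions([("yes", "yes"), ("yes", "no"), ("no", "no")])
# predicted -> {"yes": 2, "no": 1}; actual -> {"yes": 1, "no": 2}
```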
- class plexus.test_evaluation_metrics.TestErrorHandling
Bases: object

Test error handling in metrics computation
- test_all_error_results(mock_evaluation)
Test handling when all results are errors
- test_error_as_legitimate_class_label(mock_evaluation)
Test that ‘error’ as a class label (without error attribute) is counted in metrics
- test_error_results_filtered_out(mock_evaluation)
Test that ERROR results are filtered out of metrics
- test_missing_metadata_handling(mock_evaluation)
Test handling of results with missing metadata
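The distinction these tests draw, between a result that genuinely failed and a result whose class label happens to be the string 'error', might look like this; the filtering helper and result shape are assumptions for illustration:

```python
from types import SimpleNamespace

def filter_error_results(results):
    """Drop results carrying an error attribute; a literal 'error' value
    (with no error set) is treated as a legitimate class label."""
    return [r for r in results if getattr(r, "error", None) is None]

ok = SimpleNamespace(value="error", error=None)        # 'error' as a real label: kept
failed = SimpleNamespace(value="yes", error="timeout")  # genuine failure: filtered out
kept = filter_error_results([ok, failed])
# kept -> [ok]
```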
- class plexus.test_evaluation_metrics.TestLabelStandardization
Bases: object

Test label standardization and comparison logic
- test_case_insensitive_matching(mock_evaluation)
Test that label matching is case insensitive
- test_mixed_null_representations(mock_evaluation)
Test mixed null representations in actual vs predicted
- test_null_value_standardization(mock_evaluation)
Test various null representations are standardized to ‘na’
- test_whitespace_handling(mock_evaluation)
Test whitespace is handled in label comparison
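The standardization behavior these tests describe (case folding, whitespace stripping, null values mapped to 'na') can be sketched as below; the function name and the exact set of null spellings are assumptions, not the code in Evaluation.py:

```python
def standardize_label(value):
    # Map None and common null spellings to 'na'; otherwise lowercase and strip.
    if value is None:
        return "na"
    text = str(value).strip().lower()
    return "na" if text in ("", "none", "null", "nan", "n/a") else text

standardize_label("  YES ")  # -> "yes"  (whitespace + case-insensitive matching)
standardize_label(None)      # -> "na"   (null standardization)
standardize_label("N/A")     # -> "na"   (alternate null representation)
```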
- class plexus.test_evaluation_metrics.TestMetricsCalculation
Bases: object

Test core metrics calculation logic
- test_all_positive_predictions(mock_evaluation)
Test edge case where all predictions are positive
- test_empty_results(mock_evaluation)
Test handling of empty results
- test_imperfect_binary_classification(mock_evaluation)
Test binary classification with some errors
- test_perfect_binary_classification(mock_evaluation)
Test perfect binary classification metrics
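For reference, the standard binary metrics these tests check can be computed from (predicted, actual) pairs as follows; treating "yes" as the positive class here is an illustrative assumption:

```python
def compute_metrics(pairs):
    """Accuracy, precision, and recall for binary labels ('yes' = positive)."""
    tp = sum(1 for p, a in pairs if p == "yes" and a == "yes")
    fp = sum(1 for p, a in pairs if p == "yes" and a == "no")
    fn = sum(1 for p, a in pairs if p == "no" and a == "yes")
    tn = sum(1 for p, a in pairs if p == "no" and a == "no")
    total = len(pairs)  # empty input yields all-zero metrics, not a crash
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

perfect = compute_metrics([("yes", "yes"), ("no", "no")])
# -> accuracy 1.0, precision 1.0, recall 1.0
imperfect = compute_metrics([("yes", "yes"), ("yes", "no"), ("no", "no"), ("no", "no")])
# one false positive -> accuracy 0.75, precision 0.5, recall 1.0
```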
- class plexus.test_evaluation_metrics.TestMultiScoreHandling
Bases: object

Test handling of multiple scores in evaluation
- test_primary_score_filtering(mock_evaluation)
Test that only primary score is used for metrics when specified
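The primary-score filtering under test might reduce to something like the sketch below; the dict-shaped results and the helper name are assumptions made for illustration only:

```python
def filter_by_primary_score(results, primary_score):
    """Keep only result records belonging to the specified primary score."""
    return [r for r in results if r.get("score_name") == primary_score]

results = [
    {"score_name": "primary", "value": "yes"},
    {"score_name": "secondary", "value": "no"},
    {"score_name": "primary", "value": "no"},
]
filtered = filter_by_primary_score(results, "primary")
# -> the two "primary" records; the "secondary" record is excluded from metrics
```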
- plexus.test_evaluation_metrics.create_mock_evaluation_results(prediction_pairs, score_name='test_score')
Create mock evaluation results from (predicted, actual) pairs
- plexus.test_evaluation_metrics.create_mock_score_result(predicted_value, actual_label, correct=None, score_name='test_score')
Create a mock Score.Result for testing
- plexus.test_evaluation_metrics.mock_evaluation()
Create a mock evaluation instance