plexus.test_evaluation_metrics module

Focused tests for Evaluation.py prediction processing and metrics computation.

Tests the most critical business logic:

- Metrics calculation accuracy
- Label standardization
- Confusion matrix building
- Distribution calculations
- Edge cases and error handling

class plexus.test_evaluation_metrics.MockScorecard(name='test_scorecard')

Bases: object

Mock scorecard for testing

__init__(name='test_scorecard')
get_accumulated_costs()
score_names()
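
The class body is not reproduced on this page; a minimal sketch of the interface above, with assumed zero-cost return values, might look like:

    class MockScorecard:
        """Stand-in scorecard that returns fixed, inexpensive values."""

        def __init__(self, name='test_scorecard'):
            self.name = name

        def get_accumulated_costs(self):
            # Assumed return shape; the real method may return a richer object.
            return {'total_cost': 0.0}

        def score_names(self):
            # Assumed: a single score matching the helpers' default score_name.
            return ['test_score']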
class plexus.test_evaluation_metrics.TestConfusionMatrixBuilding

Bases: object

Test confusion matrix construction logic

test_binary_confusion_matrix_structure(mock_evaluation)

Test that the binary confusion matrix has the correct structure

test_multiclass_confusion_matrix(mock_evaluation)

Test confusion matrix for multiclass classification

test_single_class_confusion_matrix(mock_evaluation)

Test confusion matrix when only one class is present
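
A self-contained sketch of what the binary-structure test exercises, assuming a matrix keyed by (actual, predicted) label pairs; the build_confusion_matrix helper here is hypothetical:

    from collections import Counter

    def build_confusion_matrix(pairs):
        # Hypothetical helper: tally (actual, predicted) label pairs.
        matrix = Counter()
        for predicted, actual in pairs:
            matrix[(actual, predicted)] += 1
        return matrix

    def test_binary_confusion_matrix_structure():
        pairs = [('yes', 'yes'), ('no', 'yes'), ('no', 'no')]
        matrix = build_confusion_matrix(pairs)
        assert matrix[('yes', 'yes')] == 1  # true positive
        assert matrix[('yes', 'no')] == 1   # false negative: actual 'yes', predicted 'no'
        assert matrix[('no', 'no')] == 1    # true negative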

class plexus.test_evaluation_metrics.TestDistributionCalculations

Bases: object

Test predicted and actual label distribution calculations

test_actual_distribution_accuracy(mock_evaluation)

Test that the actual label distribution is calculated correctly

test_distribution_with_standardized_labels(mock_evaluation)

Test distribution calculation with label standardization

test_predicted_distribution_accuracy(mock_evaluation)

Test that the predicted label distribution is calculated correctly
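
The distribution tests reduce to counting labels and normalizing; a self-contained sketch (the pair data and proportions below are illustrative only):

    from collections import Counter

    def test_actual_distribution_accuracy():
        # (predicted, actual) pairs, mirroring create_mock_evaluation_results
        pairs = [('yes', 'yes'), ('yes', 'no'), ('no', 'no'), ('no', 'no')]
        counts = Counter(actual for _, actual in pairs)
        total = sum(counts.values())
        distribution = {label: count / total for label, count in counts.items()}
        assert distribution == {'yes': 0.25, 'no': 0.75}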

class plexus.test_evaluation_metrics.TestErrorHandling

Bases: object

Test error handling in metrics computation

test_all_error_results(mock_evaluation)

Test handling when all results are errors

test_error_as_legitimate_class_label(mock_evaluation)

Test that ‘error’ as a class label (without an error attribute) is counted in metrics

test_error_results_filtered_out(mock_evaluation)

Test that ERROR results are filtered out of metrics

test_missing_metadata_handling(mock_evaluation)

Test handling of results with missing metadata
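
A sketch of the filtering rule these tests pin down, using plain dicts in place of Score.Result objects (an assumption made for brevity): results carrying an error attribute are excluded, while a literal ‘error’ label with no error attribute counts as a real class:

    def is_valid_result(result):
        # Hypothetical predicate: only results without an error attribute
        # feed the metrics; the string label 'error' alone does not disqualify.
        return result.get('error') is None

    def test_error_results_filtered_out():
        results = [
            {'predicted': 'yes', 'actual': 'yes', 'error': None},
            {'predicted': None, 'actual': 'no', 'error': 'timeout'},
        ]
        valid = [r for r in results if is_valid_result(r)]
        assert len(valid) == 1

    def test_error_as_legitimate_class_label():
        result = {'predicted': 'error', 'actual': 'error', 'error': None}
        assert is_valid_result(result)  # counted in metrics, not filtered out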

class plexus.test_evaluation_metrics.TestLabelStandardization

Bases: object

Test label standardization and comparison logic

test_case_insensitive_matching(mock_evaluation)

Test that label matching is case-insensitive

test_mixed_null_representations(mock_evaluation)

Test mixed null representations in actual vs predicted

test_null_value_standardization(mock_evaluation)

Test that various null representations are standardized to ‘na’

test_whitespace_handling(mock_evaluation)

Test that whitespace is handled in label comparison
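
Taken together, these tests imply a normalization step roughly like the following; the exact set of null-ish spellings is an assumption:

    def standardize_label(label):
        # Hypothetical normalization: strip whitespace, lowercase, and map
        # null-ish representations to the single token 'na'.
        text = '' if label is None else str(label).strip().lower()
        return 'na' if text in ('', 'none', 'null', 'nan', 'n/a') else text

    def test_null_value_standardization():
        assert standardize_label(None) == 'na'
        assert standardize_label(' N/A ') == 'na'

    def test_case_insensitive_matching():
        assert standardize_label('  Yes ') == standardize_label('YES')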

class plexus.test_evaluation_metrics.TestMetricsCalculation

Bases: object

Test core metrics calculation logic

test_all_positive_predictions(mock_evaluation)

Test the edge case where all predictions are positive

test_empty_results(mock_evaluation)

Test handling of empty results

test_imperfect_binary_classification(mock_evaluation)

Test binary classification metrics with some incorrect predictions

test_perfect_binary_classification(mock_evaluation)

Test perfect binary classification metrics
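
A self-contained sketch of the perfect-classification case; the metric formulas are the standard ones, though the real test goes through Evaluation's own computation:

    def test_perfect_binary_classification():
        pairs = [('yes', 'yes'), ('no', 'no'), ('yes', 'yes'), ('no', 'no')]
        correct = sum(1 for predicted, actual in pairs if predicted == actual)
        assert correct / len(pairs) == 1.0  # accuracy

        tp = sum(1 for p, a in pairs if p == 'yes' and a == 'yes')
        fp = sum(1 for p, a in pairs if p == 'yes' and a != 'yes')
        fn = sum(1 for p, a in pairs if p != 'yes' and a == 'yes')
        assert tp / (tp + fp) == 1.0  # precision for the positive class
        assert tp / (tp + fn) == 1.0  # recall for the positive class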

class plexus.test_evaluation_metrics.TestMultiScoreHandling

Bases: object

Test handling of multiple scores in evaluation

test_primary_score_filtering(mock_evaluation)

Test that only the primary score is used for metrics when specified
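
A sketch of the filtering being tested, assuming each result records which score produced it (the score_name field on the dicts below is hypothetical):

    def test_primary_score_filtering():
        results = [
            {'score_name': 'primary_score', 'predicted': 'yes', 'actual': 'yes'},
            {'score_name': 'secondary_score', 'predicted': 'no', 'actual': 'yes'},
        ]
        primary = [r for r in results if r['score_name'] == 'primary_score']
        # Only the primary score's results should feed the metrics.
        assert len(primary) == 1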

plexus.test_evaluation_metrics.create_mock_evaluation_results(prediction_pairs, score_name='test_score')

Create mock evaluation results from (predicted, actual) pairs

plexus.test_evaluation_metrics.create_mock_score_result(predicted_value, actual_label, correct=None, score_name='test_score')

Create a mock Score.Result for testing
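
How the two helpers are likely used together, based on the signatures above (the internals of the returned objects are not shown on this page):

    from plexus.test_evaluation_metrics import (
        create_mock_evaluation_results,
        create_mock_score_result,
    )

    # One result at a time, with an explicit correctness flag.
    result = create_mock_score_result('yes', 'yes', correct=True)

    # A whole batch built from (predicted, actual) pairs.
    results = create_mock_evaluation_results(
        [('yes', 'yes'), ('no', 'yes'), ('no', 'no')],
        score_name='test_score',
    )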

plexus.test_evaluation_metrics.mock_evaluation()

Create a mock evaluation instance
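
Since the test methods above all receive mock_evaluation as a parameter, this is almost certainly a pytest fixture; a minimal sketch, assuming a MagicMock wired with the MockScorecard above:

    import pytest
    from unittest.mock import MagicMock

    from plexus.test_evaluation_metrics import MockScorecard

    @pytest.fixture
    def mock_evaluation():
        # Assumed wiring; the real fixture may attach more attributes.
        evaluation = MagicMock()
        evaluation.scorecard = MockScorecard(name='test_scorecard')
        return evaluation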