plexus.scores.Score module

class plexus.scores.Score.Score(**parameters)

Bases: ABC, ScoreData, ScoreVisualization

Abstract base class for implementing classification and scoring models in Plexus.

Score is the fundamental building block of classification in Plexus. Each Score represents a specific classification task and can be implemented using various approaches:

  • Machine learning models (e.g., DeepLearningSemanticClassifier)

  • LLM-based classification (e.g., LangGraphScore)

  • Rule-based systems (e.g., KeywordClassifier)

  • Custom logic (by subclassing Score)

The Score class provides:

  • Standard input/output interfaces using Pydantic models

  • Visualization tools for model performance

  • Cost tracking for API-based models

  • Metrics computation and logging

Common usage patterns:

  1. Creating a custom classifier:

     class MyClassifier(Score):
         def predict(self, context, model_input: Score.Input) -> Score.Result:
             text = model_input.text
             # Custom classification logic here
             return Score.Result(
                 parameters=self.parameters,
                 value="Yes" if is_positive(text) else "No"
             )

  2. Using in a Scorecard:

     scores:
       MyScore:
         class: MyClassifier
         parameters:
           threshold: 0.8

  3. Training a model:

     classifier = MyClassifier()
     classifier.train_model()
     classifier.evaluate_model()
     classifier.save_model()

  4. Making predictions:

     result = classifier.predict(context, Score.Input(
         text="content to classify",
         metadata={"source": "email"}
     ))

The Score class is designed to be extended for different classification approaches while maintaining a consistent interface for use in Scorecards and Evaluations.

Initialize the Score instance with the given parameters.

Parameters

**parameters : dict

Arbitrary keyword arguments that are used to initialize the Parameters instance.

Raises

ValidationError

If the provided parameters do not pass validation.

class Input(*, text: str, metadata: dict = {}, results: List[Any] | None = None)

Bases: BaseModel

Standard input structure for all Score classifications in Plexus.

The Input class standardizes how content is passed to Score classifiers, supporting both the content itself and contextual metadata. It’s used by:

  • Individual Score predictions

  • Batch processing jobs

  • Evaluation runs

  • Dashboard API integrations

Attributes:

text: The content to classify. Can be a transcript, document, etc.

metadata: Additional context like source, timestamps, or tracking IDs.

results: Optional list of previous classification results, used for:

  • Composite scores that depend on other scores

  • Multi-step classification pipelines

  • Score dependency resolution

Common usage:

  1. Basic classification:

     input = Score.Input(
         text="content to classify",
         metadata={"source": "phone_call"}
     )

  2. With dependencies:

     input = Score.Input(
         text="content to classify",
         metadata={"source": "phone_call"},
         results=[{
             "name": "PriorScore",
             "result": prior_score_result
         }]
     )

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

metadata: dict
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

results: List[Any] | None
text: str
class Parameters(*, scorecard_name: str | None = None, name: str | None = None, id: str | int | None = None, key: str | None = None, dependencies: List[dict] | None = None, data: dict | None = None, number_of_classes: int | None = None, label_score_name: str | None = None, label_field: str | None = None)

Bases: BaseModel

Parameters required for scoring.

Attributes

data : dict

Dictionary containing data-related parameters.

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.
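
A minimal construction sketch (the field values shown are illustrative, not defaults):

    params = Score.Parameters(
        scorecard_name="QA Scorecard",
        name="Greeting",
        number_of_classes=2
    )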

classmethod convert_data_percentage(value)

Convert the percentage value in the data dictionary to a float.

Parameters

value : dict

Dictionary containing data-related parameters.

Returns

dict

Updated dictionary with the percentage value converted to float.
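
For illustration, a sketch of the normalization this validator performs (the accepted input formats are an assumption):

    params = Score.Parameters(data={"percentage": "10%"})
    # The validator coerces the percentage entry in `data` to a float
    # (e.g. 10.0) during model validation.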

data: dict | None
dependencies: List[dict] | None
id: str | int | None
key: str | None
label_field: str | None
label_score_name: str | None
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None
number_of_classes: int | None
scorecard_name: str | None
class Result(*, parameters: Parameters, value: str | bool, explanation: str | None = None, confidence: float | None = None, metadata: dict = {}, error: str | None = None, code: str | None = None)

Bases: BaseModel

Standard output structure for all Score classifications in Plexus.

The Result class provides a consistent way to represent classification outcomes, supporting both simple yes/no results and complex multi-class classifications with explanations. It’s used throughout Plexus for:

  • Individual Score results

  • Batch processing outputs

  • Evaluation metrics

  • Dashboard result tracking

Attributes:

parameters: Configuration used for this classification

value: The classification result (e.g., "Yes"/"No" or class label)

explanation: Detailed explanation of why this result was chosen

confidence: Confidence score for the classification (0.0 to 1.0)

metadata: Additional context about the classification

error: Optional error message if classification failed

The Result class provides helper methods for common operations:

  • is_yes(): Check if result is affirmative

  • is_no(): Check if result is negative

  • __eq__: Compare results (case-insensitive)
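
A quick sketch of these helpers in use (the values shown are illustrative):

    result = Score.Result(parameters=self.parameters, value="Yes")
    result.is_yes()   # True
    result.is_no()    # False
    result == "yes"   # True, via the case-insensitive comparison noted above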

Common usage:

  1. Basic classification with explanation:

     result = Score.Result(
         parameters=self.parameters,
         value="Yes",
         explanation="Clear greeting found at beginning of transcript",
         confidence=0.95
     )

  2. Classification with metadata:

     result = Score.Result(
         parameters=self.parameters,
         value="No",
         explanation="No greeting found in transcript",
         confidence=0.88,
         metadata={"source": "phone_call"}
     )

  3. Error case:

     result = Score.Result(
         parameters=self.parameters,
         value="ERROR",
         error="API timeout"
     )

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

code: str | None
confidence: float | None
property confidence_from_metadata: float | None

Backwards compatibility: confidence from metadata

error: str | None
explanation: str | None
property explanation_from_metadata: str | None

Backwards compatibility: explanation from metadata

is_no()
is_yes()
metadata: dict
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

parameters: Score.Parameters
value: str | bool
exception SkippedScoreException(score_name: str, reason: str)

Bases: Exception

Raised when a score is skipped due to dependency conditions not being met.

__init__(score_name: str, reason: str)
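
A hedged sketch of raising this exception (the score name and reason are illustrative):

    raise Score.SkippedScoreException(
        score_name="FollowUp",
        reason="Dependency condition not met"
    )
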
__init__(**parameters)

Initialize the Score instance with the given parameters.

Parameters

**parameters : dict

Arbitrary keyword arguments that are used to initialize the Parameters instance.

Raises

ValidationError

If the provided parameters do not pass validation.

static apply_processors_to_text(text: str, processors_config: list) → str

Apply a list of processors to text for production predictions.

This method applies the same processor pipeline used during training/evaluation to production prediction inputs. It creates a single-row DataFrame, applies all configured processors, and returns the processed text.

Args:

  text: Input text to process

  processors_config: List of processor configurations, each with 'class' and optional 'parameters'

Returns:

  Processed text after applying all processors

Example:

    processors = [
        {'class': 'FilterCustomerOnlyProcessor'},
        {'class': 'RemoveSpeakerIdentifiersTranscriptFilter'}
    ]
    processed_text = Score.apply_processors_to_text(raw_text, processors)

evaluate_model()

Evaluate the model on the validation data.

Returns

dict

Dictionary containing evaluation metrics.

classmethod from_name(scorecard, score)
get_accumulated_costs()

Get the expenses that have been accumulated over all the computed elements.

Returns:

dict: Aggregated cost information with totals and components
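
An illustrative call (the exact keys of the returned dict depend on the implementation):

    costs = classifier.get_accumulated_costs()
    # 'costs' aggregates total and per-component expenses for all
    # computed elements.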

get_label_score_name()

Determine the appropriate score name based on the parameters.

Returns

str

The determined score name.

property is_multi_class

Determine if the classification problem is multi-class.

This property checks the unique labels in the dataframe to determine if the problem is multi-class.

Returns

bool

True if the problem is multi-class, False otherwise.

is_relevant(text)

Determine if the given text is relevant using the predict method.

Parameters

text : str

The text to be classified.

Returns

bool

True if the text is classified as relevant, False otherwise.
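
A minimal usage sketch (the sample text is illustrative):

    if classifier.is_relevant("Hello, thanks for calling support."):
        print("Relevant")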

classmethod load(scorecard_identifier: str, score_name: str, use_cache: bool = True, yaml_only: bool = False)

Load a single score configuration with configurable caching behavior.

Args:

scorecard_identifier: A string that identifies the scorecard (ID, name, key, or external ID)

score_name: Name of the specific score to load

use_cache: If True (default), cache API data to local YAML files. If False, don’t cache.

yaml_only: If True, load only from local YAML files without API calls.

Returns:

Score: An initialized Score instance

Raises:

ValueError: If the score cannot be loaded
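
A usage sketch (the scorecard identifier and score name are illustrative):

    score = Score.load(
        scorecard_identifier="my-scorecard",
        score_name="Greeting",
        yaml_only=True
    )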

static log_validation_errors(error: ValidationError)

Log validation errors for the parameters.

Parameters

error : ValidationError

The validation error object containing details about the validation failures.

model_directory_path()
property number_of_classes

Determine the number of classes for the classification problem.

This property checks the unique labels in the dataframe to determine the number of classes.

Returns

int

The number of unique classes.

abstractmethod predict(context, model_input: Input) → Result | List[Result]

Make predictions on the input data.

Parameters

context : Any

Context for the prediction.

model_input : Score.Input

The input data for making predictions.

Returns

Union[Score.Result, List[Score.Result]]

Either a single Score.Result or a list of Score.Results

predict_validation()

Predict on the validation set.

This method should be implemented by subclasses to provide the prediction logic on the validation set.

record_configuration(configuration)

Record the provided configuration dictionary as a JSON file in the appropriate report folder for this model.

Parameters

configuration : dict

Dictionary containing the configuration to be recorded.

register_model()

Register the model with the model registry.

report_directory_path()
report_file_name(file_name)

Generate the full path for a report file within the report directory.

Calling this function also ensures that the report directory exists.

Parameters

file_name : str

The name of the report file.

Returns

str

The full path to the report file with spaces replaced by underscores.
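
A usage sketch (the file name is illustrative):

    path = classifier.report_file_name("confusion matrix.png")
    # Full path inside the report directory, with spaces replaced by
    # underscores (e.g. ending in "confusion_matrix.png").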

save_model()

Save the model to the model registry.

setup_label_map(labels)

Set up a mapping from labels to integers.

Parameters

labels : list

List of unique labels.
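
A minimal sketch (the label set is illustrative):

    classifier.setup_label_map(["No", "Yes"])
    # Each unique label is mapped to an integer index.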

train_model(X_train, y_train, X_val, y_val)

Train the XGBoost model with the specified positive class weight.

Parameters

X_train : numpy.ndarray

Training data features.

y_train : numpy.ndarray

Training data labels.

X_val : numpy.ndarray

Validation data features.

y_val : numpy.ndarray

Validation data labels.
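
A usage sketch with toy NumPy arrays (the shapes and the `classifier` instance are illustrative):

    import numpy as np

    X_train = np.random.rand(100, 8)
    y_train = np.random.randint(0, 2, size=100)
    X_val = np.random.rand(20, 8)
    y_val = np.random.randint(0, 2, size=20)

    classifier.train_model(X_train, y_train, X_val, y_val)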