plexus.scores.Score module

class plexus.scores.Score.Score(**parameters)

Bases: ABC, ScoreData, ScoreVisualization

Abstract base class for implementing classification and scoring models in Plexus.

Score is the fundamental building block of classification in Plexus. Each Score represents a specific classification task and can be implemented using various approaches:

  • Machine learning models (e.g., DeepLearningSemanticClassifier)

  • LLM-based classification (e.g., LangGraphScore)

  • Rule-based systems (e.g., KeywordClassifier)

  • Custom logic (by subclassing Score)

The Score class provides:

  • Standard input/output interfaces using Pydantic models

  • Visualization tools for model performance

  • Cost tracking for API-based models

  • Metrics computation and logging

Common usage patterns:

  1. Creating a custom classifier:

     class MyClassifier(Score):
         def predict(self, context, model_input: Score.Input) -> Score.Result:
             text = model_input.text
             # Custom classification logic here
             return Score.Result(
                 parameters=self.parameters,
                 value="Yes" if is_positive(text) else "No"
             )

  2. Using in a Scorecard:

     scores:
       MyScore:
         class: MyClassifier
         parameters:
           threshold: 0.8

  3. Training a model:

     classifier = MyClassifier()
     classifier.train_model()
     classifier.evaluate_model()
     classifier.save_model()

  4. Making predictions:

     result = classifier.predict(context, Score.Input(
         text="content to classify",
         metadata={"source": "email"}
     ))

The Score class is designed to be extended for different classification approaches while maintaining a consistent interface for use in Scorecards and Evaluations.

Initialize the Score instance with the given parameters.

Parameters

**parameters : dict

Arbitrary keyword arguments that are used to initialize the Parameters instance.

Raises

ValidationError

If the provided parameters do not pass validation.

class Input(*, text: str, metadata: dict = {}, results: List[Any] | None = None)

Bases: BaseModel

Standard input structure for all Score classifications in Plexus.

The Input class standardizes how content is passed to Score classifiers, supporting both the content itself and contextual metadata. It’s used by:

  • Individual Score predictions

  • Batch processing jobs

  • Evaluation runs

  • Dashboard API integrations

Attributes:

text: The content to classify. Can be a transcript, document, etc.

metadata: Additional context like source, timestamps, or tracking IDs.

results: Optional list of previous classification results, used for:

  • Composite scores that depend on other scores

  • Multi-step classification pipelines

  • Score dependency resolution

Common usage:

  1. Basic classification:

     input = Score.Input(
         text="content to classify",
         metadata={"source": "phone_call"}
     )

  2. With dependencies:

     input = Score.Input(
         text="content to classify",
         metadata={"source": "phone_call"},
         results=[{
             "name": "PriorScore",
             "result": prior_score_result
         }]
     )

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

metadata: dict
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

results: List[Any] | None
text: str
class Parameters(*, scorecard_name: str | None = None, name: str | None = None, id: str | int | None = None, key: str | None = None, dependencies: List[dict] | None = None, data: dict | None = None, number_of_classes: int | None = None, label_score_name: str | None = None, label_field: str | None = None)

Bases: BaseModel

Parameters required for scoring.

Attributes

data : dict

Dictionary containing data-related parameters.

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.
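
A minimal construction sketch (the field values shown are illustrative, not defaults):

    params = Score.Parameters(
        scorecard_name="QA Scorecard",
        name="Greeting",
        number_of_classes=2
    )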

classmethod convert_data_percentage(value)

Convert the percentage value in the data dictionary to a float.

Parameters

value : dict

Dictionary containing data-related parameters.

Returns

dict

Updated dictionary with the percentage value converted to float.
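
For illustration, a sketch of the normalization this validator performs (the accepted input formats are an assumption):

    params = Score.Parameters(data={"percentage": "10%"})
    # The validator coerces the percentage entry in `data` to a float
    # (e.g. 10.0) during model validation.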

data: dict | None
dependencies: List[dict] | None
id: str | int | None
key: str | None
label_field: str | None
label_score_name: str | None
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

name: str | None
number_of_classes: int | None
scorecard_name: str | None
class Result(*, parameters: Parameters, value: str | bool, explanation: str | None = None, confidence: float | None = None, metadata: dict = {}, error: str | None = None, code: str | None = None)

Bases: BaseModel

Standard output structure for all Score classifications in Plexus.

The Result class provides a consistent way to represent classification outcomes, supporting both simple yes/no results and complex multi-class classifications with explanations. It’s used throughout Plexus for:

  • Individual Score results

  • Batch processing outputs

  • Evaluation metrics

  • Dashboard result tracking

Attributes:

parameters: Configuration used for this classification

value: The classification result (e.g., "Yes"/"No" or class label)

explanation: Detailed explanation of why this result was chosen

confidence: Confidence score for the classification (0.0 to 1.0)

metadata: Additional context about the classification

error: Optional error message if classification failed

The Result class provides helper methods for common operations:

  • is_yes(): Check if result is affirmative

  • is_no(): Check if result is negative

  • __eq__: Compare results (case-insensitive)
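
A quick sketch of these helpers in use (the values shown are illustrative):

    result = Score.Result(parameters=self.parameters, value="Yes")
    result.is_yes()   # True
    result.is_no()    # False
    result == "yes"   # True, via the case-insensitive comparison noted above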

Common usage:

  1. Basic classification with explanation:

     result = Score.Result(
         parameters=self.parameters,
         value="Yes",
         explanation="Clear greeting found at beginning of transcript",
         confidence=0.95
     )

  2. Classification with metadata:

     result = Score.Result(
         parameters=self.parameters,
         value="No",
         explanation="No greeting found in transcript",
         confidence=0.88,
         metadata={"source": "phone_call"}
     )

  3. Error case:

     result = Score.Result(
         parameters=self.parameters,
         value="ERROR",
         error="API timeout"
     )

Create a new model by parsing and validating input data from keyword arguments.

Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

code: str | None
confidence: float | None
property confidence_from_metadata: float | None

Backwards compatibility: confidence from metadata

error: str | None
explanation: str | None
property explanation_from_metadata: str | None

Backwards compatibility: explanation from metadata

is_no()
is_yes()
metadata: dict
model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

parameters: Score.Parameters
value: str | bool
exception SkippedScoreException(score_name: str, reason: str)

Bases: Exception

Raised when a score is skipped due to dependency conditions not being met.

__init__(score_name: str, reason: str)
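
A hedged sketch of raising this exception (the score name and reason are illustrative):

    raise Score.SkippedScoreException(
        score_name="FollowUp",
        reason="Dependency condition not met"
    )
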
__init__(**parameters)

Initialize the Score instance with the given parameters.

Parameters

**parameters : dict

Arbitrary keyword arguments that are used to initialize the Parameters instance.

Raises

ValidationError

If the provided parameters do not pass validation.

static apply_processors_to_text(text: str, processors_config: list) → str

Apply a list of processors to text for production predictions.

This method applies the same processor pipeline used during training/evaluation to production prediction inputs. It creates a single-row DataFrame, applies all configured processors, and returns the processed text.

Args:

  text: Input text to process

  processors_config: List of processor configurations, each with 'class' and optional 'parameters'

Returns:

  Processed text after applying all processors

Example:

    processors = [
        {'class': 'FilterCustomerOnlyProcessor'},
        {'class': 'RemoveSpeakerIdentifiersTranscriptFilter'}
    ]
    processed_text = Score.apply_processors_to_text(raw_text, processors)

evaluate_model()

Evaluate the model on the validation data.

Returns

dict

Dictionary containing evaluation metrics.

classmethod from_name(scorecard, score)
get_accumulated_costs()

Get the expenses that have been accumulated over all the computed elements.

Returns:

dict: Aggregated cost information with totals and components
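
An illustrative call (the exact keys of the returned dict depend on the implementation):

    costs = classifier.get_accumulated_costs()
    # 'costs' aggregates total and per-component expenses for all
    # computed elements.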

get_label_score_name()

Determine the appropriate score name based on the parameters.

Returns

str

The determined score name.

property is_multi_class

Determine if the classification problem is multi-class.

This property checks the unique labels in the dataframe to determine if the problem is multi-class.

Returns

bool

True if the problem is multi-class, False otherwise.

is_relevant(text)

Determine if the given text is relevant using the predict method.

Parameters

text : str

The text to be classified.

Returns

bool

True if the text is classified as relevant, False otherwise.
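
A minimal usage sketch (the sample text is illustrative):

    if classifier.is_relevant("Hello, thanks for calling support."):
        print("Relevant")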

classmethod load(scorecard_identifier: str, score_name: str, use_cache: bool = True, yaml_only: bool = False)

Load a single score configuration with configurable caching behavior.

Args:

scorecard_identifier: A string that identifies the scorecard (ID, name, key, or external ID)

score_name: Name of the specific score to load

use_cache: If True (default), cache API data to local YAML files. If False, don’t cache.

yaml_only: If True, load only from local YAML files without API calls.

Returns:

Score: An initialized Score instance

Raises:

ValueError: If the score cannot be loaded
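
A usage sketch (the scorecard identifier and score name are illustrative):

    score = Score.load(
        scorecard_identifier="my-scorecard",
        score_name="Greeting",
        yaml_only=True
    )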

static log_validation_errors(error: ValidationError)

Log validation errors for the parameters.

Parameters

error : ValidationError

The validation error object containing details about the validation failures.

model_directory_path()
property number_of_classes

Determine the number of classes for the classification problem.

This property checks the unique labels in the dataframe to determine the number of classes.

Returns

int

The number of unique classes.

abstractmethod predict(context, model_input: Input) → Result | List[Result]

Make predictions on the input data.

Parameters

context : Any

Context for the prediction.

model_input : Score.Input

The input data for making predictions.

Returns

Union[Score.Result, List[Score.Result]]

Either a single Score.Result or a list of Score.Results

predict_validation()

Predict on the validation set.

This method should be implemented by subclasses to provide the prediction logic on the validation set.

record_configuration(configuration)

Record the provided configuration dictionary as a JSON file in the appropriate report folder for this model.

Parameters

configuration : dict

Dictionary containing the configuration to be recorded.

register_model()

Register the model with the model registry.

report_directory_path()
report_file_name(file_name)

Generate the full path for a report file within the report directory.

Calling this function also ensures that the report directory exists.

Parameters

file_name : str

The name of the report file.

Returns

str

The full path to the report file with spaces replaced by underscores.
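
A usage sketch (the file name is illustrative):

    path = classifier.report_file_name("confusion matrix.png")
    # Full path inside the report directory, with spaces replaced by
    # underscores (e.g. ending in "confusion_matrix.png").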

save_model()

Save the model to the model registry.

setup_label_map(labels)

Set up a mapping from labels to integers.

Parameters

labels : list

List of unique labels.
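
A minimal sketch (the label set is illustrative):

    classifier.setup_label_map(["No", "Yes"])
    # Each unique label is mapped to an integer index.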

train_model(X_train, y_train, X_val, y_val)

Train the XGBoost model with the specified positive class weight.

Parameters

X_train : numpy.ndarray

Training data features.

y_train : numpy.ndarray

Training data labels.

X_val : numpy.ndarray

Validation data features.

y_val : numpy.ndarray

Validation data labels.
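
A usage sketch with toy NumPy arrays (the shapes and the `classifier` instance are illustrative):

    import numpy as np

    X_train = np.random.rand(100, 8)
    y_train = np.random.randint(0, 2, size=100)
    X_val = np.random.rand(20, 8)
    y_val = np.random.randint(0, 2, size=20)

    classifier.train_model(X_train, y_train, X_val, y_val)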