plexus.scores.FastTextClassifier module

class plexus.scores.FastTextClassifier.FastTextClassifier(**parameters)

Bases: Score

Initialize the Score instance with the given parameters.

Parameters

**parameters (dict): Arbitrary keyword arguments used to initialize the Parameters instance.

Raises

ValidationError: If the provided parameters do not pass validation.

class Parameters(*, scorecard_name: str | None = None, name: str | None = None, id: str | int | None = None, key: str | None = None, dependencies: List[dict] | None = None, data: dict | None = None, number_of_classes: int | None = None, label_score_name: str | None = None, label_field: str | None = None, learning_rate: float = 0.1, dimension: int = 100, window_size: int = 5, number_of_epochs: int = 5, minimum_word_count: int = 1, minimum_label_count: int = 1, minimum_character_ngram_length: int = 0, maximum_character_ngram_length: int = 0, number_of_negative_samples: int = 5, word_ngram_count: int = 1, loss_function: str = 'softmax', bucket_size: int = 2000000, number_of_threads: int = 4, learning_rate_update_rate: int = 100, sampling_threshold: float = 0.0001)

Bases: Parameters

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

bucket_size: int

dimension: int

learning_rate: float

learning_rate_update_rate: int

loss_function: str

maximum_character_ngram_length: int

minimum_character_ngram_length: int

minimum_label_count: int

minimum_word_count: int

model_config: ClassVar[ConfigDict] = {'protected_namespaces': ()}: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

number_of_epochs: int

number_of_negative_samples: int

number_of_threads: int

sampling_threshold: float

window_size: int

word_ngram_count: int
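
The field names above read like spelled-out versions of the keyword arguments accepted by fastText's `train_supervised`, and their defaults match fastText's supervised defaults. The following is a minimal sketch of how such a translation could look. The mapping table and the helper `to_fasttext_arguments` are hypothetical; whether FastTextClassifier performs this exact renaming internally is an assumption, not something this page confirms.

```python
# Hypothetical mapping from Parameters field names to the keyword names
# accepted by fasttext.train_supervised. That FastTextClassifier translates
# its fields this way is an assumption made for illustration.
FASTTEXT_ARGUMENT_NAMES = {
    "learning_rate": "lr",
    "dimension": "dim",
    "window_size": "ws",
    "number_of_epochs": "epoch",
    "minimum_word_count": "minCount",
    "minimum_label_count": "minCountLabel",
    "minimum_character_ngram_length": "minn",
    "maximum_character_ngram_length": "maxn",
    "number_of_negative_samples": "neg",
    "word_ngram_count": "wordNgrams",
    "loss_function": "loss",
    "bucket_size": "bucket",
    "number_of_threads": "thread",
    "learning_rate_update_rate": "lrUpdateRate",
    "sampling_threshold": "t",
}


def to_fasttext_arguments(parameters: dict) -> dict:
    """Rename known hyperparameter fields and drop every other key."""
    return {
        FASTTEXT_ARGUMENT_NAMES[name]: value
        for name, value in parameters.items()
        if name in FASTTEXT_ARGUMENT_NAMES
    }


# "name" is not a hyperparameter, so it is dropped from the result.
arguments = to_fasttext_arguments(
    {"name": "example", "learning_rate": 0.1, "number_of_epochs": 5, "word_ngram_count": 2}
)
print(arguments)
```

Keeping the translation in one table makes it easy to see which fields are hyperparameters and which (such as `name` or `scorecard_name`) are bookkeeping fields.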

__init__(**parameters)

Initialize the Score instance with the given parameters.

Parameters

**parameters (dict): Arbitrary keyword arguments used to initialize the Parameters instance.

Raises

ValidationError: If the provided parameters do not pass validation.

data_filename()

get_model_artifact_path()

load_context(context)

Load the trained model and any necessary artifacts based on the MLflow context.

Parameters

context (mlflow.pyfunc.PythonModelContext): The context object containing artifacts and other information.
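
An MLflow `PythonModelContext` exposes an `artifacts` dictionary that maps artifact names to local file paths. The sketch below shows how a model path might be looked up from it. The helper `artifact_path`, the artifact key `"model"`, and the example path are all hypothetical, and a `SimpleNamespace` stands in for the real context so the example runs without MLflow installed.

```python
from types import SimpleNamespace


def artifact_path(context, artifact_name: str) -> str:
    """Return the local path recorded for one named artifact."""
    try:
        return context.artifacts[artifact_name]
    except KeyError:
        raise KeyError(f"artifact {artifact_name!r} not found in context") from None


# Stand-in for mlflow.pyfunc.PythonModelContext. The key "model" and the
# path are made up for illustration; the real names are not given here.
context = SimpleNamespace(artifacts={"model": "/tmp/example/model.bin"})
print(artifact_path(context, "model"))
```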

load_model(model_path)

predict(model_input, text_column='text')

Make predictions on the input data.

Parameters

context (Any): Context for the prediction (e.g., MLflow context).
model_input (Score.Input): The input data for making predictions.

Returns

Union[Score.Result, List[Score.Result]]: Either a single Score.Result or a list of Score.Results
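
For orientation, a fastText model's own `predict` call returns a pair: a tuple of labels carrying the `__label__` prefix and a matching sequence of probabilities. The sketch below shows one way to turn that pair into plain (label, probability) tuples. The helper `parse_prediction` is hypothetical, and how FastTextClassifier actually builds its Score.Result objects is not stated on this page.

```python
def parse_prediction(labels, probabilities, prefix="__label__"):
    """Pair each label with its probability, stripping the fastText prefix."""
    results = []
    for label, probability in zip(labels, probabilities):
        # Remove the prefix only when it is present.
        name = label[len(prefix):] if label.startswith(prefix) else label
        results.append((name, float(probability)))
    return results


# Example input shaped like the output of a fastText predict call with k=2.
parsed = parse_prediction(("__label__yes", "__label__no"), [0.9, 0.1])
print(parsed)
```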

predict_validation()

Predict on the validation set.

This method should be implemented by subclasses to provide the prediction logic on the validation set.

process_data(): Handle any pre-processing of the training data, including the training/validation splits.

register_model(): Register the model with the model registry.

save_model(): Save the model to the model registry.

save_model_binary()

train_model()

Train the fastText model using the hyperparameters defined in Parameters.

Parameters

X_train (numpy.ndarray): Training data features.
y_train (numpy.ndarray): Training data labels.
X_val (numpy.ndarray): Validation data features.
y_val (numpy.ndarray): Validation data labels.