plexus.data.DataCache module
- class plexus.data.DataCache.DataCache(**parameters)
Bases: ABC
A data cache is responsible for loading data from a source and caching it locally. This is an abstract base class that defines the interface and the parameter validation schema. Subclasses are responsible for implementing the actual data loading logic. Most subclasses will also need to extend the Parameters class to define any necessary parameters for getting the data.
Initialize the DataCache instance with the given parameters.
Parameters
- **parameters : dict
Arbitrary keyword arguments that are used to initialize the Parameters instance.
Raises
- ValidationError
If the provided parameters do not pass validation.
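The subclassing pattern described above can be sketched as follows. This is a minimal, dependency-free stand-in and not the library's implementation: `DataCacheSketch`, `InMemoryCache`, and the `rows` parameter are hypothetical names, a dataclass takes the place of the pydantic `BaseModel`, a `ValueError` takes the place of pydantic's `ValidationError`, and a list of dicts takes the place of the pandas DataFrame.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class DataCacheSketch(ABC):
    """Hypothetical stand-in for plexus.data.DataCache.DataCache."""

    @dataclass
    class Parameters:
        # The real Parameters class is a pydantic BaseModel; a dataclass
        # is used here only so the sketch runs without extra packages.
        class_name: str = "DataCache"

    def __init__(self, **parameters):
        # Keyword arguments are forwarded to the nested Parameters class,
        # so a subclass that overrides Parameters gets its own validation.
        self.parameters = self.Parameters(**parameters)

    @abstractmethod
    def load_dataframe(self, *args, **kwargs):
        """Subclasses implement the actual loading logic here."""


class InMemoryCache(DataCacheSketch):
    """Hypothetical subclass that extends Parameters and loads data."""

    @dataclass
    class Parameters(DataCacheSketch.Parameters):
        class_name: str = "InMemoryCache"
        rows: int = 0  # hypothetical parameter, not part of the real API

        def __post_init__(self):
            # Stand-in for pydantic validation.
            if self.rows < 0:
                raise ValueError("rows must be non-negative")

    def load_dataframe(self, *args, **kwargs):
        # A real subclass would return a pd.DataFrame; a list of dicts
        # keeps this sketch free of third-party imports.
        return [{"row": i} for i in range(self.parameters.rows)]


cache = InMemoryCache(rows=2)
print(cache.parameters.class_name)
print(cache.load_dataframe())
```

Because `load_dataframe` is abstract, instantiating the base class directly raises `TypeError`; only subclasses that implement it can be constructed.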
- class Parameters(*, class_name: str = 'DataCache')
Bases: BaseModel
Parameters for data caching. Override this class to define any necessary parameters for getting the data.
Attributes
- class_name : str
The name of the data cache class.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- class_name: str
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model. It should be a dictionary conforming to pydantic.config.ConfigDict.
- __init__(**parameters)
Initialize the DataCache instance with the given parameters.
Parameters
- **parameters : dict
Arbitrary keyword arguments that are used to initialize the Parameters instance.
Raises
- ValidationError
If the provided parameters do not pass validation.
- debug_dataframe(df, context='DATAFRAME', logger=None)
Essential debug logging for dataframes.
- Args:
df: pandas DataFrame to analyze
context: String identifier for the context (used in log messages)
logger: Logger instance to use (defaults to module logger)
- static format_value_for_display(value, column_name=None)
Format a value for display in logs and debug output.
- Args:
value: The value to format
column_name: Optional column name (unused but kept for API compatibility)
- Returns:
str: Formatted string representation of the value
- abstractmethod load_dataframe(*args, **kwargs)
Load a dataframe based on the provided parameters.
Returns
- pd.DataFrame
The loaded dataframe.
This method must be implemented by all subclasses.
- log_validation_errors()
Log validation errors for the parameters.
Parameters
- error : ValidationError
The validation error object containing details about the validation failures.
- upsert_item_for_dataset_row(dashboard_client, account_id, item_data, identifiers_dict, external_id=None, score_id=None)
Common method to upsert an Item for a dataset row. This centralizes Item creation logic that was duplicated across subclasses.
- Args:
dashboard_client: PlexusDashboardClient instance
account_id: The account ID
item_data: Dict or object containing item information (id, description, etc.)
identifiers_dict: Dict of identifier key-value pairs for the item
external_id: Optional external ID for the item (defaults to item_data.id)
score_id: Optional score ID to associate with the Item
- Returns:
Tuple[str, bool, str]: (item_id, was_created, error_msg)
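A caller will typically unpack the returned tuple before acting on it. The sketch below shows one way to do that. `summarize_upsert` is a hypothetical helper, the tuples are hand-written rather than real return values, and two points are assumptions not stated above: that a falsy `error_msg` (None or an empty string) signals success, and that `was_created` being False means an existing Item was updated.

```python
from typing import Optional, Tuple


def summarize_upsert(result: Tuple[str, bool, Optional[str]]) -> str:
    """Turn an (item_id, was_created, error_msg) tuple into a log line."""
    item_id, was_created, error_msg = result
    # Assumption: a falsy error_msg (None or "") signals success.
    if error_msg:
        return f"upsert failed: {error_msg}"
    # Assumption: when nothing was created, an existing Item was updated.
    action = "created" if was_created else "updated"
    return f"item {item_id} {action}"


# Hand-written tuples standing in for real return values.
print(summarize_upsert(("item-1", True, None)))
print(summarize_upsert(("item-1", False, None)))
print(summarize_upsert(("", False, "missing identifiers")))
```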