plexus.data.HuggingFaceDataCache module
- class plexus.data.HuggingFaceDataCache.HuggingFaceDataCache(**parameters)
Bases: DataCache
A class to load and cache datasets from Hugging Face.
Initialize the DataCache instance with the given parameters.
Parameters
- **parameters: dict
Arbitrary keyword arguments that are used to initialize the Parameters instance.
Raises
- ValidationError
If the provided parameters do not pass validation.
- class Parameters(*, class_name: str = 'DataCache', name: str)
Bases: Parameters
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; this should be a dictionary conforming to pydantic.config.ConfigDict.
- name: str
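The validation behaviour described above can be illustrated with a small stand-in model. This is a minimal sketch, assuming pydantic is installed; `ParametersSketch` is a hypothetical class that only mirrors the two documented fields, and `"imdb"` is a placeholder value, not a name taken from this module.

```python
from pydantic import BaseModel, ValidationError


class ParametersSketch(BaseModel):
    """Hypothetical stand-in mirroring the documented fields."""

    class_name: str = "DataCache"  # optional, default taken from the signature
    name: str                      # required, no default


# Supplying only the required field picks up the default class_name.
params = ParametersSketch(name="imdb")  # "imdb" is a placeholder value
print(params.class_name, params.name)

# Omitting the required field fails validation.
try:
    ParametersSketch()
except ValidationError as exc:
    print(f"validation failed with {len(exc.errors())} error(s)")
```

Because the documented constructor forwards its keyword arguments to the Parameters instance, a missing or mistyped `name` would presumably surface as the ValidationError listed under Raises.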
- __init__(**parameters)
Initialize the DataCache instance with the given parameters.
Parameters
- **parameters: dict
Arbitrary keyword arguments that are used to initialize the Parameters instance.
Raises
- ValidationError
If the provided parameters do not pass validation.
- analyze_dataset(df: DataFrame)
Display basic analysis of the loaded DataFrame.
- load_dataframe(*args, **kwargs)
Load a dataframe based on the provided parameters.
Returns
- pd.DataFrame
The loaded dataframe.
This method must be implemented by all subclasses.
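One common way to enforce a "must be implemented by all subclasses" contract is an abstract base class. The sketch below assumes pandas is installed and uses hypothetical names (`DataCacheSketch`, `InMemoryCache`); whether the real base class relies on `abc` is not stated here.

```python
from abc import ABC, abstractmethod

import pandas as pd


class DataCacheSketch(ABC):
    """Hypothetical stand-in for a base class with an abstract loader."""

    @abstractmethod
    def load_dataframe(self, *args, **kwargs) -> pd.DataFrame:
        """Return the loaded dataframe; every subclass must override this."""


class InMemoryCache(DataCacheSketch):
    """Hypothetical subclass that builds a DataFrame from in-memory records."""

    def __init__(self, records):
        self.records = records

    def load_dataframe(self, *args, **kwargs) -> pd.DataFrame:
        return pd.DataFrame(self.records)


cache = InMemoryCache([{"text": "good", "label": 1}, {"text": "bad", "label": 0}])
df = cache.load_dataframe()
print(df.shape)
```

With this pattern, instantiating the base class directly raises a TypeError, so a subclass that forgets to define `load_dataframe` fails early rather than at load time.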
- verify_dataset(df: DataFrame)
Perform basic verification checks on the loaded DataFrame and return the results both in a human-readable format and as a dictionary for testing.
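The specific checks are not listed in this reference, so the following is only an illustrative sketch of the "readable report plus dictionary" pattern. It assumes pandas is installed; the function name `verify_dataset_sketch` and the four checks are hypothetical choices, not the method's actual behaviour.

```python
import pandas as pd


def verify_dataset_sketch(df: pd.DataFrame) -> dict:
    """Hypothetical checks; the real method's checks are not documented here."""
    results = {
        "row_count": len(df),
        "column_count": len(df.columns),
        "missing_values": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }
    # Print a readable report for humans.
    for check, value in results.items():
        print(f"{check:>15}: {value}")
    # Return the same data as a dictionary for test assertions.
    return results


df = pd.DataFrame({"text": ["a", "b", "b", None], "label": [1, 0, 0, 1]})
results = verify_dataset_sketch(df)
```

Returning plain integers in a dictionary keeps the results easy to compare in unit tests, while the printed report serves interactive use.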