plexus.input_sources.InputSource module

class plexus.input_sources.InputSource.InputSource(pattern: str = None, **options)

Bases: ABC

Base class for input sources that extract text from various sources. Input sources run BEFORE processors in the pipeline.

Args:

pattern: Regex pattern to match attachments (used by file-based sources)
**options: Additional source-specific options

__init__(pattern: str = None, **options)
Args:

pattern: Regex pattern to match attachments (used by file-based sources)
**options: Additional source-specific options

abstractmethod extract(item, default_text: str) -> str

Extract text from the specified source.

Args:

item: Item object (may have attachedFiles)
default_text: Fallback text from item.text

Returns:

Extracted text string
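A minimal sketch of how a subclass might implement extract(), falling back to default_text when the source yields nothing. The InputSource class below is a stand-in that only mirrors the documented signatures, not the real plexus class. The attribute names self.pattern and self.options, the MetadataNoteSource class, the field option, and the item.metadata attribute are all assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace


class InputSource(ABC):
    """Stand-in mirroring the documented interface (not the real plexus class)."""

    def __init__(self, pattern: str = None, **options):
        # Attribute names are assumed; the documentation only shows the signature.
        self.pattern = pattern
        self.options = options

    @abstractmethod
    def extract(self, item, default_text: str) -> str:
        ...


class MetadataNoteSource(InputSource):
    """Hypothetical source that reads text from a metadata field on the item."""

    def extract(self, item, default_text: str) -> str:
        # "field" is a made-up source-specific option passed through **options.
        field = self.options.get("field", "note")
        metadata = getattr(item, "metadata", None) or {}
        text = metadata.get(field)
        # Fall back to the item's own text when the source has nothing to offer.
        return text if text else default_text


source = MetadataNoteSource(field="transcript")

item = SimpleNamespace(text="fallback body", metadata={"transcript": "hello world"})
print(source.extract(item, item.text))

empty = SimpleNamespace(text="fallback body", metadata={})
print(source.extract(empty, empty.text))
```

The first call prints the metadata value and the second prints the fallback text, which is the behavior the default_text argument suggests.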

find_matching_attachment(item) -> str | None

Find first attachment matching the regex pattern.

Args:

item: Item object with attachedFiles list

Returns:

S3 key path of matching attachment, or None
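The first-match behavior described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the library's implementation: it assumes attachedFiles is a list of S3 key strings, that the pattern is applied with re.search rather than re.match, and that a missing pattern or a missing attachment list yields None.

```python
import re
from types import SimpleNamespace
from typing import Optional


def find_matching_attachment(item, pattern: Optional[str]) -> Optional[str]:
    """Return the first attachment key matching the pattern, or None."""
    if not pattern:
        return None
    regex = re.compile(pattern)
    # Assumed shape: attachedFiles is a list of S3 key strings (or absent).
    for key in getattr(item, "attachedFiles", None) or []:
        if regex.search(key):
            return key
    return None


# Hypothetical item with three attachments; the keys are made up.
item = SimpleNamespace(
    attachedFiles=[
        "items/42/audio.mp3",
        "items/42/transcript.txt",
        "items/42/notes.txt",
    ]
)

first_txt = find_matching_attachment(item, r"\.txt$")
no_pdf = find_matching_attachment(item, r"\.pdf$")
print(first_txt)
print(no_pdf)
```

Because iteration stops at the first hit, the order of attachedFiles decides which key is returned when several attachments match.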