plexus.input_sources package
- class plexus.input_sources.DeepgramInputSource(pattern: str = None, **options)
Bases: TextFileInputSource
Loader-only input source for Deepgram JSON attachments.
Responsibilities:
- find the Deepgram attachment
- download and parse the Deepgram JSON
- return baseline transcript text plus raw Deepgram metadata
Formatting and slicing are intentionally handled by processors.
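The parse step above can be pictured with a minimal sketch. It assumes the attachment follows Deepgram's pre-recorded response layout (`results.channels[].alternatives[].transcript`); the helper name `baseline_transcript` is hypothetical and is not part of `plexus.input_sources`, and the real class may read other fields.

```python
import json
from typing import Tuple


def baseline_transcript(deepgram_json: str) -> Tuple[str, dict]:
    """Return (transcript text, raw parsed payload) from a Deepgram JSON string.

    Hypothetical helper: assumes the standard Deepgram response layout.
    """
    payload = json.loads(deepgram_json)
    channels = payload.get("results", {}).get("channels", [])
    if not channels or not channels[0].get("alternatives"):
        raise ValueError("Deepgram payload has no transcript alternatives")
    # Take the first alternative of the first channel as the baseline text.
    text = channels[0]["alternatives"][0].get("transcript", "")
    return text, payload


sample = json.dumps(
    {"results": {"channels": [{"alternatives": [
        {"transcript": "hello world", "confidence": 0.99}
    ]}]}}
)
text, raw = baseline_transcript(sample)
```

Keeping the raw payload alongside the text is what lets later processors handle formatting and slicing without re-downloading the attachment.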
- Args:
pattern: Regex pattern to match attachments (used by file-based sources)
**options: Additional source-specific options
- extract(item) -> Score.Input
Find and return Score.Input with text from matching attachment.
- Args:
item: Item with attachedFiles
- Returns:
Score.Input with text content from file
- Raises:
ValueError: If no matching attachment is found
Exception: If file download or parsing fails
- class plexus.input_sources.InputSource(pattern: str = None, **options)
Bases: ABC
Base class for input sources that extract text from various sources. Input sources run BEFORE processors in the pipeline.
- Args:
pattern: Regex pattern to match attachments (used by file-based sources)
**options: Additional source-specific options
- __init__(pattern: str = None, **options)
- Args:
pattern: Regex pattern to match attachments (used by file-based sources)
**options: Additional source-specific options
- abstractmethod extract(item) -> Score.Input
Extract Score.Input from the specified source.
This method is the core of the input source pipeline. It takes an Item and produces a Score.Input with text and metadata populated.
- Args:
item: Item object (may have attachedFiles, text, metadata)
- Returns:
Score.Input with text field and metadata populated
- Example:
    from plexus.input_sources import InputSource
    from plexus.scores.Score import Score

    class MyInputSource(InputSource):
        def extract(self, item):
            # Extract text from source
            text = self.get_text_from_source(item)
            # Build metadata (copied so the item's own dict is not mutated)
            metadata = dict(item.metadata or {})
            metadata['source'] = 'MyInputSource'
            # Return Score.Input
            return Score.Input(text=text, metadata=metadata)
- find_matching_attachment(item) -> str | None
Find the first attachment matching the regex pattern.
- Args:
item: Item object with attachedFiles list
- Returns:
S3 key path of matching attachment, or None
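A minimal sketch of this lookup, written as a standalone function so it runs without plexus installed. It assumes `attachedFiles` is a list of S3 key strings and that matching uses `re.search`; the actual method may match differently (for example with `re.match`), so treat this as illustrative only.

```python
import re
from types import SimpleNamespace
from typing import Optional


def find_matching_attachment(item, pattern: str) -> Optional[str]:
    """Return the first attached file key matching the regex, or None."""
    regex = re.compile(pattern)
    # Tolerate items with a missing or empty attachedFiles attribute.
    for key in getattr(item, "attachedFiles", None) or []:
        if regex.search(key):
            return key
    return None


# Stand-in item object with hypothetical S3 keys.
item = SimpleNamespace(
    attachedFiles=[
        "items/42/audio.mp3",
        "items/42/deepgram.json",
        "items/42/transcript.txt",
    ]
)
match = find_matching_attachment(item, r"deepgram.*\.json$")
missing = find_matching_attachment(item, r"\.csv$")
```

Returning `None` instead of raising leaves the decision to the caller, which fits the documented behavior of `extract` raising `ValueError` when nothing matches.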
- class plexus.input_sources.InputSourceFactory
Bases: object
Factory for creating input source instances from class names. Mirrors the ProcessorFactory pattern for consistency.
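The factory's methods are not documented here, so the following is only a sketch of the general name-to-class pattern. `SketchInputSourceFactory`, its `create` method, the `_registry` mapping, and `StubInputSource` are all hypothetical names introduced for illustration.

```python
from typing import Optional


class StubInputSource:
    """Stand-in for a real input source class (hypothetical)."""

    def __init__(self, pattern: Optional[str] = None, **options):
        self.pattern = pattern
        self.options = options


class SketchInputSourceFactory:
    """Illustrative factory: maps class names to classes and instantiates them."""

    _registry = {"StubInputSource": StubInputSource}

    @classmethod
    def create(cls, class_name: str, **kwargs):
        try:
            source_cls = cls._registry[class_name]
        except KeyError:
            raise ValueError(f"Unknown input source: {class_name}") from None
        # Forward pattern and any source-specific options to the constructor.
        return source_cls(**kwargs)


source = SketchInputSourceFactory.create(
    "StubInputSource", pattern=r".*\.txt$", encoding="utf-8"
)
```

Resolving classes by name is what allows an input source to be selected from configuration rather than hard-coded imports.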
- class plexus.input_sources.TextFileInputSource(pattern: str = None, **options)
Bases: InputSource
Extracts raw text from a file attachment matching a pattern.
- Args:
pattern: Regex pattern to match attachments (used by file-based sources)
**options: Additional source-specific options
- extract(item) -> Score.Input
Find and return Score.Input with text from matching attachment.
- Args:
item: Item with attachedFiles
- Returns:
Score.Input with text content from file
- Raises:
ValueError: If no matching attachment is found
Exception: If file download or parsing fails
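The extract contract described above (match an attachment, load its text, raise `ValueError` when nothing matches) might look roughly like the sketch below. It is self-contained rather than using plexus: `SketchInput` stands in for `Score.Input`, and the `fetch` callable stands in for the download step, whose real implementation is not documented here.

```python
import re
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Callable


@dataclass
class SketchInput:
    """Stand-in for Score.Input (hypothetical)."""

    text: str
    metadata: dict = field(default_factory=dict)


def extract_text(item, pattern: str, fetch: Callable[[str], str]) -> SketchInput:
    """Find the first matching attachment and wrap its text."""
    keys = getattr(item, "attachedFiles", None) or []
    key = next((k for k in keys if re.search(pattern, k)), None)
    if key is None:
        raise ValueError(f"No attachment matching pattern: {pattern}")
    # fetch() is an injected loader; errors it raises propagate to the caller.
    metadata = dict(getattr(item, "metadata", None) or {})
    return SketchInput(text=fetch(key), metadata=metadata)


# In-memory dict used in place of real file storage.
fake_store = {"items/7/transcript.txt": "Agent: Hello."}
item = SimpleNamespace(attachedFiles=list(fake_store), metadata={"lang": "en"})
result = extract_text(item, r"transcript\.txt$", fake_store.__getitem__)
```

Injecting the loader keeps the sketch testable without network access; the real class presumably performs the download itself.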