plexus.input_sources.DeepgramInputSource module

class plexus.input_sources.DeepgramInputSource.DeepgramInputSource(pattern: str = None, **options)

Bases: TextFileInputSource

Extracts and formats text from Deepgram JSON transcription files. Supports multiple output formats: paragraphs, utterances, words, raw.

Args:

pattern: Regex pattern to match attachments (used by file-based sources)
**options: Additional source-specific options

extract(item, default_text: str) → str

Parse Deepgram JSON and format transcript.

Options:

format: "paragraphs" (default), "utterances", "words", "raw"
include_timestamps: bool (default False)
speaker_labels: bool (default False)
time_range_start: float (default 0.0) - Start time in seconds
time_range_duration: float or None (default None) - Duration in seconds, None = no end limit
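As an illustration of how time_range_start and time_range_duration could combine into a time window, here is a minimal sketch. The helper name filter_by_time_range, the word dictionaries, and the rule that a word is kept when its start time falls inside the window are all assumptions made for this example; the module's actual filtering logic may differ.

```python
from typing import Optional


def filter_by_time_range(
    words: list,
    time_range_start: float = 0.0,
    time_range_duration: Optional[float] = None,
) -> list:
    """Keep words whose start time falls in [start, start + duration).

    Hypothetical helper: a duration of None means no end limit.
    """
    end = None if time_range_duration is None else time_range_start + time_range_duration
    return [
        w for w in words
        if w["start"] >= time_range_start and (end is None or w["start"] < end)
    ]


# Illustrative word entries with start/end times in seconds.
words = [
    {"word": "hi", "start": 0.0, "end": 0.4},
    {"word": "there", "start": 1.5, "end": 1.9},
    {"word": "friend", "start": 3.2, "end": 3.6},
]

# Window of 2 seconds starting at 1.0 s.
selected = filter_by_time_range(words, time_range_start=1.0, time_range_duration=2.0)
```

With the defaults (start 0.0, duration None) every word is kept, which matches the documented "no end limit" behavior.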

Returns:

Formatted transcript text

Raises:

ValueError: If no matching attachment, invalid format, or invalid time range parameters
KeyError: If Deepgram JSON structure is invalid
Exception: If file download or parsing fails
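The error conditions above can be sketched roughly as follows. This is not the module's implementation: validate_options and raw_transcript are hypothetical helpers, and the specific checks (negative start, non-positive duration) are assumptions about what "invalid time range parameters" might mean. The key path used for the transcript follows the usual layout of a Deepgram pre-recorded response.

```python
import json

VALID_FORMATS = {"paragraphs", "utterances", "words", "raw"}


def validate_options(fmt="paragraphs", time_range_start=0.0, time_range_duration=None):
    """Raise ValueError for an unknown format or an implausible time range (assumed rules)."""
    if fmt not in VALID_FORMATS:
        raise ValueError(f"Invalid format: {fmt!r}")
    if time_range_start < 0:
        raise ValueError("time_range_start must be >= 0")
    if time_range_duration is not None and time_range_duration <= 0:
        raise ValueError("time_range_duration must be > 0 or None")


def raw_transcript(payload: str) -> str:
    """Return the full transcript string; a missing key raises KeyError."""
    data = json.loads(payload)
    return data["results"]["channels"][0]["alternatives"][0]["transcript"]


# Illustrative payload with only the fields this sketch reads.
sample = json.dumps(
    {"results": {"channels": [{"alternatives": [{"transcript": "hello world"}]}]}}
)

validate_options(fmt="raw", time_range_start=0.0, time_range_duration=None)
text = raw_transcript(sample)
```

A caller that wants to fall back to default_text could wrap the call in try/except for ValueError and KeyError.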