Aldea Speech-to-Text Extractor

Extractor ID: stt-aldea

Overview

The Aldea speech-to-text extractor uses the Aldea Speech-to-Text API to transcribe audio files. The API supports pre-recorded audio via REST and returns Deepgram-compatible response shapes (channels, alternatives, transcript). You can use the optional deepgram-transform stage after stt-aldea to render words or utterances when timestamps or diarization are enabled.

Installation

Install the optional Aldea dependency (httpx):

pip install "biblicus[aldea]"

You’ll also need an Aldea API key (tokens start with org_).

Supported Media Types

audio/mpeg - MP3 audio
audio/mp4 - M4A audio
audio/wav - WAV audio
audio/webm - WebM audio
audio/flac - FLAC audio
audio/ogg - OGG audio
audio/* - Any other audio format supported by Aldea (MP3, AAC, FLAC, WAV, OGG, WebM, Opus, M4A). The maximum duration defaults to 10 minutes.

Only audio items are processed. Other media types are automatically skipped.
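
As a rough illustration of the skip rule above, the check below treats any media type whose top-level type is audio as eligible. The function name is_audio_media_type is hypothetical and is not part of the biblicus API; the extractor's actual selection logic may differ.

```python
def is_audio_media_type(media_type: str) -> bool:
    """Return True when the media type's top-level type is 'audio'."""
    # Drop parameters such as "; codecs=opus" and normalize case before matching.
    essence = media_type.split(";", 1)[0].strip().lower()
    return essence.startswith("audio/")


# Hypothetical corpus items: only the audio entries would be transcribed.
media_types = ["audio/mpeg", "text/plain", "audio/ogg; codecs=opus"]
audio_only = [m for m in media_types if is_audio_media_type(m)]
```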

Configuration

Config Schema

class AldeaSpeechToTextExtractorConfig(BaseModel):
    language: Optional[str] = None   # BCP-47 (e.g. en-US, es)
    diarization: bool = False
    timestamps: bool = False

Configuration Options

Option	Type	Default	Description
`language`	str or null	`null`	Language code hint in BCP-47 format (e.g. `en-US`, `es`)
`diarization`	bool	`false`	Enable speaker diarization (requires word timestamps)
`timestamps`	bool	`false`	Include per-word timestamps in the response

Usage

Command Line

Basic Usage

# Configure the API key (Aldea tokens start with org_)
export ALDEA_API_KEY="org_your_api_key_here"

# Extract audio transcripts
biblicus extract my-corpus --extractor stt-aldea

Custom Configuration

# Enable language hint and timestamps
biblicus extract my-corpus --extractor stt-aldea \
  --config language=en-US,timestamps=true

# Enable speaker diarization
biblicus extract my-corpus --extractor stt-aldea \
  --config diarization=true

Configuration File

extractor_id: stt-aldea
config:
  language: null
  diarization: false
  timestamps: false

biblicus extract my-corpus --configuration configuration.yml

Python API

from biblicus import Corpus
from biblicus.extractors import get_extractor

corpus = Corpus.from_directory("my-corpus")
extractor = get_extractor("stt-aldea")
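# An empty dict accepts the defaults: no language hint, diarization and timestamps off
config = extractor.validate_config({})
# Then use the extractor and validated config in your extraction pipeline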

Authentication

Provide your Aldea API key through the ALDEA_API_KEY environment variable or the user configuration file. The environment variable takes precedence when both are set.

Environment Variable

export ALDEA_API_KEY="org_your_api_key_here"

Configuration File

Add to ~/.biblicus/config.yml or ./.biblicus/config.yml:

aldea:
  api_key: org_your_api_key_here

See User configuration for details.
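
The precedence rule can be sketched as a small lookup: check the environment first, then fall back to the aldea.api_key entry of an already-loaded user config. This is a minimal sketch and not the extractor's own code; resolve_aldea_api_key and its parameters are hypothetical names, and the user config is assumed to be a plain dictionary parsed from config.yml.

```python
import os
from typing import Any, Mapping, Optional


def resolve_aldea_api_key(
    user_config: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the API key, preferring ALDEA_API_KEY over the user config."""
    env = os.environ if environ is None else environ
    # The environment variable wins when it is set and non-empty.
    key = env.get("ALDEA_API_KEY")
    if key:
        return key
    # Otherwise fall back to the aldea.api_key entry, if any.
    section = user_config.get("aldea") or {}
    return section.get("api_key")


file_config = {"aldea": {"api_key": "org_from_file"}}
from_env = resolve_aldea_api_key(file_config, {"ALDEA_API_KEY": "org_from_env"})
from_file = resolve_aldea_api_key(file_config, {})
```

Passing the environment as a parameter keeps the lookup easy to test without touching the real process environment.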

Response and Metadata

The extractor stores the full Aldea API response in stage metadata under the key aldea. The structure matches the Pre-Recorded Audio API: a metadata object (request_id, duration, channels) and a results object in which each transcript is found at results.channels[].alternatives[].transcript. When timestamps are enabled, each alternative also carries a words list. To render words or utterances from this metadata, add deepgram-transform to the pipeline after stt-aldea.
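
To make the response shape concrete, the sketch below walks a stored response and collects the first alternative's transcript for each channel. The sample payload is invented for illustration, and the word-level fields (word, start, end) are an assumption based on the Deepgram-style shape, so check them against a real response. The helper name extract_transcripts is hypothetical.

```python
from typing import Any, Dict, List


def extract_transcripts(response: Dict[str, Any]) -> List[str]:
    """Collect the first alternative's transcript from each channel."""
    transcripts = []
    for channel in response.get("results", {}).get("channels", []):
        alternatives = channel.get("alternatives", [])
        if alternatives:
            transcripts.append(alternatives[0].get("transcript", ""))
    return transcripts


# Invented sample payload; real responses come from stage metadata["aldea"].
sample_response = {
    "metadata": {"request_id": "example-request", "duration": 1.2, "channels": 1},
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "hello world",
                        # Present only when timestamps are enabled (assumed field names).
                        "words": [
                            {"word": "hello", "start": 0.0, "end": 0.5},
                            {"word": "world", "start": 0.6, "end": 1.2},
                        ],
                    }
                ]
            }
        ]
    },
}

transcripts = extract_transcripts(sample_response)
```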

Error Handling

Missing optional dependency: Install with pip install "biblicus[aldea]".
Missing API key: Set ALDEA_API_KEY or configure aldea.api_key in user config.
HTTP errors: The extractor calls response.raise_for_status(); non-2xx responses surface as exceptions.
