Aldea Speech-to-Text Extractor

Extractor ID: stt-aldea

Category: Speech-to-Text Extractors

Overview

The Aldea speech-to-text extractor uses the Aldea Speech-to-Text API to transcribe audio files. The API supports pre-recorded audio via REST and returns Deepgram-compatible response shapes (channels, alternatives, transcript). You can use the optional deepgram-transform stage after stt-aldea to render words or utterances when timestamps or diarization are enabled.

Installation

Install the optional Aldea dependency (httpx):

pip install "biblicus[aldea]"

You’ll also need an Aldea API key (tokens start with org_).

Supported Media Types

  • audio/mpeg - MP3 audio

  • audio/mp4 - M4A audio

  • audio/wav - WAV audio

  • audio/webm - WebM audio

  • audio/flac - FLAC audio

  • audio/ogg - OGG audio

  • audio/* - Any other audio format supported by Aldea (MP3, AAC, FLAC, WAV, OGG, WebM, Opus, M4A). The maximum audio duration defaults to 10 minutes.

Only audio items are processed. Other media types are automatically skipped.
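
As an illustration of the rule above, here is a minimal sketch of an audio-only filter. The function name is_audio_media_type is hypothetical and this is not necessarily how Biblicus implements the check; it only assumes that items carry a standard media type string.

```python
def is_audio_media_type(media_type: str) -> bool:
    """Return True when the media type belongs to the audio/* family.

    Hypothetical helper for illustration; Biblicus may decide this differently.
    """
    # Drop any parameters (e.g. "; codecs=opus") and normalize case.
    base_type = media_type.split(";", 1)[0].strip().lower()
    return base_type.startswith("audio/")


# Audio items would be transcribed; everything else would be skipped.
for candidate in ["audio/mpeg", "audio/ogg; codecs=opus", "text/plain"]:
    action = "process" if is_audio_media_type(candidate) else "skip"
    print(f"{candidate}: {action}")
```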

Configuration

Config Schema

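from typing import Optional

from pydantic import BaseModel

class AldeaSpeechToTextExtractorConfig(BaseModel):
    language: Optional[str] = None   # BCP-47 language hint (e.g. en-US, es)
    diarization: bool = False        # speaker diarization
    timestamps: bool = False         # per-word timestamps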

Configuration Options

Option        Type          Default   Description
language      str or null   null      Language code hint in BCP-47 format (e.g. en-US, es)
diarization   bool          false     Enable speaker diarization (requires word timestamps)
timestamps    bool          false     Include per-word timestamps in the response

Usage

Command Line

Basic Usage

# Configure API key (Aldea tokens start with org_)
export ALDEA_API_KEY="org_your_api_key_here"

# Extract audio transcripts
biblicus extract my-corpus --extractor stt-aldea

Custom Configuration

# Enable language hint and timestamps
biblicus extract my-corpus --extractor stt-aldea \
  --config language=en-US,timestamps=true

# Enable speaker diarization (it requires word timestamps, so both are set here)
biblicus extract my-corpus --extractor stt-aldea \
  --config diarization=true,timestamps=true

Configuration File

extractor_id: stt-aldea
config:
  language: null
  diarization: false
  timestamps: false

Save this as configuration.yml, then run:

biblicus extract my-corpus --configuration configuration.yml

Python API

from biblicus import Corpus
from biblicus.extractors import get_extractor

corpus = Corpus.from_directory("my-corpus")
extractor = get_extractor("stt-aldea")
config = extractor.validate_config({})
# Then use in your extraction pipeline

Authentication

Set your Aldea API key through the ALDEA_API_KEY environment variable or the user configuration file. If both are set, the environment variable takes precedence.

Environment Variable

export ALDEA_API_KEY="org_your_api_key_here"

Configuration File

Add to ~/.biblicus/config.yml or ./.biblicus/config.yml:

aldea:
  api_key: org_your_api_key_here

See User configuration for details.
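
The precedence rule in this section can be sketched as follows. This is a minimal illustration, not the Biblicus implementation: resolve_aldea_api_key is a hypothetical name, the user configuration is assumed to be already loaded into a mapping, and treating an empty environment variable as unset is also an assumption.

```python
import os
from typing import Any, Mapping, Optional


def resolve_aldea_api_key(
    user_config: Mapping[str, Any],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the API key, preferring ALDEA_API_KEY over aldea.api_key.

    Hypothetical helper; `user_config` stands in for the parsed config.yml.
    """
    env = os.environ if environ is None else environ
    env_key = env.get("ALDEA_API_KEY")
    if env_key:  # assumption: an empty value counts as "not set"
        return env_key
    aldea_section = user_config.get("aldea") or {}
    return aldea_section.get("api_key")


config = {"aldea": {"api_key": "org_from_config_file"}}
# The environment variable wins when both sources are present.
print(resolve_aldea_api_key(config, {"ALDEA_API_KEY": "org_from_environment"}))
# With no environment variable, the config file value is used.
print(resolve_aldea_api_key(config, {}))
```

Passing the environment as an argument keeps the sketch easy to test without touching the real process environment.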

Response and Metadata

The extractor stores the full Aldea API response in stage metadata under the key aldea. The structure matches the Pre-Recorded Audio API: metadata (request_id, duration, channels) and results.channels[].alternatives[].transcript (and optionally words when timestamps are enabled). You can use deepgram-transform in a pipeline after stt-aldea to render words or utterances from this metadata.
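
To make the response shape concrete, here is a minimal sketch that reads the transcript and optional words from a payload shaped as described above. The sample payload is invented for illustration, and the per-word field names (word, start, end) follow the Deepgram convention; treat them as assumptions rather than a confirmed Aldea schema. How you obtain the stored metadata from Biblicus is not shown.

```python
from typing import Any, Dict, List


def first_alternative(response: Dict[str, Any]) -> Dict[str, Any]:
    """Return the first alternative of the first channel, or {} if absent."""
    channels = response.get("results", {}).get("channels", [])
    if not channels:
        return {}
    alternatives = channels[0].get("alternatives", [])
    return alternatives[0] if alternatives else {}


def first_transcript(response: Dict[str, Any]) -> str:
    """Return the transcript text, or an empty string when none is present."""
    return first_alternative(response).get("transcript", "")


def first_words(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the word entries (present only when timestamps are enabled)."""
    return first_alternative(response).get("words", [])


# Hypothetical payload; all values are made up.
sample = {
    "metadata": {"request_id": "example", "duration": 1.2, "channels": 1},
    "results": {
        "channels": [
            {
                "alternatives": [
                    {
                        "transcript": "hello world",
                        "words": [
                            {"word": "hello", "start": 0.0, "end": 0.5},
                            {"word": "world", "start": 0.6, "end": 1.2},
                        ],
                    }
                ]
            }
        ]
    },
}

print(first_transcript(sample))
for entry in first_words(sample):
    print(entry["word"], entry["start"], entry["end"])
```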

Error Handling

  • Missing optional dependency: Install with pip install "biblicus[aldea]".

  • Missing API key: Set ALDEA_API_KEY or configure aldea.api_key in user config.

  • HTTP errors: The extractor calls response.raise_for_status(); non-2xx responses surface as exceptions.
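
The first two conditions in the list above can be checked before starting a long extraction. The sketch below is one possible pre-flight check, not part of Biblicus: preflight_problems is a hypothetical name, and it only assumes that the optional dependency is importable as httpx and that you pass in whatever API key you have resolved.

```python
import importlib.util
from typing import List, Optional


def preflight_problems(api_key: Optional[str], module_name: str = "httpx") -> List[str]:
    """Return human-readable problems found before running stt-aldea.

    Hypothetical helper; it does not contact the Aldea API.
    """
    problems: List[str] = []
    # find_spec returns None when a top-level module is not installed.
    if importlib.util.find_spec(module_name) is None:
        problems.append('Missing optional dependency: pip install "biblicus[aldea]"')
    if not api_key:
        problems.append(
            "Missing API key: set ALDEA_API_KEY or aldea.api_key in user config"
        )
    return problems


for problem in preflight_problems(api_key=None):
    print(problem)
```

HTTP errors cannot be detected this way; they only surface once the request is made.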

See Also