# Aldea Speech-to-Text Extractor

**Extractor ID:** `stt-aldea`

**Category:** [Speech-to-Text Extractors](index.md)

## Overview

The Aldea speech-to-text extractor uses the [Aldea Speech-to-Text API](https://platform.aldea.ai/docs) to transcribe audio files. The API supports pre-recorded audio via REST and returns Deepgram-compatible response shapes (channels, alternatives, transcript). You can use the optional [deepgram-transform](deepgram-transform.md) stage after `stt-aldea` to render words or utterances when timestamps or diarization are enabled.

## Installation

Install the optional Aldea dependency (httpx):

```bash
pip install "biblicus[aldea]"
```

You'll also need an Aldea API key (tokens start with `org_`).

## Supported Media Types

- `audio/mpeg` - MP3 audio
- `audio/mp4` - M4A audio
- `audio/wav` - WAV audio
- `audio/webm` - WebM audio
- `audio/flac` - FLAC audio
- `audio/ogg` - OGG audio
- `audio/*` - Any audio format supported by Aldea (MP3, AAC, FLAC, WAV, OGG, WebM, Opus, M4A; max duration defaults to 10 minutes)

Only audio items are processed. Other media types are automatically skipped.

## Configuration

### Config Schema

```python
class AldeaSpeechToTextExtractorConfig(BaseModel):
    language: Optional[str] = None   # BCP-47 (e.g. en-US, es)
    diarization: bool = False
    timestamps: bool = False
```

### Configuration Options

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `language` | str or null | `null` | Language code hint in BCP-47 format (e.g. `en-US`, `es`) |
| `diarization` | bool | `false` | Enable speaker diarization (requires word timestamps) |
| `timestamps` | bool | `false` | Include per-word timestamps in the response |

## Usage

### Command Line

#### Basic Usage

```bash
# Configure API key
export ALDEA_API_KEY="your-key-here"

# Extract audio transcripts
biblicus extract my-corpus --extractor stt-aldea
```

#### Custom Configuration

```bash
# Enable language hint and timestamps
biblicus extract my-corpus --extractor stt-aldea \
  --config language=en-US,timestamps=true

# Enable speaker diarization
biblicus extract my-corpus --extractor stt-aldea \
  --config diarization=true
```

#### Configuration File

```yaml
extractor_id: stt-aldea
config:
  language: null
  diarization: false
  timestamps: false
```

```bash
biblicus extract my-corpus --configuration configuration.yml
```

### Python API

```python
from biblicus import Corpus
from biblicus.extractors import get_extractor

corpus = Corpus.from_directory("my-corpus")
extractor = get_extractor("stt-aldea")
config = extractor.validate_config({})
# Then use in your extraction pipeline
```

## Authentication

Set your Aldea API key via environment or user config. Environment takes precedence.

### Environment Variable

```bash
export ALDEA_API_KEY="org_your_api_key_here"
```

### Configuration File

Add to `~/.biblicus/config.yml` or `./.biblicus/config.yml`:

```yaml
aldea:
  api_key: org_your_api_key_here
```

See [User configuration](../../user-configuration.md) for details.

## Response and Metadata

The extractor stores the full Aldea API response in stage metadata under the key `aldea`. The structure matches the [Pre-Recorded Audio API](https://platform.aldea.ai/docs/stt-api-reference/pre-recorded-audio): `metadata` (request_id, duration, channels) and `results.channels[].alternatives[].transcript` (and optionally `words` when timestamps are enabled). You can use [deepgram-transform](deepgram-transform.md) in a pipeline after `stt-aldea` to render words or utterances from this metadata.

## Error Handling

- **Missing optional dependency**: Install with `pip install "biblicus[aldea]"`.
- **Missing API key**: Set `ALDEA_API_KEY` or configure `aldea.api_key` in user config.
- **HTTP errors**: The extractor calls `response.raise_for_status()`; non-2xx responses surface as exceptions.

## See Also

- [Aldea STT API Documentation](https://platform.aldea.ai/docs)
- [Pre-Recorded Audio API Reference](https://platform.aldea.ai/docs/stt-api-reference/pre-recorded-audio)
- [Authentication](https://platform.aldea.ai/docs/authentication)
- [stt-deepgram](deepgram.md) - Deepgram Nova-3 extractor
- [stt-openai](openai.md) - OpenAI Whisper extractor
- [deepgram-transform](deepgram-transform.md) - Render Deepgram-shaped metadata (e.g. from Aldea) into text