Retrieval
Biblicus treats retrieval as a reproducible, explicit pipeline stage that transforms a corpus into structured evidence. Retrieval is separated from extraction and context shaping so each can be evaluated independently and swapped without rewriting ingestion.
Retrieval concepts
Backend: a pluggable retrieval implementation that can build and query runs.
Run: a recorded retrieval build for a corpus and extraction snapshot.
Evidence: structured output containing identifiers, provenance, and scores.
Stage: explicit stages such as retrieve, rerank, and filter.
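The concepts above can be sketched as plain data structures. This is a conceptual illustration only, not Biblicus's actual classes; the field names are assumptions chosen to match the terms on this page:

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    """One structured result: identifier, provenance, and score."""
    item_id: str
    source: str   # provenance, e.g. the ingested file
    score: float

@dataclass
class Run:
    """A recorded retrieval build for a corpus and extraction snapshot."""
    backend_id: str
    snapshot_id: str
    extraction_snapshot_id: str

# A query against a run yields a ranked list of Evidence items.
results = [
    Evidence(item_id="item-1", source="/tmp/retrieval-alpha.txt", score=0.9),
    Evidence(item_id="item-2", source="/tmp/retrieval-beta.txt", score=0.4),
]
print([e.item_id for e in results])
```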
How retrieval snapshots work
Ingest raw items into a corpus.
Build an extraction snapshot to produce text artifacts.
Build a retrieval snapshot with a backend, referencing the extraction snapshot.
Query the run to return evidence.
Retrieval runs are stored under:
retrieval/<backend_id>/<snapshot_id>/
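For example, composing that layout in Python (the backend and snapshot identifiers below are placeholders):

```python
from pathlib import Path

backend_id = "sqlite-full-text-search"   # one of the built-in backends
snapshot_id = "snap-0001"                # hypothetical snapshot identifier
run_dir = Path("retrieval") / backend_id / snapshot_id
print(run_dir.as_posix())  # retrieval/sqlite-full-text-search/snap-0001
```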
A minimal run you can execute
This walkthrough uses the full text search backend and produces evidence you can inspect immediately.
rm -rf corpora/retrieval_demo
python -m biblicus init corpora/retrieval_demo
printf "alpha beta\n" > /tmp/retrieval-alpha.txt
printf "beta gamma\n" > /tmp/retrieval-beta.txt
python -m biblicus ingest --corpus corpora/retrieval_demo /tmp/retrieval-alpha.txt
python -m biblicus ingest --corpus corpora/retrieval_demo /tmp/retrieval-beta.txt
python -m biblicus extract build --corpus corpora/retrieval_demo --stage pass-through-text
python -m biblicus build --corpus corpora/retrieval_demo --backend sqlite-full-text-search
python -m biblicus query --corpus corpora/retrieval_demo --query "beta"
The query output is structured evidence with identifiers and scores. That evidence is the primary output for evaluation and downstream context packing.
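A minimal sketch of inspecting that evidence in Python. The field names here (`evidence`, `item_id`, `score`) follow the terms used on this page, but verify them against your actual query output:

```python
import json

# Assumed evidence shape, matching the fields discussed on this page.
output = json.loads("""
{
  "evidence": [
    {"item_id": "item-2", "score": 0.41},
    {"item_id": "item-1", "score": 0.87}
  ]
}
""")

# Rank evidence by score, highest first, for downstream context packing.
ranked = sorted(output["evidence"], key=lambda e: e["score"], reverse=True)
for e in ranked:
    print(e["item_id"], e["score"])
```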
Backends
See docs/backends/index.md for backend selection and configuration.
Choosing a backend
Start with the simplest backend that answers your question:
scan for tiny corpora or sanity checks.
sqlite-full-text-search for a practical lexical baseline.
tf-vector when you want deterministic term-frequency similarity without external dependencies.
embedding-index-file when you want embedding retrieval with a local, file-backed index.
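As a rough illustration of what a term-frequency backend like tf-vector computes, here is cosine similarity over raw term counts. This is a conceptual sketch only, not Biblicus's implementation:

```python
import math
from collections import Counter

def tf_cosine(a: str, b: str) -> float:
    """Cosine similarity between raw term-frequency vectors of two texts."""
    ta, tb = Counter(a.split()), Counter(b.split())
    dot = sum(ta[t] * tb[t] for t in ta)
    norm = (math.sqrt(sum(v * v for v in ta.values()))
            * math.sqrt(sum(v * v for v in tb.values())))
    return dot / norm if norm else 0.0

# The two demo documents from the walkthrough above share one term ("beta")
# out of two terms each, so their similarity is roughly 0.5.
print(tf_cosine("alpha beta", "beta gamma"))
```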
You can compare them with the same dataset and budget using the retrieval evaluation workflow.
Evaluation
Retrieval runs are evaluated against datasets with explicit budgets. See docs/retrieval-evaluation.md for the
dataset format and workflow, docs/feature-index.md for the behavior specifications, and docs/context-pack.md for
how evidence feeds into context packs.
Evidence inspection workflow
When you want to understand a result end to end:
Query the backend and save the output.
Inspect the top evidence items and their scores.
Trace each evidence item_id back to the corpus for context.
Example:
python -m biblicus query --corpus corpora/demo --query "beta" > /tmp/retrieval_output.json
python -c "import json; data=json.load(open('/tmp/retrieval_output.json')); print(data['evidence'][:2])"
python -m biblicus show --corpus corpora/demo ITEM_ID
Saving evidence for later analysis
Evidence output is stable JSON. Save it alongside your experiments so you can compare runs later:
python -m biblicus query --corpus corpora/demo --query "beta" > artifacts/retrieval/beta.json
Record the snapshot identifier and budget values in the same folder so you can reproduce the query.
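One way to keep that record machine-readable is a metadata sidecar next to the saved evidence. The field names below are illustrative assumptions; record whatever identifies your run exactly:

```python
import json
from pathlib import Path

artifacts = Path("artifacts/retrieval")
artifacts.mkdir(parents=True, exist_ok=True)

# Illustrative metadata fields; adjust to your own experiment layout.
metadata = {
    "query": "beta",
    "snapshot_id": "snap-0001",      # hypothetical snapshot identifier
    "budget": {"max_evidence": 10},  # hypothetical budget values
}
path = artifacts / "beta.meta.json"
path.write_text(json.dumps(metadata, indent=2))
print(path.read_text())
```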
Labs and demos
When you want a repeatable example with bundled data, use the retrieval evaluation lab:
python scripts/retrieval_evaluation_lab.py --corpus corpora/retrieval_eval_lab --force
The lab builds a tiny corpus, runs extraction, builds a retrieval snapshot, and evaluates it. It prints the dataset path and evaluation output so you can open the JSON directly.
Reproducibility checklist
Use these habits when you want repeatable retrieval experiments:
Record the extraction snapshot identifier and pass it explicitly when you build a retrieval snapshot.
Keep evaluation datasets in source control and treat them as immutable inputs.
Capture the full retrieval snapshot identifier when you compare outputs across backends.
Why the separation matters
Keeping extraction and retrieval distinct makes it possible to:
Reuse the same extracted artifacts across many retrieval backends.
Compare backends against the same corpus and dataset inputs.
Record and audit retrieval decisions without mixing in prompting or context formatting.
Retrieval quality
For retrieval quality upgrades, see docs/retrieval-quality.md.