User configuration
Biblicus supports a small user configuration file for optional integrations.
This is separate from corpus configuration. A corpus is a folder you can copy and share. User configuration usually contains machine-specific settings such as credentials.
Where it looks
Biblicus looks for user configuration in two places, in this order.
~/.biblicus/config.yml
./.biblicus/config.yml
If both files exist, the local configuration overrides the home configuration.
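The override rule above can be sketched as a merge of two already-parsed configuration mappings. This is a minimal illustration, not Biblicus's actual loader: the function name `merge_config` is hypothetical, and the sketch assumes a recursive, key-by-key merge (the text does not say whether overriding happens per key or per whole section).

```python
from copy import deepcopy


def merge_config(home: dict, local: dict) -> dict:
    """Return a new mapping in which values from `local` override `home`.

    Assumption: nested sections are merged key by key, so a local file
    only needs to list the settings it wants to change.
    """
    merged = deepcopy(home)
    for key, value in local.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides have a section with this name: merge recursively.
            merged[key] = merge_config(merged[key], value)
        else:
            # Scalars, lists, and new sections replace the home value.
            merged[key] = deepcopy(value)
    return merged


# Parsed contents of ~/.biblicus/config.yml (illustrative values).
home = {"openai": {"api_key": "HOME_KEY"}, "deepgram": {"api_key": "DG_KEY"}}
# Parsed contents of ./.biblicus/config.yml (illustrative values).
local = {"openai": {"api_key": "LOCAL_KEY"}}

effective = merge_config(home, local)
```

Under these assumptions the local OpenAI key wins, while the Deepgram key from the home file is still available.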
File format
The configuration file is YAML and is parsed using the dotyaml approach (YAML with optional environment variable interpolation).
Reproducibility notes
User configuration is machine-specific. Do not bake secrets into corpora or datasets, and avoid storing user config files in source control.
Example: OpenAI speech to text
Create a config file with an OpenAI API key.
You can start from the included example configuration file:
Copy .biblicus/config.example.yml to ~/.biblicus/config.yml, or
Copy .biblicus/config.example.yml to ./.biblicus/config.yml
~/.biblicus/config.yml:
openai:
  api_key: YOUR_KEY_HERE
The OpenAI speech to text extractor also supports the OPENAI_API_KEY environment variable. If both are set, the environment variable takes precedence over the configuration file.
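The precedence between the environment and the configuration file can be sketched as follows. The helper `resolve_api_key` is hypothetical and only illustrates the stated rule; it also assumes that an empty environment variable counts as unset, which the text does not specify.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    config: Mapping,
    section: str,
    env_var: str,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the API key for `section`, preferring the environment."""
    env = os.environ if env is None else env
    value = env.get(env_var)
    if value:
        # Assumption: an empty string is treated the same as "not set".
        return value
    # Fall back to the api_key entry in the parsed configuration file.
    return (config.get(section) or {}).get("api_key")


config = {"openai": {"api_key": "FROM_FILE"}}

from_env = resolve_api_key(
    config, "openai", "OPENAI_API_KEY", env={"OPENAI_API_KEY": "FROM_ENV"}
)
from_file = resolve_api_key(config, "openai", "OPENAI_API_KEY", env={})
```

Passing `env` explicitly keeps the sketch easy to test; omitting it reads from the real process environment.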
Example: Deepgram speech to text
Create a config file with a Deepgram API key.
~/.biblicus/config.yml:
deepgram:
  api_key: YOUR_KEY_HERE
The Deepgram speech to text extractor also supports the DEEPGRAM_API_KEY environment variable. If both are set, the environment variable takes precedence over the configuration file.
Example: Aldea speech to text
Create a config file with an Aldea API key.
~/.biblicus/config.yml:
aldea:
  api_key: YOUR_KEY_HERE
The Aldea speech to text extractor also supports the ALDEA_API_KEY environment variable. If both are set, the environment variable takes precedence over the configuration file.
Source profiles (remote collections and corpora)
Biblicus supports multiple source profiles so you can connect to many accounts and providers at once.
Profiles live in ~/.biblicus/config.yml or ./.biblicus/config.yml (repo root).
Example profiles:
sources:
  - name: azure-prod
    kind: azure-blob
    connection_string: YOUR_CONNECTION_STRING
    account_name: YOUR_ACCOUNT_NAME
  - name: s3-archive
    kind: s3
    access_key_id: YOUR_KEY_ID
    secret_access_key: YOUR_SECRET
    session_token: YOUR_SESSION_TOKEN
    region: us-east-1
    endpoint_url: https://s3.us-east-1.amazonaws.com
Environment variables override configuration when present:
AZURE_STORAGE_CONNECTION_STRING
AZURE_STORAGE_ACCOUNT
AZURE_STORAGE_KEY
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN
AWS_REGION
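As a rough sketch of how a named profile might be selected and then overridden from the environment: the helpers `find_profile` and `apply_env_overrides` are hypothetical, and the mapping from environment variables to profile fields is inferred from the names in the example above rather than taken from Biblicus. AZURE_STORAGE_KEY is left out because the example profile shows no matching field.

```python
from typing import Mapping, Optional

# Assumed mapping from environment variable to profile field, per kind.
ENV_FIELDS = {
    "azure-blob": {
        "AZURE_STORAGE_CONNECTION_STRING": "connection_string",
        "AZURE_STORAGE_ACCOUNT": "account_name",
    },
    "s3": {
        "AWS_ACCESS_KEY_ID": "access_key_id",
        "AWS_SECRET_ACCESS_KEY": "secret_access_key",
        "AWS_SESSION_TOKEN": "session_token",
        "AWS_REGION": "region",
    },
}


def find_profile(config: Mapping, name: str) -> Optional[dict]:
    """Return a copy of the source profile called `name`, or None."""
    for profile in config.get("sources", []):
        if profile.get("name") == name:
            return dict(profile)
    return None


def apply_env_overrides(profile: Mapping, env: Mapping[str, str]) -> dict:
    """Return the profile with any set environment variables applied."""
    resolved = dict(profile)
    for env_var, field in ENV_FIELDS.get(profile.get("kind"), {}).items():
        if env.get(env_var):
            resolved[field] = env[env_var]
    return resolved


config = {
    "sources": [
        {"name": "azure-prod", "kind": "azure-blob",
         "connection_string": "YOUR_CONNECTION_STRING"},
        {"name": "s3-archive", "kind": "s3",
         "access_key_id": "YOUR_KEY_ID", "region": "us-east-1"},
    ]
}

profile = find_profile(config, "s3-archive")
resolved = apply_env_overrides(profile, {"AWS_REGION": "eu-west-1"})
```

Here only the region changes; fields without a matching environment variable keep their configured values.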
Example: Neo4j graph extraction
Graph extraction uses a Neo4j backend. Biblicus can auto-start a local Neo4j Docker container if it is not already running.
Install the Neo4j Python driver before running graph extraction:
python -m pip install neo4j
If you use NLP-based graph extractors (for example ner-entities or dependency-relations), install the NLP model
package and the model data your configuration references.
~/.biblicus/config.yml:
neo4j:
  uri: bolt://localhost:7687
  username: neo4j
  password: testpassword
  auto_start: true
  container_name: biblicus-neo4j
  docker_image: neo4j:5
  http_port: 7474
  bolt_port: 7687
Environment variables override configuration when present:
NEO4J_URI
NEO4J_USERNAME
NEO4J_PASSWORD
NEO4J_DATABASE
BIBLICUS_NEO4J_AUTO_START
BIBLICUS_NEO4J_CONTAINER_NAME
BIBLICUS_NEO4J_IMAGE
BIBLICUS_NEO4J_HTTP_PORT
BIBLICUS_NEO4J_BOLT_PORT
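To see how the container settings relate to each other, the sketch below builds a `docker run` command from the neo4j section. It is an approximation of a manual equivalent, not the command Biblicus itself runs when auto_start is enabled. The function name `docker_run_args` is hypothetical; the sketch assumes the official Neo4j image, which listens on ports 7474 (HTTP) and 7687 (Bolt) inside the container and reads initial credentials from NEO4J_AUTH.

```python
import shlex


def docker_run_args(neo4j: dict) -> list:
    """Build an argument list that starts a local Neo4j container."""
    return [
        "docker", "run", "--detach",
        "--name", neo4j["container_name"],
        # Map the configured host ports to the image's default ports.
        "--publish", f"{neo4j['http_port']}:7474",
        "--publish", f"{neo4j['bolt_port']}:7687",
        # The official image reads initial credentials as user/password.
        "--env", f"NEO4J_AUTH={neo4j['username']}/{neo4j['password']}",
        neo4j["docker_image"],
    ]


neo4j_config = {
    "username": "neo4j",
    "password": "testpassword",
    "container_name": "biblicus-neo4j",
    "docker_image": "neo4j:5",
    "http_port": 7474,
    "bolt_port": 7687,
}

args = docker_run_args(neo4j_config)
command = shlex.join(args)  # Shell-safe string for display or logging.
```

The argument list can be passed to `subprocess.run(args, check=True)` on a machine with Docker installed; the sketch stops short of that so it has no side effects.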
Common pitfalls
Saving secrets in the corpus directory instead of a user config file.
Forgetting that local configuration overrides home configuration.
Expecting user configuration to be copied with a corpus.