# Context packs

A context pack is the text that your application sends to a large language model.

Biblicus keeps two things separate:

- Retrieval returns **evidence** as structured objects with provenance.
- Context pack building turns evidence into **context pack text** using an explicit policy.

This separation makes retrieval repeatable and testable, while keeping context formatting as an explicit surface you can change, compare, and evaluate.

## Minimal policy

The minimal policy is: join evidence text blocks with a separator.

In Python:

```python
from biblicus.context import ContextPackPolicy, build_context_pack

policy = ContextPackPolicy(join_with="\n\n")
context_pack = build_context_pack(result, policy=policy)
print(context_pack.text)
```

### Output structure

Context pack building returns a structured result you can inspect:

```json
{
  "text": "item_id: ...",
  "evidence_count": 2,
  "blocks": [
    {
      "evidence_item_id": "ITEM_ID",
      "text": "item_id: ITEM_ID\nsource_uri: ...",
      "metadata": {
        "item_id": "ITEM_ID",
        "source_uri": "file:///...",
        "score": 0.42,
        "stage": "retrieve"
      }
    }
  ]
}
```

`blocks` keeps a per-evidence record so you can debug how the final text was assembled.

### Before and after example

Given two evidence blocks, compare how different policies change the output:

```python
policy = ContextPackPolicy(join_with="\n\n", ordering="rank", include_metadata=False)
context_pack = build_context_pack(result, policy=policy)
print(context_pack.text)
```

With metadata enabled and score ordering:

```python
policy = ContextPackPolicy(join_with="\n\n", ordering="score", include_metadata=True)
context_pack = build_context_pack(result, policy=policy)
print(context_pack.text)
```

The first output keeps the original ranking and clean text blocks. The second output reorders by score and adds explicit metadata lines for inspection.

## Policy surfaces

Context pack policies make ordering and formatting explicit.

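As a rough illustration of what making ordering and formatting explicit can look like, the sketch below reimplements the idea without Biblicus. `Evidence`, `SketchPolicy`, `render_block`, and `build_text` are hypothetical names invented for this example, and the `key: value` metadata layout is an assumption based on the sample output above. The real `build_context_pack` may behave differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Evidence:
    """Hypothetical stand-in for one retrieved evidence item."""
    item_id: str
    source_uri: str
    score: float
    rank: int
    text: str
    stage: str = "retrieve"


@dataclass(frozen=True)
class SketchPolicy:
    """Hypothetical policy; field names mirror the ones used on this page."""
    join_with: str = "\n\n"
    ordering: str = "rank"
    include_metadata: bool = False


def render_block(evidence: Evidence, policy: SketchPolicy) -> str:
    """Render one block, optionally prefixed with metadata lines."""
    if not policy.include_metadata:
        return evidence.text
    header = [
        f"item_id: {evidence.item_id}",
        f"source_uri: {evidence.source_uri}",
        f"score: {evidence.score}",
        f"stage: {evidence.stage}",
    ]
    return "\n".join(header + [evidence.text])


def build_text(evidence: List[Evidence], policy: SketchPolicy) -> str:
    """Order the evidence as the policy asks, then join the rendered blocks."""
    if policy.ordering == "rank":
        ordered = sorted(evidence, key=lambda e: e.rank)
    elif policy.ordering == "score":
        # Highest score first; item_id breaks ties so the order is stable.
        ordered = sorted(evidence, key=lambda e: (-e.score, e.item_id))
    else:
        raise ValueError(f"unsupported ordering: {policy.ordering}")
    return policy.join_with.join(render_block(e, policy) for e in ordered)


# Made-up evidence whose rank order and score order deliberately differ.
evidence = [
    Evidence(item_id="a", source_uri="file:///a.txt", score=0.2, rank=1, text="First block."),
    Evidence(item_id="b", source_uri="file:///b.txt", score=0.9, rank=2, text="Second block."),
]

plain = build_text(evidence, SketchPolicy())
annotated = build_text(evidence, SketchPolicy(ordering="score", include_metadata=True))
print(plain)
print("---")
print(annotated)
```

Because the policy is a plain value, two runs over the same evidence can be compared by changing one field at a time.
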
### Ordering

Use `ordering` to control how evidence blocks are arranged before joining:

- `rank`: use the evidence rank as provided by retrieval.
- `score`: sort by score (descending) and then item identifier.
- `source`: group by source uniform resource identifier, then sort by score.

### Metadata inclusion

Set `include_metadata=True` to prepend metadata to each block. Metadata includes:

- `item_id`
- `source_uri`
- `score`
- `stage`

### Character budgets

Character budgets drop trailing blocks until the context pack fits the specified limit. This keeps context shaping deterministic without relying on a tokenizer.

In Python:

```python
from biblicus.context import (
    CharacterBudget,
    ContextPackPolicy,
    fit_context_pack_to_character_budget,
)

policy = ContextPackPolicy(join_with="\n\n", ordering="score", include_metadata=True)
fitted = fit_context_pack_to_character_budget(
    context_pack,
    policy=policy,
    character_budget=CharacterBudget(max_characters=500),
)
print(fitted.text)
```

## Command-line interface

The command-line interface can build a context pack from a retrieval result by reading JavaScript Object Notation from standard input.

```bash
biblicus query --corpus corpora/example --query "primary button style preference" \
  | biblicus context-pack build --ordering score --include-metadata --max-characters 500
```

## Reproducibility checklist

- Keep the retrieval result JSON alongside the context pack output.
- Record the policy values (`join_with`, `ordering`, `include_metadata`).
- Record any budget inputs that trimmed the context pack.

## What context pack building does

- Includes only usable text evidence.
- Excludes evidence with no text payload or whitespace-only text.

## Common pitfalls

- Building context packs from different retrieval snapshots while comparing the results.
- Comparing outputs with different `ordering` or `include_metadata` values.
- Relying on token counts without recording the tokenizer identifier.

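One way to guard against these pitfalls is to store a small record next to each context pack. The following is a minimal sketch using only the Python standard library. `make_run_record` and the record layout are hypothetical and not part of Biblicus, and the retrieval result string is a placeholder.

```python
import hashlib
import json
from typing import Any, Dict, Optional


def make_run_record(
    retrieval_result_json: str,
    policy: Dict[str, Any],
    budget: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Bundle what is needed to reproduce one context pack (hypothetical layout)."""
    # A digest of the exact retrieval result shows whether two runs
    # were built from the same retrieval snapshot.
    digest = hashlib.sha256(retrieval_result_json.encode("utf-8")).hexdigest()
    return {
        "retrieval_result_sha256": digest,
        "policy": policy,
        "budget": budget,
    }


record = make_run_record(
    retrieval_result_json='{"evidence": []}',  # placeholder, not real Biblicus output
    policy={"join_with": "\n\n", "ordering": "score", "include_metadata": True},
    budget={"max_characters": 500},
)
print(json.dumps(record, indent=2, sort_keys=True))
```

Two context packs are only comparable when their records share the same digest and differ in the one policy value under test. When a token budget is used, the tokenizer identifier would belong in `budget` as well.
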
## Token budgets

Fitting context to a token budget is a separate concern. Token counting depends on a specific tokenizer and may vary by model.

Biblicus treats token budgeting as a separate stage so it can be configured, tested, and evaluated independently from retrieval and text formatting.

In Python:

```python
from biblicus.context import (
    ContextPackPolicy,
    TokenBudget,
    fit_context_pack_to_token_budget,
)

# `context_pack` is the result of build_context_pack(...) from the earlier examples.
policy = ContextPackPolicy(join_with="\n\n")
fitted_context_pack = fit_context_pack_to_token_budget(
    context_pack,
    policy=policy,
    token_budget=TokenBudget(max_tokens=500),
)
print(fitted_context_pack.text)
```
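
To show why the tokenizer matters, here is a self-contained sketch of token budget fitting. It assumes the same drop-trailing-blocks strategy described for character budgets, which this page does not confirm for token budgets, and it uses a whitespace word count as a stand-in for a real tokenizer. `whitespace_token_count` and `fit_blocks_to_token_budget` are hypothetical names; the real `fit_context_pack_to_token_budget` may work differently.

```python
from typing import Callable, List


def whitespace_token_count(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated words, not model tokens."""
    return len(text.split())


def fit_blocks_to_token_budget(
    blocks: List[str],
    join_with: str,
    max_tokens: int,
    count_tokens: Callable[[str], int] = whitespace_token_count,
) -> List[str]:
    """Drop trailing blocks until the joined text fits within max_tokens."""
    kept = list(blocks)
    # Re-count the joined text each time so separators are included in the count.
    while kept and count_tokens(join_with.join(kept)) > max_tokens:
        kept.pop()
    return kept


blocks = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
kept = fit_blocks_to_token_budget(blocks, join_with="\n\n", max_tokens=5)
print(kept)
```

Passing the counting function in as a parameter keeps the trimming logic independent of any one tokenizer, so the same budget can be evaluated against different counters.
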