Context Engine
The Context Engine is the Biblicus SDK for assembling elastic, budget-aware prompt contexts. It lets AI engineers describe what should be in an LLM request while Biblicus handles how to fit it into a budgeted context window.
It turns a high-level plan into:
a system prompt
a history message list
a user message
The Context Engine can compact content when it is too large and expand retriever packs by paginating with offset and limit.
“Context assembly is the most failure-prone part of agent engineering. Engineers need a reliable way to fit knowledge into limited context windows without hand-writing brittle logic.”
Why Context Engine?
Context Engine provides a first-class, testable, and reusable context assembly surface. It is designed to be the shared foundation for both Python applications and Tactus procedures.
Composable Context plans: Mix system/user messages, nested contexts, and retriever packs.
Budget-aware compaction: Use pluggable compactor strategies.
Budget-aware expansion: Use retriever pagination (
offset+limit) to fill remaining budget.Deterministic assembly: Produce a predictable message history for model calls.
Core Concepts
ContextDeclaration: The plan describing which messages and packs to assemble.
ContextPolicySpec: Budgeting, compaction, and expansion policy.
ContextAssembler: The assembly engine that renders a ContextDeclaration.
ContextRetrieverRequest: The request payload passed into retriever callables.
Basic Usage
from biblicus.context_engine import ContextAssembler, ContextDeclaration
context_spec = ContextDeclaration(
name="support",
messages=[
{"type": "system", "content": "You are a support agent."},
{"type": "user", "content": "Question: {input.question}"},
],
)
assembler = ContextAssembler({"support": context_spec})
result = assembler.assemble(
context_name="support",
base_system_prompt="",
history_messages=[],
user_message="",
template_context={"input": {"question": "Where is my order?"}, "context": {}},
)
print(result.system_prompt)
print(result.user_message)
Retriever Packs
Retriever packs are inserted with a context directive. You provide a retriever callable that
accepts a ContextRetrieverRequest and returns a ContextPack.
from biblicus.context import ContextPack, ContextPackBlock
from biblicus.context_engine import ContextAssembler, ContextDeclaration, ContextRetrieverRequest
def fake_retrieve(request: ContextRetrieverRequest) -> ContextPack:
text = "Evidence chunk"
return ContextPack(
text=text,
evidence_count=1,
blocks=[ContextPackBlock(evidence_item_id="demo-1", text=text, metadata=None)],
)
context_spec = ContextDeclaration(
name="support",
policy={
"input_budget": {"max_tokens": 10},
"pack_budget": {"default_ratio": 1.0},
},
messages=[
{"type": "system", "content": "Use this context."},
{"type": "context", "name": "kb_search"},
{"type": "user", "content": "Question"},
],
)
retriever_spec = type("RetrieverSpec", (), {"config": {"query": "demo", "limit": 3}})()
assembler = ContextAssembler(
{"support": context_spec},
retriever_registry={"kb_search": retriever_spec},
)
result = assembler.assemble(
context_name="support",
base_system_prompt="",
history_messages=[],
user_message="",
template_context={"input": {}, "context": {}},
retriever_override=fake_retrieve,
)
Expansion and Pagination
Enable expansion in the policy to paginate retrievers when packs are under budget.
policy = {
"input_budget": {"max_tokens": 40},
"pack_budget": {"default_ratio": 0.5},
"expansion": {"max_pages": 3, "min_fill_ratio": 1.0},
}
The Context Engine will issue additional retrieval requests with increasing offset until the
pack budget is satisfied or no more results are returned.
Compaction Strategies
When overflow is set to compact, the Context Engine compacts content with a pluggable compactor.
policy = {
"input_budget": {"max_tokens": 10},
"overflow": "compact",
"compactor": {"type": "truncate"},
}
Custom compactors can be registered via compactor_registry.
FAQ
What does “elastic” mean?
Elastic means the Context Engine can contract (compact) or expand (paginate) retrieval output depending on the current token budget. When a pack is too large it compacts; when it is too small and pagination is available, it can fetch additional pages.
How is pagination used?
Retrievers accept offset and limit. The Context Engine uses those to request additional pages until a target budget is met or no more results are available.
Does this replace Context packs?
No. Context packs are still derived from retrieval evidence. The Context Engine composes those packs into model messages and manages how they are sized and placed.