# API Reference

## Main Functions

### chatter

```python
def chatter(
    multipart_prompt: str,
    context: Union[dict, List[dict]] = {},
    *,
    model: LLM = None,
    credentials: Optional[LLMCredentials] = None,
    extra_kwargs=None,
    template_path: Optional[Path] = None,
    include_paths: Optional[List[Path]] = None,
    strict_undefined: bool = False,
    max_concurrent: Optional[int] = None,
    on_complete: Optional[callable] = None,
) -> Union[ChatterResult, List[ChatterResult]]
```
Process a struckdown template with one or more contexts.
Parameters:

| Parameter | Type | Description |
|---|---|---|
| `multipart_prompt` | str | Struckdown template string |
| `context` | dict or List[dict] | Variables for template rendering |
| `model` | LLM | Model configuration (default from env) |
| `credentials` | LLMCredentials | API credentials (default from env) |
| `extra_kwargs` | dict | Additional LLM parameters |
| `template_path` | Path | Path for resolving includes |
| `include_paths` | List[Path] | Additional include search paths |
| `strict_undefined` | bool | Raise on undefined variables |
| `max_concurrent` | int | Max concurrent requests (list mode) |
| `on_complete` | callable | Callback after each completion |
Returns: `ChatterResult` for a single context, `List[ChatterResult]` for a list of contexts.
Example:

```python
from struckdown import chatter

result = chatter("Tell me a joke [[joke]]")
print(result["joke"])
```
### chatter_async

```python
async def chatter_async(
    multipart_prompt: str,
    context: Union[dict, List[dict]] = {},
    **kwargs
) -> Union[ChatterResult, List[ChatterResult]]
```

Async version of `chatter()`. Same parameters.
Example:

```python
import asyncio
from struckdown import chatter_async

async def main():
    result = await chatter_async("Tell me a joke [[joke]]")
    print(result["joke"])

asyncio.run(main())
```
### get_embedding

```python
def get_embedding(
    texts: List[str],
    model: Optional[str] = None,
    credentials: Optional[LLMCredentials] = None,
    dimensions: Optional[int] = None,
    batch_size: int = 100,
    progress_callback: Optional[Callable[[int], None]] = None,
) -> EmbeddingResultList
```
Get embeddings for texts using API or local models.
Parameters:

| Parameter | Type | Description |
|---|---|---|
| `texts` | List[str] | Texts to embed |
| `model` | str | Model name (e.g., "text-embedding-3-small") |
| `credentials` | LLMCredentials | API credentials |
| `dimensions` | int | Output dimensions (model-specific) |
| `batch_size` | int | Texts per API batch |
| `progress_callback` | Callable | Called with count completed |
Returns: EmbeddingResultList containing EmbeddingResult arrays.
Example:

```python
from struckdown import get_embedding
import numpy as np

results = get_embedding(["hello", "world"])
similarity = np.dot(results[0], results[1])
print(f"Cost: ${results.total_cost}")
```
### get_embedding_async

```python
async def get_embedding_async(
    texts: List[str],
    **kwargs
) -> EmbeddingResultList
```

Async version of `get_embedding()`. Same parameters.
### structured_chat

```python
def structured_chat(
    prompt: str = None,
    messages: List[Dict] = None,
    return_type: BaseModel = None,
    llm: LLM = None,
    credentials: LLMCredentials = None,
    max_retries: int = 3,
    max_tokens: Optional[int] = None,
    extra_kwargs: Optional[dict] = None,
) -> Tuple[BaseModel, Box]
```
Low-level function for structured LLM calls with Pydantic models.
Returns: Tuple of (parsed response, completion object).
## Result Classes

### ChatterResult
Container for template processing results.
Properties:

| Property | Type | Description |
|---|---|---|
| `results` | Dict[str, SegmentResult] | Results by slot name |
| `response` | Any | Last slot's output |
| `outputs` | Box | All outputs as Box dict |
| `total_cost` | float | Total USD cost |
| `prompt_tokens` | int | Total input tokens |
| `completion_tokens` | int | Total output tokens |
| `total_tokens` | int | Total tokens |
| `has_unknown_costs` | bool | Any unknown costs |
| `fresh_call_count` | int | Fresh API calls |
| `cached_call_count` | int | Cache hits |
Methods:

```python
result["slot_name"]  # Get slot output
result.keys()        # List slot names
len(result)          # Number of slots
```
### EmbeddingResult

Numpy array subclass with cost metadata.

Properties:

| Property | Type | Description |
|---|---|---|
| `cost` | Optional[float] | USD cost (None if unknown) |
| `tokens` | int | Token count |
| `model` | str | Model name |
| `cached` | bool | From cache |
Works as a normal numpy array:

```python
import numpy as np

emb = results[0]
similarity = np.dot(emb, other_emb)
```
### EmbeddingResultList

List of EmbeddingResult with aggregate properties.

Properties:

| Property | Type | Description |
|---|---|---|
| `total_cost` | Optional[float] | Total cost (None if any unknown) |
| `total_tokens` | int | Total tokens |
| `cached_count` | int | Cached embeddings |
| `fresh_count` | int | Fresh embeddings |
| `fresh_cost` | Optional[float] | Cost from fresh calls only |
| `has_unknown_costs` | bool | Any unknown costs |
| `model` | str | Model name |
### CostSummary

Aggregate costs across multiple results.

```python
from struckdown import CostSummary

summary = CostSummary.from_results([result1, result2])
print(summary)  # "Total cost: $0.0012 (5 calls, 2 cached)"
```
Properties:

| Property | Type | Description |
|---|---|---|
| `total_cost` | float | Combined cost |
| `total_tokens` | int | Combined tokens |
| `prompt_tokens` | int | Combined input tokens |
| `completion_tokens` | int | Combined output tokens |
| `fresh_call_count` | int | Total fresh calls |
| `cached_call_count` | int | Total cache hits |
| `has_unknown_costs` | bool | Any unknown costs |
## Configuration Classes

### LLM

Model configuration.

```python
from struckdown import LLM

llm = LLM(model_name="gpt-4")
result = chatter("...", model=llm)
```

Fields:

| Field | Type | Default |
|---|---|---|
| `model_name` | str | `DEFAULT_LLM` env var |
### LLMCredentials

API credentials.

```python
from struckdown import LLMCredentials

creds = LLMCredentials(
    api_key="sk-...",
    base_url="https://api.openai.com/v1"
)
result = chatter("...", credentials=creds)
```

Fields:

| Field | Type | Default |
|---|---|---|
| `api_key` | str | `LLM_API_KEY` env var |
| `base_url` | str | `LLM_API_BASE` env var |
## Utility Functions

### clear_cache

```python
from struckdown import clear_cache

clear_cache()  # Clear LLM response cache
```

### clear_embedding_cache

```python
from struckdown.embedding_cache import clear_embedding_cache

clear_embedding_cache()  # Clear embedding cache
```

### get_run_id / new_run

```python
from struckdown import get_run_id, new_run

run_id = get_run_id()  # Current run ID
new_run()              # Start new run (for cache detection)
```

### mark_struckdown_safe

Mark content as safe to bypass auto-escaping.

```python
from struckdown import mark_struckdown_safe

safe_content = mark_struckdown_safe("<system>...</system>")
```