basemode¶

basemode makes chat-tuned LLMs behave like continuation engines.

Most modern models want to answer prompts. basemode does the opposite: it coerces models to continue raw text naturally, with no assistant preamble.

What it does¶

Auto-selects a continuation strategy per model (completion, prefill, system, few_shot, fim)
Streams text token-by-token from CLI or Python
Supports parallel branching (-n/--branches)
Normalizes model names across providers (claude-*, gemini-*, etc.)
Includes usage and cost estimates using LiteLLM metadata

Interfaces¶

Interface	Use case
CLI Reference	Terminal usage, streaming output, branch generation
Python API	Integration into applications and scripts
Keys and Defaults	Manage API keys and preferred model

Quick example¶

basemode "The ship rounded the headland and"

# Parallel branches
basemode "The ship rounded the headland and" -n 4

# Inspect strategy + pricing metadata
basemode info claude-sonnet-4-6

See Quickstart for a 5-minute walkthrough.