Field guide

Token budget anatomy: dissecting an 8000-token lorebook

The token_budget field on a lorebook isn't a guideline — it's a ceiling the runtime enforces by silently dropping entries. Knowing the formula lets you author for it instead of against it.

What token_budget actually does

When a turn begins, the runtime scans your conversation window for keyword matches, collects every triggered entry's content, and adds them to the prompt — until the total reaches token_budget. Entries past the ceiling are dropped, lowest priority first. The user never sees a warning; the missing lore is invisible.
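The drop rule above can be sketched in a few lines. This is a minimal illustration rather than the runtime's actual code: the Entry class, the apply_budget function, and the convention that a higher priority number means more important are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    name: str
    tokens: int    # estimated token cost of the entry's content
    priority: int  # assumed convention: higher number = more important

def apply_budget(triggered, token_budget):
    """Drop lowest-priority entries until the rest fit under token_budget."""
    # Highest priority first, so the tail of the list is the lowest priority.
    kept = sorted(triggered, key=lambda e: e.priority, reverse=True)
    dropped = []
    total = sum(e.tokens for e in kept)
    while kept and total > token_budget:
        victim = kept.pop()  # remove the current lowest-priority entry
        total -= victim.tokens
        dropped.append(victim)
    return kept, dropped

triggered = [
    Entry("location", 600, priority=1),
    Entry("faction", 800, priority=2),
    Entry("npc", 1200, priority=3),
]
kept, dropped = apply_budget(triggered, token_budget=2000)
```

With a deliberately tight 2000-token ceiling, the location entry is the one that disappears, and nothing in the returned prompt content signals that it was ever triggered.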

An 8000-token budget sounds generous until you realize a single richly-written location entry can be 600 tokens, a faction primer 800, an NPC dossier 1200. If those three fire on one turn, you've spent 2600 tokens, about a third of the budget.

Rune count, not character count

The lint engine's budget rule is a heuristic: it divides an entry's rune count by a runes-per-token ratio that depends on the script:

Latin-script entries: ≈ 4 runes per token. A 4000-rune entry is about 1000 tokens.
CJK-dominant entries (more than 30% wide-script characters): ≈ 1.5 runes per token. The same rune count costs roughly 2.7× as many tokens. A 4000-rune Chinese entry is ~2667 tokens.

This matters because two cards can show identical in-app rune counts while one sails under budget and the other slams into the ceiling. Bilingual cards routinely hit this.
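As a rough sketch of that heuristic, the function below applies the two ratios and the 30% threshold quoted above. The names are hypothetical, Unicode East Asian Width (W or F) is used here as a stand-in for "wide-script", and rounding up is an assumption; the lint engine's own classification and rounding may differ.

```python
import math
import unicodedata

LATIN_RUNES_PER_TOKEN = 4.0
CJK_RUNES_PER_TOKEN = 1.5
WIDE_THRESHOLD = 0.30  # above this share of wide runes, treat the entry as CJK-dominant

def is_wide(ch):
    # Assumption: East Asian Width "W" (wide) or "F" (fullwidth) approximates wide-script.
    return unicodedata.east_asian_width(ch) in ("W", "F")

def estimate_tokens(text):
    runes = len(text)  # a Python str is counted in code points (runes), not bytes
    if runes == 0:
        return 0
    wide_share = sum(is_wide(ch) for ch in text) / runes
    ratio = CJK_RUNES_PER_TOKEN if wide_share > WIDE_THRESHOLD else LATIN_RUNES_PER_TOKEN
    return math.ceil(runes / ratio)  # rounding up is an assumption

latin_cost = estimate_tokens("a" * 4000)   # 1000
cjk_cost = estimate_tokens("中" * 4000)     # 2667
```

Both inputs are 4000 runes long, yet the second estimate is about 2.7 times the first, which is the gap a bare rune counter hides.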

The three categories

Entries fall into three slots that interact differently with the budget:

Constant: always added to context every turn, regardless of keywords. These are your foundation — world rules, persona anchors. They consume budget up front; everything else fights over the remainder.
Selective (keyword-triggered): only added when keys match the scan window. Most flexible, most lossy when budget runs short — sorted by priority, then dropped tail-first.
Recursive: activated by another entry's content rather than your messages. Powerful for chained lore, but every recursion round consumes budget; a 6-deep chain can quietly consume 1200 tokens.
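To make the recursion cost concrete, here is a minimal sketch of chained activation. It assumes an entry becomes active when one of its keys appears in the content activated in the previous round, with a cap on the number of rounds; the function name, the entry layout, and the sample lore are all hypothetical.

```python
def expand_recursive(seed_names, entries, max_rounds=6):
    """Follow entry-to-entry activations for at most max_rounds rounds.

    entries maps a name to {"keys": [...], "content": str, "tokens": int}.
    Returns the activated names in activation order.
    """
    active = list(seed_names)
    frontier = list(seed_names)
    for _ in range(max_rounds):
        # Only content added in the previous round is scanned for keys.
        scanned = " ".join(entries[n]["content"] for n in frontier).lower()
        frontier = [
            name for name, entry in entries.items()
            if name not in active
            and any(key.lower() in scanned for key in entry["keys"])
        ]
        if not frontier:
            break
        active.extend(frontier)
    return active

lore = {
    "city":  {"keys": ["harbor"], "content": "Ruled by the Guild.", "tokens": 200},
    "guild": {"keys": ["guild"],  "content": "Founded by Maera.",   "tokens": 200},
    "maera": {"keys": ["maera"],  "content": "A retired smith.",    "tokens": 200},
}
chain = expand_recursive(["city"], lore)
chain_cost = sum(lore[name]["tokens"] for name in chain)
```

One keyword hit on the city entry pulls in two more entries that no message ever mentioned, tripling the cost of that single match. At 200 tokens per hop, for instance, six hops would add 1200 tokens.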

A worked example: 8000-token budget

Assume 30 entries: 5 constant (anchors), 20 selective (NPCs, locations, factions), 5 recursive (deeper lore). Average constant size 200 tokens, selective 350 tokens, recursive 500.

constant total: 5 × 200 = 1000 tokens (always-on floor)
if 6 selectives fire: 6 × 350 = 2100 tokens
if 2 recursive fire from those: 2 × 500 = 1000 tokens
turn cost: 1000 + 2100 + 1000 = 4100 tokens (under 8000)

Headroom: 3900 tokens. That's your safety margin for assistant response + system prompt + persona block. Cut it too close and the model runs out of generation budget before it can finish a paragraph.
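The same arithmetic is easy to script so you can try other firing patterns. This sketch uses only the averages and counts from the example above; the variable and function names are illustrative.

```python
TOKEN_BUDGET = 8000
AVG_TOKENS = {"constant": 200, "selective": 350, "recursive": 500}

def turn_cost(fired):
    """fired maps a category name to the number of entries active this turn."""
    return sum(AVG_TOKENS[kind] * count for kind, count in fired.items())

# The turn from the worked example: all 5 constants, 6 selectives, 2 recursives.
typical = {"constant": 5, "selective": 6, "recursive": 2}
cost = turn_cost(typical)
headroom = TOKEN_BUDGET - cost

# Worst case: every one of the 30 entries fires on the same turn.
worst = {"constant": 5, "selective": 20, "recursive": 5}
worst_cost = turn_cost(worst)
overflow = worst_cost - TOKEN_BUDGET

print(cost, headroom)        # 4100 3900
print(worst_cost, overflow)  # 10500 2500
```

The worst case is the one worth checking: if all 30 entries fired at once, the card would need 10500 tokens and overshoot the budget by 2500, so entries would be dropped.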

Pre-flight your budget

Reading the budget number on a card tells you nothing about how it will actually be spent. Drop your card and a sample conversation into the lorebook debugger: every entry shows its estimated token cost next to its activation reason, and the debugger sums what would fire per turn. Run it on a CJK conversation and an English one separately; the totals will differ even with identical text length.

For a static, share-friendly version of the same numbers (lint findings on budget, content length, recursion cycles), use /audit.