Field guide
Token budget anatomy: dissecting an 8000-token lorebook
The token_budget field on a lorebook isn't a guideline — it's a ceiling the runtime enforces by silently dropping entries. Knowing the formula lets you author for it instead of against it.
What token_budget actually does
When a turn begins, the runtime scans your conversation window for keyword matches, collects every triggered entry's content, and adds them to the prompt — until the total reaches token_budget. Entries past the ceiling are dropped, lowest priority first. The user never sees a warning; the missing lore is invisible.
An 8000-token budget sounds generous until you realize a single richly-written location entry can be 600 tokens, a faction primer 800, an NPC dossier 1200. Three entries fire on one turn and you've already burned a third.
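The dropping behavior described above can be sketched in a few lines. This is a minimal illustration, not the runtime's actual code: the `Entry` class, the `fit_to_budget` function, and the convention that a higher `priority` number means more important are all assumptions made for the example. It also assumes an entry that does not fit is skipped while smaller, lower-priority entries may still be admitted; the real runtime may instead stop at the first overflow.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    tokens: int    # estimated token cost of the entry's content
    priority: int  # assumption: higher number = more important


def fit_to_budget(triggered, token_budget):
    """Return (kept, dropped, used) for triggered entries under a token ceiling."""
    kept, dropped = [], []
    used = 0
    # Walk highest priority first, so the lowest-priority entries are the ones cut.
    for entry in sorted(triggered, key=lambda e: e.priority, reverse=True):
        if used + entry.tokens <= token_budget:
            kept.append(entry)
            used += entry.tokens
        else:
            # Dropped silently: nothing here would surface a warning to the user.
            dropped.append(entry)
    return kept, dropped, used


triggered = [
    Entry("location", 600, priority=1),
    Entry("faction primer", 800, priority=2),
    Entry("NPC dossier", 1200, priority=3),
]
kept, dropped, used = fit_to_budget(triggered, token_budget=2000)
print([e.name for e in kept], [e.name for e in dropped], used)
```

With only 2000 tokens left, the dossier and the faction primer fit exactly and the location entry is the one that disappears.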
Rune count, not character count
The lint engine's budget rule estimates tokens by dividing an entry's rune count by a runes-per-token ratio that depends on the script:
- Latin-script entries: ≈ 4 runes per token. A 4000-rune entry is about 1000 tokens.
- CJK-dominant entries (> 30% wide-script): ≈ 1.5 runes per token. The same rune count costs roughly 2.7× as many tokens: a 4000-rune Chinese entry is ~2667 tokens.
This matters because the in-app rune counts can read identically while one card sails under budget and the other slams into the ceiling. Bilingual cards routinely hit this.
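A rough version of that heuristic, for checking entries by hand. The ratios and the 30% threshold come from the rule above; everything else is an assumption for illustration, including the function names and the use of Unicode East Asian Width classes `W` and `F` as a stand-in for "wide-script". The lint engine may classify characters differently.

```python
import unicodedata

LATIN_RUNES_PER_TOKEN = 4.0
CJK_RUNES_PER_TOKEN = 1.5
WIDE_SCRIPT_THRESHOLD = 0.30


def is_wide(ch):
    # Assumption: "wide-script" is approximated by East Asian Wide/Fullwidth.
    return unicodedata.east_asian_width(ch) in ("W", "F")


def estimate_tokens(text):
    """Estimate tokens as rune count divided by a script-dependent ratio."""
    runes = len(text)  # len() of a Python str counts code points, i.e. runes
    if runes == 0:
        return 0
    wide_share = sum(1 for ch in text if is_wide(ch)) / runes
    if wide_share > WIDE_SCRIPT_THRESHOLD:
        ratio = CJK_RUNES_PER_TOKEN
    else:
        ratio = LATIN_RUNES_PER_TOKEN
    return round(runes / ratio)


print(estimate_tokens("a" * 4000))   # 1000
print(estimate_tokens("中" * 4000))  # 2667
```

Both inputs are 4000 runes long, which is why the in-app count alone cannot tell you which one is near the ceiling.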
The three categories
Entries fall into three slots that interact differently with the budget:
- Constant: always added to context every turn, regardless of keywords. These are your foundation — world rules, persona anchors. They consume budget up front; everything else fights over the remainder.
- Selective (keyword-triggered): only added when keys match the scan window. Most flexible, most lossy when budget runs short — sorted by priority, then dropped tail-first.
- Recursive: activated by another entry's content rather than your messages. Powerful for chained lore, but every recursion round consumes budget; a 6-deep chain can quietly consume 1200 tokens.
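To see how a recursive chain grows, here is a small sketch of round-by-round activation. It is a simplification under stated assumptions: the `activate` function, the dict layout, case-insensitive substring matching, and the `max_depth` cap are all hypothetical, and constant entries are left out because they never depend on keywords.

```python
def activate(entries, message, max_depth=6):
    """Return {entry name: round in which it activated}.

    entries maps name -> {"keys": [...], "content": str, "recursive": bool}.
    Round 0 scans the user's message; later rounds scan the content of the
    entries activated in the previous round.
    """
    activated = {}
    scan_text = message.lower()
    for depth in range(max_depth + 1):
        newly = []
        for name, entry in entries.items():
            if name in activated:
                continue
            # Selective entries only match in round 0, recursive ones only later.
            if entry["recursive"] != (depth > 0):
                continue
            if any(key.lower() in scan_text for key in entry["keys"]):
                newly.append(name)
        if not newly:
            break
        for name in newly:
            activated[name] = depth
        scan_text = " ".join(entries[n]["content"] for n in newly).lower()
    return activated


lore = {
    "harbor": {"keys": ["harbor"], "recursive": False,
               "content": "The harbor is run by the Tide Guild."},
    "guild": {"keys": ["tide guild"], "recursive": True,
              "content": "The Tide Guild answers to Captain Mora."},
    "mora": {"keys": ["captain mora"], "recursive": True,
             "content": "Captain Mora has not been seen in a year."},
}
print(activate(lore, "We walk down to the harbor."))
# {'harbor': 0, 'guild': 1, 'mora': 2}
```

One keyword in the user's message pulled in three entries. Each extra round adds its entries' full token cost to the turn.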
A worked example: 8000-token budget
Assume 30 entries: 5 constant (anchors), 20 selective (NPCs, locations, factions), 5 recursive (deeper lore). Average constant size 200 tokens, selective 350 tokens, recursive 500.
- constant total: 5 × 200 = 1000 tokens (always-on floor)
- if 6 selectives fire: 6 × 350 = 2100 tokens
- if 2 recursive fire from those: 2 × 500 = 1000 tokens
- turn cost: 1000 + 2100 + 1000 = 4100 tokens (under 8000)
Headroom: 3900 tokens. That's your safety margin for assistant response + system prompt + persona block. Cut it too close and the model runs out of generation budget before it can finish a paragraph.
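The same arithmetic as a small script, handy for plugging in your own averages. The variable and function names are made up for this example; the numbers are the ones from the worked example above. The last two lines add one check the example does not spell out: what happens if all 30 entries fire on the same turn.

```python
TOKEN_BUDGET = 8000
AVG_TOKENS = {"constant": 200, "selective": 350, "recursive": 500}


def turn_cost(fired):
    """Sum average token cost over the number of entries fired per category."""
    return sum(AVG_TOKENS[category] * count for category, count in fired.items())


# The turn from the worked example: all 5 constants, 6 selectives, 2 recursives.
cost = turn_cost({"constant": 5, "selective": 6, "recursive": 2})
headroom = TOKEN_BUDGET - cost
print(cost, headroom)  # 4100 3900

# Worst case: every one of the 30 entries fires at once.
worst_case = turn_cost({"constant": 5, "selective": 20, "recursive": 5})
print(worst_case, worst_case - TOKEN_BUDGET)  # 10500 2500
```

The typical turn fits comfortably, but the full card is 2500 tokens over budget, so on a keyword-dense turn the lowest-priority entries will be dropped.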
Pre-flight your budget
Reading the budget number on a card tells you nothing about how that budget will actually be spent. Drop your card and a sample conversation into the lorebook debugger: every entry shows its estimated token cost next to its activation reason, and the debugger sums what would fire per turn. Run it on a CJK conversation and an English one separately; the totals will differ even when the text length is identical.
For a static, share-friendly version of the same numbers (lint findings on budget, content length, recursion cycles), use /audit.