PNG character card metadata explained: tEXt chunks, JSON, and broken imports.
A working creator’s tour through the PNG tEXt chunk — how the Tavern card format hides a JSON payload inside an ordinary image, why some hosts, image editors, and CDNs quietly throw it away, and how to verify, preserve, and rebuild what they break.
A character card is just a PNG
Open a Tavern character card in a hex viewer and the first eight bytes are 89 50 4E 47 0D 0A 1A 0A — the PNG signature. Open it in your browser and you see a portrait. Drag it into SillyTavern, RisuAI, or Chub, and somehow a full JSON document — name, description, greetings, lorebook, system prompt — appears alongside the picture. There is no sidecar file. The metadata is inside the PNG itself, in a place every PNG decoder is required to walk past without complaint.
That place is the tEXt chunk. Understanding what it is — and what kills it — is the difference between a card library that survives every export, upload, and CDN, and one that mysteriously arrives at a friend’s machine as a blank-faced portrait with no personality at all.
How a PNG is actually built
A PNG file is a fixed 8-byte signature followed by a stream of chunks. Each chunk has the same skeleton:
┌──────────┬──────────┬──────────────┬──────────┐
│ length   │ type     │ data         │ CRC-32   │
│ 4 bytes  │ 4 bytes  │ length bytes │ 4 bytes  │
└──────────┴──────────┴──────────────┴──────────┘
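The skeleton can be sketched in a few lines of Python. This is a minimal illustration rather than any library's API: make_chunk and read_chunk are hypothetical helper names. The detail worth noticing is that all integers are big-endian and the CRC-32 covers the type and data fields, not the length.

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialise one PNG chunk: length, type, data, CRC-32 (big-endian)."""
    if len(chunk_type) != 4:
        raise ValueError("chunk type must be exactly 4 bytes")
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF  # CRC covers type + data
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def read_chunk(buf: bytes, offset: int = 0):
    """Parse one chunk at `offset`; return (type, data, next_offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    chunk_type = buf[offset + 4 : offset + 8]
    data = buf[offset + 8 : offset + 8 + length]
    (crc,) = struct.unpack_from(">I", buf, offset + 8 + length)
    if crc != zlib.crc32(chunk_type + data) & 0xFFFFFFFF:
        raise ValueError("CRC mismatch in chunk %r" % chunk_type)
    return chunk_type, data, offset + 12 + length


# The empty IEND chunk is identical in every PNG file.
print(make_chunk(b"IEND", b"").hex())  # 0000000049454e44ae426082
```

Every chunk costs 12 bytes of framing on top of its data, which is why a reader can hop from chunk to chunk without understanding any of them.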
The four-byte type is what matters here. PNG uses the case of each letter to encode handling rules:
- First letter uppercase → critical chunk (e.g. IHDR, IDAT, IEND). Decoders must understand it.
- First letter lowercase → ancillary chunk. Decoders may skip it. Image data still renders fine.
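- Second letter uppercase → public chunk defined by the PNG specification; lowercase → private, application-defined chunk.
- Third letter is reserved (always uppercase today).
- Fourth letter lowercase → safe to copy, even by editors that don’t recognise the chunk; uppercase → unsafe to copy: an editor that doesn’t recognise the chunk must drop it if it has modified any critical chunk.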
tEXt is lowercase-first (ancillary, decoders may skip) and lowercase-last (safe to copy). On paper, exactly the kind of chunk that should round-trip through any well-behaved tool. In practice, “well-behaved” turns out to be the load-bearing word.
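The case rules are easy to check mechanically. The sketch below decodes all four property bits of a chunk type, including the second letter, which marks public versus private chunks. chunk_flags is a hypothetical helper name used only for illustration.

```python
def chunk_flags(chunk_type: str) -> dict:
    """Decode the four PNG property bits, which are carried by letter case."""
    if len(chunk_type) != 4 or not chunk_type.isascii() or not chunk_type.isalpha():
        raise ValueError("chunk type must be four ASCII letters")
    return {
        "ancillary": chunk_type[0].islower(),         # decoders may skip it
        "private": chunk_type[1].islower(),           # not defined by the spec
        "reserved_bit_set": chunk_type[2].islower(),  # should be False today
        "safe_to_copy": chunk_type[3].islower(),      # editors may copy blindly
    }


for name in ("IHDR", "IDAT", "tEXt", "iTXt"):
    print(name, chunk_flags(name))
```

For tEXt this reports an ancillary, public, safe-to-copy chunk, which is what makes it legal both to preserve and to drop.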
tEXt, zTXt, iTXt — three flavours of text
PNG actually defines three textual chunk types. The community settled on the simplest one for character cards, but it is worth knowing why:
- tEXt: a Latin-1 keyword (1–79 bytes), a single null separator, and a Latin-1 text value. No compression, no language tag. The keyword for a Tavern V1/V2 card is literally chara.
- zTXt: the same keyword and separator, followed by a one-byte compression method and a zlib-compressed value. Useful for very large payloads, but adds a step every writer must implement correctly.
- iTXt: the modern one, with UTF-8 text, optional compression, and an optional language tag and translated keyword. Strictly more capable than tEXt.
So why did character cards land on tEXt if iTXt is technically superior? Two reasons, both pragmatic. First, every PNG library has supported tEXt forever; some lighter or older toolchains still ship patchy iTXt support. Second, the community solved the Latin-1 problem by punting: instead of putting JSON directly in the value, the convention is to Base64-encode UTF-8 JSON and store the result as ASCII inside the tEXt value. ASCII is a subset of Latin-1, so the encoding is trivially safe regardless of how strictly a reader interprets the spec.
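The Base64 convention can be sketched as follows. This is a minimal illustration, not the code any particular frontend uses: encode_card_value and text_chunk_data are hypothetical names, and the card fields are made-up sample data chosen to include non-Latin-1 characters.

```python
import base64
import json


def encode_card_value(card: dict) -> bytes:
    """UTF-8 JSON -> Base64, giving pure-ASCII bytes for a tEXt value."""
    raw = json.dumps(card, ensure_ascii=False).encode("utf-8")
    return base64.b64encode(raw)


def text_chunk_data(keyword: str, value: bytes) -> bytes:
    """tEXt data field: Latin-1 keyword, one null byte, then the text."""
    key = keyword.encode("latin-1")
    if not 1 <= len(key) <= 79:
        raise ValueError("keyword must be 1-79 bytes")
    return key + b"\x00" + value


card = {"name": "Zoë", "first_mes": "こんにちは"}  # sample data only
value = encode_card_value(card)
assert value.isascii()  # safe under the strictest Latin-1 reading
assert json.loads(base64.b64decode(value).decode("utf-8")) == card
data = text_chunk_data("chara", value)
```

The price of this safety is size: Base64 inflates the payload by about a third, and tEXt offers no compression to win it back.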
What Tavern cards actually write
Concretely, a Tavern character card writes one or two tEXt chunks, both before the final IEND:
- chara — used by V1 and V2 cards. The value is Base64 of UTF-8 JSON. For V2, the JSON has the spec: "chara_card_v2" envelope with everything real living under data.
- ccv3 — introduced for V3 cards. The value is Base64 of UTF-8 JSON with spec: "chara_card_v3". Cards that want maximum compatibility ship both: a chara chunk with a V2-shaped payload, and a sibling ccv3 chunk with the V3-shaped one.
A minimal decode is therefore: walk the chunks, find one whose type is tEXt with keyword ccv3 (preferred) or chara (fallback), Base64-decode the value, then UTF-8-decode and JSON.parse. No magic; no proprietary container; nothing a stock PNG decoder cannot reach.
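That decode path fits in the standard library alone. The sketch below is one possible reading of the steps just described, with hypothetical function names (iter_chunks, read_card). It assumes a well-formed file and skips CRC verification for brevity.

```python
import base64
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(png: bytes):
    """Yield (type, data) for every chunk after the 8-byte signature."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 12 <= len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        ctype = png[pos + 4 : pos + 8]
        yield ctype, png[pos + 8 : pos + 8 + length]
        pos += 12 + length  # length + type + data + CRC
        if ctype == b"IEND":
            break


def read_card(png: bytes) -> dict:
    """Return the embedded card JSON, preferring ccv3 over chara."""
    found = {}
    for ctype, data in iter_chunks(png):
        if ctype == b"tEXt" and b"\x00" in data:
            keyword, _, value = data.partition(b"\x00")
            found.setdefault(keyword.decode("latin-1"), value)
    for key in ("ccv3", "chara"):
        if key in found:
            return json.loads(base64.b64decode(found[key]).decode("utf-8"))
    raise LookupError("no chara or ccv3 tEXt chunk found")
```

With a file on disk, read_card(open("card.png", "rb").read()) returns the parsed card. A LookupError means the image decoded fine but carries no card payload.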
Where the metadata goes missing
Because tEXt is an ancillary chunk, any tool that re-encodes the image is technically allowed to drop it. Many do. The most common ways a card arrives blank-faced at the destination:
- Image editor “Save As”. Photoshop, GIMP, Affinity, Pixelmator — each has its own opinion about ancillary chunks. The default settings usually preserve them, but “Export As PNG”, “Save for Web”, or any path through a stripping optimiser will silently drop tEXt.
- Browsers re-encoding on paste or download. Some browser pipelines that screenshot, canvas-export, or background-fetch images re-encode them through the platform’s PNG writer. Canvas-based round trips almost always lose tEXt because the canvas API has no way to carry it through.
- CDN image optimisation. Cloudflare Polish, Vercel Image Optimization, Cloudinary auto-format, and similar services treat ancillary chunks as dead weight. They re-compress aggressively and strip anything not strictly required to render. If your card art is served through /_next/image or behind a polished CDN, the file the visitor downloads is not the file you uploaded.
- WebP / AVIF conversion. WebP and AVIF have their own metadata containers (EXIF, XMP), but most converters do not bridge PNG tEXt into them. Any pipeline that auto-converts uploads to WebP/AVIF for delivery will silently destroy the embedded card JSON unless you opt out.
- Platform stripping. Some social platforms run uploaded images through a sanitiser to remove EXIF GPS and similar privacy leakage. Those sanitisers tend to be PNG-chunk-greedy and remove everything ancillary, not just EXIF. The card travels through Discord, Twitter, or a privacy proxy and arrives stripped.
- Resizing libraries that don’t copy ancillary chunks. Sharp, ImageMagick, and friends all can preserve tEXt but most pipelines do not enable it. A thumbnail generator that runs every uploaded image through Sharp with default settings will drop the metadata before it ever touches the CDN.
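All of these failure modes have the same effect at the chunk level, and that effect is easy to model. The sketch below is a toy, not the behaviour of any specific tool named above: strip_ancillary keeps only critical chunks, and lost_chunk_types compares a file before and after a pipeline. Real optimisers typically also recompress the pixel data and may be more selective about which ancillary chunks they keep. All function names are hypothetical.

```python
import struct
from collections import Counter

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def split_chunks(png: bytes) -> list:
    """Return the raw chunks (length + type + data + CRC) in file order."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks, pos = [], 8
    while pos + 12 <= len(png):
        (length,) = struct.unpack_from(">I", png, pos)
        chunks.append(png[pos : pos + 12 + length])
        pos += 12 + length
    return chunks


def chunk_type(chunk: bytes) -> str:
    """Four-letter type of a raw chunk."""
    return chunk[4:8].decode("latin-1")


def strip_ancillary(png: bytes) -> bytes:
    """Toy model of a stripping optimiser: keep only critical chunks."""
    kept = [c for c in split_chunks(png) if chunk_type(c)[0].isupper()]
    return PNG_SIGNATURE + b"".join(kept)


def lost_chunk_types(before: bytes, after: bytes) -> Counter:
    """Count chunk types present in `before` but missing from `after`."""
    seen_before = Counter(chunk_type(c) for c in split_chunks(before))
    seen_after = Counter(chunk_type(c) for c in split_chunks(after))
    return seen_before - seen_after
```

Running lost_chunk_types on the file you uploaded and the file a visitor actually downloads is a quick way to find out which hop in the chain is eating the card.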
How to inspect a card
The fastest sanity check is pngcheck, a small CLI that has been the standard PNG validator for decades. With the -t flag it prints text chunks:
$ pngcheck -t card.png
OK: card.png (512x768, 32-bit RGB+alpha, non-interlaced, 87.4%).
chunk tEXt at offset 0x00021, length 4242, keyword: chara
chunk tEXt at offset 0x01a55, length 5180, keyword: ccv3
If you see only the image dimensions and no text chunks, the metadata is gone. From there, programmatic inspection in Python with Pillow takes only a handful of lines:
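from PIL import Image
import base64, json

img = Image.open("card.png")
# img.text maps each text chunk keyword to its value; prefer V3, fall back to V1/V2
raw = img.text.get("ccv3") or img.text.get("chara")
if raw is None:
    raise SystemExit("no chara/ccv3 chunk: the metadata was stripped")
card = json.loads(base64.b64decode(raw))
# V2/V3 nest the fields under "data"; V1 cards keep them at the top level
print(card.get("data", card)["name"])
Pillow exposes every tEXt / zTXt / iTXt chunk through the same img.text dict, keyed by the chunk’s keyword. If img.text is empty, the file has been laundered. If the Base64 decode or the JSON parse after it raises, the value was most likely written as raw UTF-8 JSON instead of Base64, a known-bad pattern from a handful of third-party tools. Note that base64.b64decode silently discards non-alphabet characters by default, so the error often surfaces at the JSON step rather than the Base64 step.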
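To tell those failure cases apart, a stricter check helps. The sketch below is one possible classifier, with a hypothetical name (diagnose_value) and made-up status labels. It uses validate=True so that non-Base64 input is rejected instead of being silently filtered, then checks whether the value parses as raw JSON.

```python
import base64
import binascii
import json


def diagnose_value(raw):
    """Classify a chara/ccv3 text value; return (status, card_or_None)."""
    if not raw:
        return "missing", None
    try:
        decoded = base64.b64decode(raw, validate=True)
    except (binascii.Error, ValueError):
        # Not strict Base64: check for the raw-JSON pattern.
        try:
            return "raw-json", json.loads(raw)
        except ValueError:
            return "not-base64", None
    try:
        return "ok", json.loads(decoded.decode("utf-8"))
    except ValueError:  # covers bad UTF-8 and bad JSON alike
        return "bad-json", None


print(diagnose_value(base64.b64encode(b'{"name": "Ada"}')))
print(diagnose_value('{"name": "Ada"}'))
```

A "raw-json" result is recoverable: the card content is intact and only needs to be re-encoded and re-embedded. A "missing" result needs a backup.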
How to keep the metadata alive
A short, opinionated checklist for shipping cards that survive the trip:
- Pick chunk-preserving tools. For resizing: Pillow with img.save(..., pnginfo=...); Sharp with .png({ force: true }).withMetadata() plus a manual chunk copy. For editors: prefer tools that advertise “preserve metadata” and verify with pngcheck -t after a round trip.
- Turn off CDN image optimisation for card files. On Cloudflare, exclude card paths from Polish with a rule, or have the origin send Cache-Control: no-transform, the standard HTTP directive that asks intermediaries not to alter the response body. On Vercel, serve cards via a route that bypasses /_next/image. Treat the canonical card file as a download artefact, not a rendered asset.
- Never serve cards as WebP/AVIF. Disable auto-format on Cloudinary, imgix, and similar. PNG is the wire format the spec assumes; anything else must be a knowing decision.
- Keep a JSON backup next to the PNG. The PNG is the friendly distribution format; the JSON is the source of truth. If a CDN, platform, or browser laundered the file, you can always rebuild the PNG by re-embedding the JSON. Without the JSON, a stripped card is gone.
- Verify after every pipeline change. A single pngcheck -t in CI on a known-good fixture catches an entire class of silent regressions that otherwise only show up when a user complains their import is empty.
One codex, no laundered cards
tavernai.cards handles all of this for you: it reads chara and ccv3 chunks, validates the JSON against both V2 and V3 specs, and writes back PNGs whose tEXt chunks survive any reasonable redistribution path. If your library has lost metadata along the way, the converter can rebuild from a JSON backup; if your card looks fine but imports empty somewhere, the linter tells you exactly which step in the chain dropped it.
Stop debugging blank imports one user at a time. Get a single workbench that knows which CDN, which converter, and which editor just ate your tEXt chunk.