Intelligence & Order

Six Dollars for 447 Translations: One Person's Cross-Platform Content Infrastructure

AI-Driven Content Pipeline × Four-Language Translation Automation × Wiki Knowledge Graph × AEO Dual-Layer Indexing: An Engineering Record of Building a Four-Language Content Pipeline for $5.99

Paul Kuo 郭曜郎 · May 2026 · 9 min read · AI-translated from the Chinese original

TL;DR: 114 articles across 4 languages, 447 files in total, for a total cost of $5.99 USD (translation $5.91 plus cover images $0.08). This is an engineering record of three content pipelines: a four-language translation pipeline driven by Claude Sonnet, a Wiki knowledge graph driven by Whisper and Haiku, and an AEO strategy built on llms.txt dual-layer indexing. Each pipeline is designed around the same principle: manifest-driven idempotency.

On May 14, 2026, I spent three hours manually translating an article into English, Japanese, and Simplified Chinese. Only after finishing did I discover there was already an automatic translation script in the repo.

Worse, my Japanese version used the polite “desu/masu” style, while the script's built-in guidance specified the plain “da/dearu” style. My version was inconsistent with the tone of more than a hundred already-translated articles, so all three language versions had to be re-translated.

Three hours of manual work, replaced by one command: node scripts/translate-article.mjs. The experience made me seriously audit what I had actually built, and I discovered that a fairly complete content infrastructure had grown organically without my noticing.

How Does a Manifest-Driven Translation Pipeline Handle 447 Files for Under $6?

Every article on paulkuo.tw has four language versions: Traditional Chinese (the source), English, Japanese, and Simplified Chinese. Across 114 source articles and 4 languages, that comes to 447 files. All translations are produced by a 400-line Node.js script that calls Claude Sonnet through the Messages API.

The core of the system is a manifest file (data/translation-manifest.json) that records each article's body hash and the translation status of each language version. On every run, the script first computes the source article's body hash and compares it with the manifest: if the hash matches, the article is skipped; if it differs, the article is re-translated.

There is a deliberate design decision here: the hash covers only the article body, not the frontmatter. Changing a cover image path, updating readingTime, or adding tags does not trigger re-translation, because translation targets content and metadata changes do not affect translation quality. This small decision saved considerable money: I frequently modify frontmatter without touching content, and if every such change triggered re-translation, costs would multiply.
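The body-only hash check can be sketched as follows. This is a minimal illustration in Python, not the author's Node.js script: the SHA-256 choice, the `---` frontmatter delimiter, and the helper names are assumptions; only the manifest idea and the `sourceHash` field name come from the article.

```python
import hashlib


def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a Markdown file into (frontmatter, body).

    Assumes YAML frontmatter delimited by '---' lines at the top of the file.
    """
    lines = text.splitlines(keepends=True)
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "".join(lines[1:i]), "".join(lines[i + 1:])
    return "", text  # no frontmatter found: treat everything as body


def body_hash(text: str) -> str:
    """Hash only the body, so frontmatter edits never change the result."""
    _, body = split_frontmatter(text)
    return hashlib.sha256(body.strip().encode("utf-8")).hexdigest()


def needs_translation(slug: str, text: str, manifest: dict) -> bool:
    """True when the body hash differs from the one recorded in the manifest."""
    recorded = manifest.get(slug, {}).get("sourceHash")
    return recorded != body_hash(text)
```

With this shape, editing readingTime or tags leaves the hash untouched, while any change to the body produces a new hash and marks the article for re-translation.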

The three language versions each have their own translation guidelines, written into the API prompt:

English follows a measured, philosophical tone and preserves theological terms in the original (Logos, Sarx, incarnation). Japanese specifies the plain “da/dearu” style rather than the polite “desu/masu” style; this became a hard rule only after I hit the pitfall described earlier. Simplified Chinese focuses on character-level traditional-to-simplified conversion with the necessary usage adjustments (軟體→软件, 網路→网络), without major sentence restructuring.

The cost? 67 API calls, $5.91 USD in total. English was the cheapest (fewer output tokens) and Japanese the most expensive (more output tokens). The average per article, for all three language versions combined, was $0.27 USD, about NT$9.

One bug nearly caused serious problems: early versions updated the manifest's sourceHash after each language inside the translation loop. Once English finished, the manifest already marked the article as translated, so the Japanese and Simplified Chinese versions were skipped. The fix was to update the hash only after all three languages are complete. It was a tiny timing issue that nearly made two-thirds of the translations disappear.
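The corrected ordering can be sketched like this, again in Python rather than the author's Node.js. The language codes, the manifest layout, and the `translate` callable are hypothetical stand-ins for the real API call; the point is only where the `sourceHash` write happens.

```python
LANGS = ["en", "ja", "zh-cn"]  # assumed language codes, for illustration


def translate_article(slug, source_hash, manifest, translate):
    """Translate one article into every target language, then record the hash.

    `translate(slug, lang)` is a hypothetical callable standing in for the
    model call. Returns the list of languages translated in this run.
    """
    entry = manifest.setdefault(slug, {"sourceHash": None, "langs": {}})
    if entry["sourceHash"] == source_hash:
        return []  # body unchanged since the last complete run: skip

    done = []
    for lang in LANGS:
        translate(slug, lang)
        entry["langs"][lang] = "translated"
        done.append(lang)

    # Write the hash only after the whole loop has finished. Writing it
    # inside the loop is the bug described above: the remaining languages
    # would then look up to date and be skipped.
    entry["sourceHash"] = source_hash
    return done
```

If a run is interrupted partway through, the hash is still unset, so the next run translates the article again instead of silently leaving languages missing.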

From Notes and YouTube to Knowledge Graph: What Does the Wiki Ingest Pipeline Do?

The translation pipeline handles articles that are already written. Upstream content (notes, YouTube videos, and web clips that have not yet become articles) needs another pipeline.

paulkuo.tw has a Wiki system inspired by Karpathy's personal knowledge graph model. It currently holds 312 ingested knowledge nodes from four sources: Dedao App notes (156), the website's own articles (93), YouTube videos (38), and web clips (25). Another 54 sit in the pending area.

Each knowledge node has a visibility grade. Public nodes go directly into the Wiki, internal nodes need de-identification first, and private nodes never enter the Wiki. The rules live in wiki_visibility.py, with one special rule: notes tagged “录音卡笔记” (recording-card notes) are meeting transcripts and get downgraded even if they sit in a public folder.
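A minimal sketch of such grading rules is shown below. It is not the contents of wiki_visibility.py: the function names and the folder labels are hypothetical, and the article does not say which grade a tagged note is downgraded to, so "internal" here is an assumption.

```python
MEETING_TAG = "录音卡笔记"  # tag named in the article (meeting transcripts)


def classify_visibility(folder: str, tags: list[str]) -> str:
    """Return 'public', 'internal', or 'private' for a note.

    `folder` is a hypothetical label for where the note lives.
    """
    if folder == "private":
        return "private"
    if MEETING_TAG in tags:
        # Meeting transcripts are downgraded even inside a public folder.
        # Downgrading to 'internal' is an assumption for illustration.
        return "internal"
    return "public" if folder == "public" else "internal"


def wiki_action(visibility: str) -> str:
    """Map a visibility grade to what the ingest step does with the note."""
    return {
        "public": "ingest",
        "internal": "de-identify then ingest",
        "private": "skip",
    }[visibility]
```

Keeping the special case ahead of the folder check means a misfiled transcript cannot reach the Wiki as public content.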

YouTube ingest is the most technically interesting segment. The script tracks 5 channels using a two-layer transcription strategy. The first layer uses yt-dlp to fetch subtitle files, preferring manual subtitles over auto-generated ones. If no subtitles exist, it falls back to the second layer: download the audio and send it to Groq's Whisper Large V3 Turbo for speech-to-text. Audio over 24MB is first compressed to mono 16kHz 32kbps Opus.
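The decision logic of the two layers can be sketched as below. The function names are hypothetical and nothing here is taken from the author's script; the code only builds command lines and does not run them. The yt-dlp and ffmpeg options used are standard ones, and treating 24MB as 24 × 1024 × 1024 bytes is an assumption.

```python
MAX_UPLOAD_BYTES = 24 * 1024 * 1024  # 24MB threshold from the article (MiB assumed)


def pick_transcript_source(has_manual_subs: bool, has_auto_subs: bool) -> str:
    """Layer 1 prefers manual subtitles, then auto-generated; layer 2 is Whisper."""
    if has_manual_subs:
        return "manual-subtitles"
    if has_auto_subs:
        return "auto-subtitles"
    return "whisper"


def subtitle_command(url: str) -> list[str]:
    """yt-dlp invocation that requests subtitles without downloading the video."""
    return ["yt-dlp", "--skip-download", "--write-subs", "--write-auto-subs", url]


def needs_compression(size_bytes: int) -> bool:
    """Audio above the threshold is compressed before transcription."""
    return size_bytes > MAX_UPLOAD_BYTES


def compress_command(src: str, dst: str) -> list[str]:
    """ffmpeg invocation: mono (-ac 1), 16 kHz (-ar 16000), 32 kbps Opus."""
    return ["ffmpeg", "-i", src, "-ac", "1", "-ar", "16000",
            "-c:a", "libopus", "-b:a", "32k", dst]
```

A caller would pass these lists to something like `subprocess.run(cmd, check=True)`; that step is left out so the sketch has no side effects.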

Transcribed content does not go directly into the Wiki. It first enters sources_pending/ for review and is promoted only after manual confirmation. After promotion comes enrichment: Haiku 4.5 (the cheapest Claude model) adds a summary, key points, quotes (with timestamps), and concept links. The model choice is deliberate: enrichment does not need Sonnet-level reasoning, and Haiku is more than sufficient for summarization and keyword extraction at one-tenth the cost.

Knowledge nodes are ultimately seeded to Cloudflare KV for use by the Wiki's search API, the knowledge graph visualization, and the /api/wiki/ask Q&A endpoint.

Making AI Crawlers Understand Your Site: llms.txt Dual-Layer Indexing and robots.txt Strategy

Of the three pipelines, this is the one furthest from traditional content management, but I believe its importance will become increasingly obvious in 2026.

paulkuo.tw's robots.txt specifically names 11 AI crawler user-agents: GPTBot, ChatGPT-User, Google-Extended, PerplexityBot, ClaudeBot, Applebot-Extended, cohere-ai, Bytespider, OAI-SearchBot, Claude-SearchBot, Perplexity-User. All are allowed to crawl content pages, and all are blocked from /api/, /auth/, /ws/, and the admin backend.

They fall into two categories: training crawlers (GPTBot and others) and search crawlers (OAI-SearchBot, Claude-SearchBot, Perplexity-User). The latter group was added in May 2026 because Answer Engine Optimization (AEO) follows a different logic from traditional SEO: Perplexity, Google AI Overview, and ChatGPT Search must be able to read your content before they will cite you when answering user questions.
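A robots.txt with this shape could be generated as in the sketch below. The generator function is hypothetical, and the admin backend path is left out because the article does not give it; the user-agent names and the three blocked paths are the ones listed above.

```python
AI_CRAWLERS = [
    "GPTBot", "ChatGPT-User", "Google-Extended", "PerplexityBot", "ClaudeBot",
    "Applebot-Extended", "cohere-ai", "Bytespider",
    "OAI-SearchBot", "Claude-SearchBot", "Perplexity-User",
]

# The admin backend path is not stated in the article, so it is omitted here.
BLOCKED_PATHS = ["/api/", "/auth/", "/ws/"]


def build_robots_txt(agents: list[str], blocked: list[str]) -> str:
    """Emit one group per crawler: allow the site, disallow the private paths."""
    groups = []
    for agent in agents:
        lines = [f"User-agent: {agent}", "Allow: /"]
        lines += [f"Disallow: {path}" for path in blocked]
        groups.append("\n".join(lines))
    return "\n\n".join(groups) + "\n"
```

Because the more specific rule wins under the Robots Exclusion Protocol, `Disallow: /api/` takes precedence over `Allow: /` for paths under /api/.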

robots.txt is just the entrance ticket. The real foundation work for AEO is the llms.txt dual-layer index, inspired by the llms.txt proposal:

The first layer, llms.txt, is a lightweight index: article lists grouped by pillar plus Wiki concepts, so AI crawlers can understand what the site covers in one scan. The second layer, llms-full.txt, is the complete content: the full markdown body of every article, so AI engines that need deep understanding can read everything at once.

Both layers are generated dynamically during the Astro build. When new articles go live or the Wiki adds concepts, the next build includes them automatically. No manual index maintenance is needed.
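The two layers can be sketched as two functions over the same article list. This is a Python illustration only; the site itself generates the files in its Astro build. The dictionary keys (`pillar`, `title`, `url`, `body`) are hypothetical, and the Wiki concepts section is omitted for brevity. The index layout (a title, a one-line summary as a blockquote, then link lists under section headings) follows the general shape of the llms.txt proposal.

```python
from collections import defaultdict


def build_llms_txt(site_name: str, summary: str, articles: list[dict]) -> str:
    """Layer 1: lightweight index with article links grouped by pillar."""
    by_pillar = defaultdict(list)
    for article in articles:
        by_pillar[article["pillar"]].append(article)

    out = [f"# {site_name}", "", f"> {summary}", ""]
    for pillar in sorted(by_pillar):
        out.append(f"## {pillar}")
        out.append("")
        for article in by_pillar[pillar]:
            out.append(f"- [{article['title']}]({article['url']})")
        out.append("")
    return "\n".join(out)


def build_llms_full_txt(site_name: str, articles: list[dict]) -> str:
    """Layer 2: the full markdown body of every article in one file."""
    parts = [f"# {site_name}"]
    for article in articles:
        parts.append(f"## {article['title']}\n\n{article['body'].strip()}")
    return "\n\n".join(parts) + "\n"
```

Since both outputs are derived from the same list at build time, a new article appears in both files on the next build without any separate index to maintain.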

This system represents a very specific position within the theme of “AI and Human Order”: I choose to be open to AI, not closed. All 11 crawlers are allowed, and the entire site's content is laid open for them to read, because in AEO logic being cited by AI is a traffic source, not a threat.

Shared Design Principles Across Three Pipelines

Looking back at these three pipelines (translation, Wiki ingest, social publishing), they appear to do completely different things but share the same design foundation: manifest-driven idempotency.

The translation pipeline's manifest tracks the body hash, ensuring the same article is not re-translated. Wiki ingest compares raw_note_id, ensuring the same note is not re-ingested. Social publishing's social-logs/{slug}-*.json records each article's publication status, ensuring the same piece is not re-scheduled.

Why does idempotency matter? Because all three pipelines can be interrupted and rerun. Translation stops midway? Rerun it, and already-translated pieces are skipped. Wiki ingest's cron runs daily regardless of whether yesterday's run succeeded. Social scheduling API timeout? The retry mechanism backs off and resends automatically.
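The retry-with-backoff behavior mentioned for the scheduling API can be sketched as below. The article does not describe the author's retry policy, so the attempt count, the doubling delays, and the exception types are all assumptions; the `sleep` parameter is injectable only so the function can be exercised without waiting.

```python
import time


def retry_with_backoff(fn, attempts: int = 4, base_delay: float = 1.0,
                       sleep=time.sleep):
    """Call `fn` until it succeeds, doubling the wait after each failure.

    Retries only on timeouts and connection errors; the last failure is
    re-raised so the caller (or the next scheduled run) can handle it.
    """
    for attempt in range(attempts):
        try:
            return fn()
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

Retrying is only safe because the operation behind `fn` is idempotent: the publication log described above is what keeps a resend from scheduling the same piece twice.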

This is the biggest difference between personal projects and enterprise systems: I don't have an SRE team watching my pipelines, so they must be designed so that they can break at any time and simply be rerun. The manifest is the core of this design: it lets every step know where it left off.

Another shared principle is model stratification. Translation uses Sonnet (it needs language quality), enrichment uses Haiku (it only needs summarization), cover images use gpt-image-1 (visual generation), and transcription uses Whisper (speech recognition). Not every task needs the strongest model. The $5.99 total is largely the result of choosing a “just enough” model at each stage rather than uniformly using the most expensive one.

Six Dollars and a Still-Growing System

Adding up costs across three pipelines: translation $5.91 + cover images $0.08 = $5.99 USD. About NT$195.

This number covers three-language translation of 114 articles (447 files), cover image generation, and all related API calls. The Wiki's Haiku enrichment and social publishing's freeimage hosting are tracked separately but cost even less. Cloudflare Workers, KV, R2, and Pages all stay within the free tier.

I don't think the $5.99 figure itself is remarkable. What is remarkable is that the pipeline design makes the cost predictable, traceable, and controllable. Every API call is logged in costs.jsonl with timestamp, model, token count, USD cost, and TWD cost. If spending suddenly spikes on any day, I can see at a glance in the JSONL which pipeline and which article caused it.
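Tracing spend from a JSONL cost log can be sketched as a simple group-and-sum. The field names used here (`pipeline`, `model`, `usd`) are hypothetical, since the article lists what is logged but not the exact keys, and the sample records in the usage note are made up for illustration.

```python
import json
from collections import defaultdict


def spend_by(records_jsonl: str, key: str) -> dict[str, float]:
    """Sum the USD cost of JSONL records, grouped by the given field."""
    totals = defaultdict(float)
    for line in records_jsonl.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the log
        record = json.loads(line)
        totals[record[key]] += record["usd"]
    return {name: round(total, 4) for name, total in totals.items()}
```

Given a log with two translation calls at $0.09 and $0.11 and one cover image call at $0.04, `spend_by(log, "pipeline")` would group them into a translation total and a cover total, and `spend_by(log, "model")` would give the same breakdown per model.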

This system is still growing. The Wiki's 38 concept pages are far from enough (the target is 200+), YouTube ingest tracks only 5 channels, and social publishing has only X and Threads truly connected through APIs. But the infrastructure is already in place (manifest, cost tracker, visibility grading, idempotency), so adding a new source or platform is just a matter of plugging it in.

In my autoresearch article, I said that autoresearch in a personal IP scenario is not about making websites understand machines; it is about making agents, together, understand you. This content infrastructure is the concrete carrier of that idea: the translation pipeline lets readers in four languages understand you, the Wiki pipeline systematizes your own knowledge structure, and llms.txt lets AI engines understand you.

Six dollars. 447 articles. Still growing.

FAQ

Q: Is AI translation quality sufficient? Don’t you need manual proofreading?

Depends on the language and purpose. English quality is stable because Claude Sonnet's English output is naturally strong. Japanese is the biggest pitfall: the script specifies the plain “da/dearu” style, but manual translation easily slips into the polite “desu/masu” style, and mixing the two reads unnaturally. Simplified Chinese is mainly character-level traditional-to-simplified conversion plus usage adjustments. Paul skims the translations for quality but does not proofread sentence by sentence, relying on the manifest system to ensure re-translation when the source changes.

Q: What is llms.txt? Why two layers?

llms.txt is an emerging proposal that lets AI crawlers quickly understand a website's content structure. paulkuo.tw implements two layers: llms.txt is a lightweight index (article list grouped by pillar plus Wiki concepts), and llms-full.txt is the complete content (full markdown of every article). Both layers are generated dynamically during the Astro build and automatically include new articles when they go live.

Q: What are the ingest sources for the Wiki knowledge graph?

Currently four sources: Paul’s Dedao App notes (156 pieces), paulkuo.tw articles themselves (93 pieces), YouTube channels (38 pieces, automatically tracking 5 channels), and web clips (25 pieces). Including 54 pieces in the pending area, totaling 366 knowledge nodes. Each has visibility grading (public / internal / private) and sensitivity marking.

Q: What’s the operational cost of the entire pipeline?

As of May 2026: translation $5.91 + cover images $0.08 = $5.99 USD (about NT$195). This is the total cost for 447 translation files + cover image generation. Wiki enrichment uses Haiku 4.5 (cheapest model), social publishing uses freeimage (free image hosting) + OneUp (scheduling tool). Cloudflare Workers/KV/R2 within free tier.

Derived from 5 sources

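Knowledge Management Runs on Pipelines, Not Self-Discipline
Paul argues that the real bottleneck in knowledge management is not collection but classification and retrieval. Manual organization that depends on self-discipline tends to break down, so he proposes using API + cron + AI Skill to build automated…
Turning paulkuo.tw into a Self-Evolving Website
Inspired by Karpathy's autoresearch, Paul Kuo rebuilt his personal website paulkuo.tw into a knowledge entity that AI can read, test, and continuously optimize…
Multi-Model in Practice: Teaming Claude and Gemini to Rebuild a Website Readable by Both Humans and AI
Paul puts the Build for Models and Agentic Web concepts into practice, using multi-model collaboration between Claude and Gemini to rebuild his personal website paulkuo.…
AI Agent Planning Guide: From Pitfalls Encountered to a Replicable Framework
Over the past year Paul built three agent systems (a debate engine, automated posting, and production-line monitoring) and distilled five principles for real-world deployment from the pitfalls he hit: clear boundary positioning, dimensionality reduction through tool integration, modular workflow decomposition, …
The Complete Guide to OpenClaw (龍蝦, “Lobster”) Skill Development: From Storyboard Scripts to Your Own AI Workflow
OpenClaw offers a systematic methodology for Skill development, turning personal work habits and domain expertise into standardized processes that AI can execute. The core is building a structured SKILL.md manual…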
