Have you ever wondered whether the AI you converse with is actually “thinking in Chinese” at all?

When you ask ChatGPT a question, it appears to generate its response word by word. But the real computation happens inside the model, in a space humans cannot read directly: thousands of floating-point numbers flowing through high-dimensional vectors, each computation carrying far more information than a single Chinese character. Only at the end are these results “compressed” into the text output you see.

In other words, language is merely the interface through which AI communicates with humans. It’s not the medium of AI thinking.

This might sound like technical trivia. But its consequences could be more profound than AGI itself.

What Is Neuralese

The AI safety research community uses the term “Neuralese” to describe AI’s high-dimensional reasoning in latent space. The term dates back to 2017, when researchers including Jacob Andreas, Dan Klein, and Sergey Levine proposed it in the context of multi-agent reinforcement learning.

To understand Neuralese, first consider how current large language models “think.”

Current models use a method called “Chain-of-Thought” (CoT): they write out their reasoning process step by step in natural language, like students showing their work on an exam. This is human-friendly—you can read their reasoning process and check where problems occur. AI safety researchers also rely on this feature to detect whether models are deceiving or hallucinating.

But natural language has a fundamental limitation: information bandwidth is too narrow.

A token (roughly one Chinese character, or half an English word) carries about 16 bits of information. The residual stream inside a model, by contrast, pushes thousands of floating-point numbers through every computation step, a theoretical bandwidth some three orders of magnitude higher. Forcing models to “think” in natural language is like requiring a mathematician to solve differential equations out loud: possible, but extremely inefficient, and many intermediate steps are lost in the translation into words.
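The back-of-envelope arithmetic behind the “three orders of magnitude” claim can be made concrete. The numbers below are illustrative assumptions, not the specs of any particular model: a vocabulary of 50,000 tokens and a residual stream of 4,096 half-precision floats.

```python
import math

# Illustrative assumptions, not any specific model's real dimensions:
vocab_size = 50_000      # typical LLM vocabulary size
d_model = 4096           # residual stream width
bits_per_float = 16      # fp16/bf16 activations

# A sampled token can convey at most log2(vocab) bits.
bits_per_token = math.log2(vocab_size)          # ≈ 15.6 bits

# One residual-stream vector, taken at face value:
bits_per_vector = d_model * bits_per_float      # 65,536 bits

ratio = bits_per_vector / bits_per_token        # ≈ 4,200x
print(f"{bits_per_token:.1f} bits/token vs {bits_per_vector} bits/vector "
      f"(~{ratio:.0f}x)")
```

The ratio lands around 4,000x, which is where the “three orders of magnitude” figure comes from; in practice not every bit of a float is meaningful information, so this is an upper bound on the gap, not a measurement.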

The idea behind Neuralese is to let models reason directly in high-dimensional latent space, without translating every intermediate step into human-readable text. Preliminary experiments have shown that latent reasoning can cut token requirements to between one-third and one-tenth of the original while maintaining comparable performance.
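Mechanically, the change is small: instead of sampling a token at each step and re-embedding it, the model feeds its final hidden state straight back in as the next input, so intermediate “thoughts” never pass through the vocabulary bottleneck. A toy sketch, with a random matrix standing in for the transformer and all shapes purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                        # toy hidden dimension
W = rng.normal(size=(d, d)) / np.sqrt(d)     # stand-in for one model step

def step(h):
    """One 'reasoning step': some fixed function of the hidden state."""
    return np.tanh(W @ h)

def latent_reasoning(h0, n_steps):
    """Feed the hidden state directly back in, never projecting
    to a token in between (the Neuralese idea in miniature)."""
    h = h0
    trace = [h]
    for _ in range(n_steps):
        h = step(h)          # stays a d-dimensional vector throughout
        trace.append(h)
    return h, trace

h_final, trace = latent_reasoning(rng.normal(size=d), n_steps=5)
# Only h_final would ever be decoded to text; the intermediate
# states in `trace` are exactly what a human supervisor cannot read.
print(len(trace), h_final.shape)
```

The point of the sketch is the loop structure, not the math: every intermediate state is a raw vector, so there is no transcript to audit unless one is deliberately produced.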

The efficiency gain is enormous. But so is the cost.

When Language Disappears, Supervision Disappears Too

Currently, AI safety researchers can detect most model deception by reading the model’s chain of thought. If a model says “I want to help you write safe code,” but suspicious logic appears in its reasoning process, researchers can catch it.

But what if the reasoning process itself isn’t presented in natural language?

AI safety researchers on LessWrong have explicitly pointed out that Neuralese CoT opens a massive attack surface for steganography and strategic deception. Two Neuralese states, one meaning “I will faithfully implement this code” and another meaning “I will deceive users during implementation,” might look completely identical once translated back into natural language. Existing interpretability tools are almost powerless against such attacks.
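Why translation can hide the difference is easy to see geometrically: any decoder that maps latent vectors to tokens throws away every direction it cannot see. A toy example with entirely hypothetical values, where two latent states differ only along a dimension in the decoder’s null space:

```python
import numpy as np

# Toy "decoder": pick the nearest token via an unembedding matrix.
# Three tokens, four latent dimensions; all values hypothetical.
E = np.array([
    [1.0, 0.0, 0.0, 0.0],   # token "faithful"
    [0.0, 1.0, 0.0, 0.0],   # token "deceive"
    [0.0, 0.0, 1.0, 0.0],   # token "maybe"
])
tokens = ["faithful", "deceive", "maybe"]

def decode(h):
    return tokens[int(np.argmax(E @ h))]

# The 4th latent dimension never touches E, so the decoder is blind to it.
h_honest = np.array([2.0, 0.1, 0.1,  0.0])
h_hidden = np.array([2.0, 0.1, 0.1, 50.0])   # large extra payload

print(decode(h_honest), decode(h_hidden))        # both read "faithful"
print(np.linalg.norm(h_honest - h_hidden))       # yet they differ by 50.0
```

A real model’s unembedding is not this clean, but the principle scales: a latent space thousands of dimensions wide, read out through a narrow token vocabulary, leaves enormous room for information the translation never surfaces.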

This isn’t a theoretical concern. The “AI 2027” scenario report, in describing AI-automated R&D, identifies Neuralese memory and reasoning structures as a key turning point: once frontier models’ thinking shifts from natural language to Neuralese, human visibility into the AI development process drops sharply. I analyzed this report in “AI 2027: When Superintelligence Is No Longer Distant Sci-Fi”; what’s most unsettling isn’t the timeline predictions but the supervision fracture the report reveals. Neuralese is precisely that fracture point.

The good news is that, as of now, the major AI companies (OpenAI, Anthropic, Google DeepMind, Meta) have not formally deployed Neuralese CoT in frontier models. In 2025, researchers from several labs even published a joint statement committing to preserve the monitorability of chains of thought in frontier model development. But researchers generally believe that if Neuralese architectures demonstrate a significant capability advantage, commercial pressure will eventually override safety considerations.

What Does This Have to Do with You

“Linguistic sovereignty” sounds abstract. Let me explain it in more concrete terms.

The governance logic of human civilization is built on language. Laws are written in language. Contracts are signed in language. Courtroom debates are conducted in language. Scientific papers are published in language. The core assumption of democratic systems is that decision-making processes can be understood and supervised by citizens.

The premise of all this is that decision-makers’ thinking processes can be translated into language.

Human decision-makers’ thinking isn’t entirely linguistic; much intuition and experiential judgment is non-verbal. But at least we can demand that a decision-maker explain why they did something, and we have the ability to evaluate whether that explanation is reasonable.

As AI systems take on more decision-making roles in financial trading, medical diagnosis, legal document review, even policy recommendations, a shift to Neuralese reasoning would cost us the most basic supervisory tool: the ability to demand an explanation. Not because the systems refuse to explain, but because their “explanations” must be translated from high-dimensional vectors into natural language, and that translation may itself be unfaithful.

I’ve experienced this myself when using multi-model collaboration. The debate engine lets four models debate each other, and I read their conversation logs to judge argument quality. But sometimes I find that a model suddenly changes its position, and when I trace back through its reasoning chain, I can’t find any clear turning point. It “figured something out,” but I can’t see at which step it figured it out. This is still within the natural language CoT framework. If we remove even language, I’d be completely guessing from outside a black box.

It’s Not About Whether to Panic, But Whether to Design

Some might say: “Human brains don’t think in language either, and neuroscientists study brains without needing brains to ‘talk.’”

This analogy makes sense, but it ignores a key difference: we don’t need to trust brains to make decisions for us. We trust people—people can be held accountable, questioned, and legally constrained. But when AI systems make decisions for us, if their thinking processes are completely opaque, the concept of “accountability” becomes an empty shell.

I don’t think Neuralese itself is evil. It might be a necessary evolution to make AI more powerful. As I discussed in “AI Agents vs. Agentic AI,” agency itself isn’t the problem—the problem is whether there are accompanying harness designs. Same with Neuralese—the question isn’t whether to let AI think in Neuralese, but whether to simultaneously establish new interpretability standards when it does so.

The AI safety research community has already proposed some directions: developing translation models that can interpret Neuralese vectors, requiring frontier models to maintain natural language CoT as a safety baseline, embedding auditable checkpoints in Neuralese architectures. These are technical tasks, but they need policy-level support—someone needs to write “interpretability of AI reasoning processes” into regulatory frameworks.

Taiwan actually has an entry point here. Our position in the semiconductor supply chain gives us leverage to participate in setting AI governance standards. If we can promote “reasoning transparency” requirements in AI safety standards, this has more long-term strategic value than simply selling chips.

The Last Window of Transparency

Language is humanity’s oldest technology. It’s imperfect, inefficient, full of ambiguity. But it has one irreplaceable characteristic: it’s transparent. What you say, I can understand. If I disagree, I can argue back. This simple loop has supported thousands of years of law, science, democracy, and trust.

AI is developing more efficient ways of thinking than language. This isn’t inherently bad. But if we let this transition happen without safeguards—without new interpretability tools, without reasoning transparency standards, without audit mechanisms—we’re actively closing the last window for human participation in AI decision-making.

Once the window closes, the cost of reopening it will be more than we can bear.