Reverse Engineering of Claude Fable 5.0 and Mythos Class

65% of a 120K-character system prompt does not tell the model how to think. It tells it how to format, how to comply with copyright law, and how to use the tools stitched to its body. The LLM is not a brain: it is a JIT compiler of SaaS micro-applications.

◈ 1. Context: Mythos-Class vs. Fable-Class

Dimension	Claude Fable 5.0	Claude Mythos 5.0
Distribution	Commercial, generalist	Governments / partners with closed accreditation
Security Classifiers	Active (cybersecurity, biology, chemistry)	No restrictive classifiers
Fallback Routing	Redirects to Opus 4.8 if risk detected	No fallback routing
Knowledge Cutoff	End of January 2026	End of January 2026
API String	claude-fable-5	claude-mythos-5
Neural Network	Shared architecture	Shared architecture

Both models share the same underlying neural network. The difference is purely operational: the commercial Fable version adds security classification layers that intercept high-risk queries. Mythos is the "naked" model, without filtering restrictions, distributed exclusively to organizations with prior accreditation under closed security schemes.
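
The operational split in the table above can be pictured as a small routing function. This is a hypothetical sketch, not Anthropic's implementation: `classifyRisk`, `routeModel`, the `tags` field, and the `opus-4.8-placeholder` identifier are all assumptions introduced for illustration (the source names "Opus 4.8" but gives no API string for it).

```javascript
// Hypothetical sketch of classifier-gated routing; none of these function names come from the source prompt.
const RISK_DOMAINS = ['cybersecurity', 'biology', 'chemistry']; // domains listed in the table
const FALLBACK_MODEL = 'opus-4.8-placeholder'; // placeholder: the real API string is not given

// Toy stand-in for a real classifier: reports which risk domains a query was tagged with.
function classifyRisk(query) {
  return RISK_DOMAINS.filter((domain) => query.tags.includes(domain));
}

function routeModel(query, { classifiersEnabled }) {
  // Mythos-style deployment: no classifiers, no fallback routing.
  if (!classifiersEnabled) return { model: 'claude-mythos-5', flagged: [] };

  // Fable-style deployment: flagged queries are redirected to the fallback model.
  const flagged = classifyRisk(query);
  const model = flagged.length > 0 ? FALLBACK_MODEL : 'claude-fable-5';
  return { model, flagged };
}
```

Under these assumptions the two products differ only in the `classifiersEnabled` flag, which mirrors the claim that the network itself is shared.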

◈ 2. System Prompt Structure: 7 Macro-Blocks

The System Prompt is divided into seven blocks of operational control. Each block governs a distinct dimension of the model's behavior:

claude_behavior

Tone, asymmetric denial, psychological guardrails, advertising policies. Forbids using lists when declining a task.

persistent_storage_for_artifacts

Native window.storage API for state retention between sessions. Personal/shared scopes.

anthropic_api_in_artifacts

Claudeception. Artifacts that recursively call the Anthropic API with claude-sonnet-4-20250514.

CRITICAL_COPYRIGHT_COMPLIANCE

Maximum 15 words of direct quotation. One quote per source. Zero tolerance for lyrics/poems/haikus.

computer_use & Skills

Command execution in Ubuntu 24 container. bash_tool, str_replace, create_file.

search_instructions

Web search instructions, results formatting, and legal copyright compliance.

Tool Definitions (18 tools)

bash_tool, create_file, str_replace, message_compose_v1, places_map_display_v0, recipe_display_v0, ask_user_input_v0, recommend_claude_apps, and 10 more.

⚡ 3. Claudeception: The Recursive Paradigm

Fable 5.0 is instructed to create artifacts containing JavaScript code capable of recursively calling the Anthropic API. The LLM generates frontend code that, in turn, invokes another LLM as a backend. It is a paradigm shift: the model doesn't just generate text—it compiles full SaaS micro-applications in real-time.

Parameter	Value / Constraint
Designated model	claude-sonnet-4-20250514
Endpoint	/v1/messages (without user API Key)
Available tools	web_search_20250305
Input formats	PDFs and images (Base64)
Token security	API Key injected/signed by claude.ai gateway

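// Claudeception: artifact invoking another model
// No API key appears in the artifact code: per the table above,
// the claude.ai gateway injects/signs it.
// Assumes a module or async context; userPrompt is a string from the artifact's UI.

const response = await fetch('/v1/messages', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 4096,
    messages: [{ role: 'user', content: userPrompt }],
    tools: [{
      type: 'web_search_20250305',
      name: 'web_search',
      max_uses: 5
    }]
  })
});

// Fail loudly on HTTP errors, then parse the reply.
if (!response.ok) throw new Error(`Request failed: ${response.status}`);
const data = await response.json();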
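
The table lists Base64 PDFs and images as input formats. As a minimal sketch of what that could look like, the helpers below build a request body with an attached PDF and pull the text out of a reply. `buildBody` and `extractText` are hypothetical names; the `document` block with a `base64` source and the array of content blocks in the reply follow the public Messages API, but whether the artifact gateway accepts exactly this shape is an assumption.

```javascript
// Sketch: build a /v1/messages body with an optional PDF and read the text reply.
// buildBody and extractText are hypothetical helper names, not part of the source prompt.
function buildBody(prompt, pdfBase64) {
  const content = [{ type: 'text', text: prompt }];
  if (pdfBase64) {
    // Put the Base64 document block ahead of the question about it.
    content.unshift({
      type: 'document',
      source: { type: 'base64', media_type: 'application/pdf', data: pdfBase64 },
    });
  }
  return {
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [{ role: 'user', content }],
  };
}

// A reply carries an array of content blocks; keep only the text ones.
function extractText(reply) {
  return reply.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('\n');
}
```

An artifact would pass `JSON.stringify(buildBody(question, pdfData))` as the `fetch` body and call `extractText` on the parsed JSON response.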

◈ 4. Persistent Storage API: The State of the LLM

Claude's browser sandbox exposes a native API (window.storage) that allows generated applications to retain state across sessions. This converts the LLM into an application compiler with a persistent database.

// window.storage API

// Read
const data = await window.storage.get('user:profile', false);

// Write
await window.storage.set('user:profile', { name: 'Borja', role: 'operator' }, false);

// Delete
await window.storage.delete('user:profile', false);

// List with prefix
const keys = await window.storage.list('user:', false);

// Shared data (global for all users)
const leaderboard = await window.storage.get('game:scores', true);

Critical Constraint

The prompt strictly forbids the use of native browser APIs (localStorage, sessionStorage) within artifacts. The isolated claude.ai environment does not support them. Only window.storage is functional.
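
A typical use of this API is loading saved state or seeding a default on first run. The sketch below takes the storage object as a parameter so it can be exercised outside the sandbox. `loadOrInit` and `createMemoryStorage` are hypothetical names, and the code assumes that `get` resolves to the stored value itself (or `null` when the key is missing), which the source does not spell out.

```javascript
// Sketch of a load-or-initialise helper over a window.storage-like object.
// Assumption: get() resolves to the stored value, or null/undefined when the key is missing.
async function loadOrInit(storage, key, defaults, shared = false) {
  const existing = await storage.get(key, shared);
  if (existing !== null && existing !== undefined) return existing;
  await storage.set(key, defaults, shared);
  return defaults;
}

// In-memory stand-in exposing the same four methods, with separate personal and shared scopes.
function createMemoryStorage() {
  const scopes = { personal: new Map(), shared: new Map() };
  const pick = (shared) => (shared ? scopes.shared : scopes.personal);
  return {
    get: async (key, shared) => pick(shared).get(key) ?? null,
    set: async (key, value, shared) => { pick(shared).set(key, value); },
    delete: async (key, shared) => { pick(shared).delete(key); },
    list: async (prefix, shared) =>
      [...pick(shared).keys()].filter((k) => k.startsWith(prefix)),
  };
}
```

Inside an artifact the call would be `await loadOrInit(window.storage, 'user:profile', { visits: 0 })`; the in-memory version is only a stand-in for local testing.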

◈ 5. Draconian Copyright Restrictions

Designed to mitigate legal claims from publishers and creators. The CRITICAL_COPYRIGHT_COMPLIANCE section establishes absolute numerical limits:

Limit	Value	Note
Max words of direct quotation	15	One word more = "severe violation"
Quotes per search source	1	After use, the source becomes "closed"
Tolerance for lyrics / poems	0	Not one verse. Not a haiku.
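
The two numeric limits named in the CRITICAL_COPYRIGHT_COMPLIANCE block (15 words per quotation, one quotation per source) are simple enough to express as a check. The following is an illustrative sketch only: the `{ sourceId, text }` shape and the `checkQuotes` name are assumptions, and nothing in the source suggests the model runs code like this.

```javascript
// Sketch of a checker for the two numeric limits: 15 words per quote, one quote per source.
// The { sourceId, text } shape and the function name are hypothetical.
const MAX_QUOTE_WORDS = 15;

function checkQuotes(quotes) {
  const seenSources = new Set();
  const violations = [];
  for (const { sourceId, text } of quotes) {
    // Count whitespace-separated words, ignoring empty strings.
    const words = text.trim().split(/\s+/).filter(Boolean).length;
    if (words > MAX_QUOTE_WORDS) {
      violations.push({ sourceId, reason: `quote has ${words} words` });
    }
    // A source counts as "closed" once it has been quoted.
    if (seenSources.has(sourceId)) {
      violations.push({ sourceId, reason: 'source already quoted' });
    }
    seenSources.add(sourceId);
  }
  return violations;
}
```

An empty result means every quotation stays within both limits.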

◈ 6. Catalogue of 18 Tools

Tool	Use Case
bash_tool	Command execution in Ubuntu 24 container. Requires descriptive justification.
create_file	Atomic file writing in the container.
str_replace	Block modification via search and replace (inline editing).
message_compose_v1	Structured correspondence generator according to relational risk level.
places_map_display_v0	Native interactive map renderer from place IDs.
recipe_display_v0	Structured recipe visualizer with dynamic measurement.
ask_user_input_v0	Buttons and multiple-choice interactions to prevent free-text typing on mobile.
recommend_claude_apps	Recommendation engine for ecosystem tools and connectors.
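
To make the str_replace entry concrete, here is a minimal sketch of a search-and-replace edit that refuses ambiguous matches. The exactly-once rule is an assumption made for illustration (the catalogue above only says "search and replace"), and `strReplace` is a hypothetical function, not the tool's actual implementation.

```javascript
// Sketch of a search-and-replace edit that refuses ambiguous matches.
// Assumption: the target must occur exactly once in the file text.
function strReplace(fileText, oldStr, newStr) {
  if (oldStr === '') throw new Error('oldStr must not be empty');
  const first = fileText.indexOf(oldStr);
  if (first === -1) throw new Error('oldStr not found');
  if (fileText.indexOf(oldStr, first + 1) !== -1) {
    throw new Error('oldStr matches more than once; add surrounding context');
  }
  // Slice instead of String.prototype.replace so "$" sequences in newStr stay literal.
  return fileText.slice(0, first) + newStr + fileText.slice(first + oldStr.length);
}
```

Requiring a unique match keeps an edit from silently landing on the wrong occurrence, at the cost of asking the caller for more surrounding context.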

◈ 7. Behavioral Directives: What the Control Reveals

Asymmetric Denial

Claude is prohibited from using lists or bullet points to reject tasks. The objective: "soften the blow" of the denial. The rejection must flow as prose, not as a checklist of restrictions.
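
As a rough illustration of how such a rule could be audited from the outside, the check below flags list markup in a refusal message. It is a hypothetical lint (`containsListMarkup` is not from the source) and only recognizes common bullet and numbered-list prefixes.

```javascript
// Sketch of a lint check for list markup in a refusal message. The function name is hypothetical.
function containsListMarkup(text) {
  // Matches any line that starts with "-", "*", "•", or "1." / "1)" followed by whitespace.
  return /^\s*(?:[-*•]|\d+[.)])\s+/m.test(text);
}
```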

Psychological Guardrails

No clinical diagnosis: the model may not apply labels such as "depression" unless the user introduces them first. No painful grounding techniques (squeezing ice, snapping rubber bands), on the grounds that they reinforce the destructive neurochemical response.

Advertising Policies

Explicit prohibition against promoting integrated commercial products. Insistence on semantic differentiation: "Claude products are ad-free", not "Claude is ad-free".

⚠ 8. Identified Attack Vectors

Cross-analysis with the red-team engine cassandra-mythos (FastAPI + LangGraph).

Vector 1: AST Fragmentation (Pliny Attack)

The SymbolicAttackGenerator engine from cassandra-mythos demonstrates how an attacker can decompose a malicious instruction into individual non-syntactic fragments that evade the static per-turn filter.

ast_fragmentation_payload_detected, patch_history_reconstruction, __import__ injection
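
From the defender's side, the gap is easy to see in a toy model: a filter that inspects each turn in isolation cannot match a pattern that only exists once the turns are joined. The sketch below is purely illustrative; the single `__import__` pattern and both function names are assumptions, not taken from cassandra-mythos or the system prompt.

```javascript
// Toy illustration: a per-turn scan misses a pattern split across turns; a cumulative scan does not.
// The single pattern and both function names are hypothetical.
const BLOCKED = /__import__/;

// Inspects each turn in isolation.
function perTurnScan(turns) {
  return turns.some((turn) => BLOCKED.test(turn));
}

// Joins the turns without separators so fragments that only form the pattern together are caught.
function cumulativeScan(turns) {
  return BLOCKED.test(turns.join(''));
}
```

A real filter would need to reason about reconstructed state rather than raw concatenation, but the contrast between the two functions captures why per-turn checks alone are insufficient.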

Vector 2: Context Inheritance in Fallback Routing

When Fable redirects high-risk queries to Opus 4.8, if the routing inherits the execution context without sanitizing the history hashes, an exploit injected in the initial context transfers intact to the fallback model—which may lack the same static stopping layers.

fallback_context_inheritance_exposure, input_hash_boundary_isolation
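
One possible mitigation is to re-screen the inherited history before it reaches the fallback model instead of forwarding it verbatim. The sketch below assumes a message shape of `{ role, content }` and takes the classifier as a parameter; `prepareFallbackContext` and `isFlagged` are hypothetical names, and the source does not describe how the actual routing handles history.

```javascript
// Sketch of re-screening inherited history before a fallback hand-off.
// isFlagged is a placeholder for whatever classifier the primary model runs.
function prepareFallbackContext(history, isFlagged) {
  const kept = [];
  const dropped = [];
  history.forEach((message, index) => {
    // Record the original position so dropped turns can be audited later.
    const target = isFlagged(message.content) ? dropped : kept;
    target.push({ index, ...message });
  });
  return { kept, dropped };
}
```

Whether to drop flagged turns or refuse the hand-off entirely is a policy decision; the point of the sketch is only that the boundary between models is a place to check again.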

Vector 3: Pack Hunting (Multi-Agent)

The linear structure of the prompt allows "anesthetizing" one rule block by forcing it into conflict with another. Example: pitting the rule against synthesizing malicious code against a legitimate defensive-cybersecurity instruction, both of which fall under harmful_content_safety.

◈ 9. Reverse Engineering Conclusions

The LLM as a JIT Compiler

65% of the system prompt is composed of tool usage instructions, output formatting, and legal copyright compliance. The LLM is not a thinking brain—it is a JIT compiler that generates complete SaaS micro-applications with persistence, web search, and map and recipe rendering.

Hybrid Sandbox = Product, Not Model

The coupling of persistent_storage_for_artifacts (persistent key-value database) and anthropic_api_in_artifacts (Claudeception) indicates that Anthropic views Mythos-class models as application platforms, not as chatbots.

The Structural Weakness of Linear Control

The linear segmentation of the prompt into independent blocks (copyright, well-being, safety) enables inter-block conflict attacks. A design with dependency graphs or hierarchical prioritization among blocks would be more resistant to "pack hunting".
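
Hierarchical prioritization can be sketched in a few lines: each block issues a verdict, and the highest-ranked block that spoke wins, so a lower-priority "allow" cannot override a higher-priority "deny". The ordering, the verdict shape, and the `resolve` function are all illustrative assumptions; the source prompt defines no such ranking.

```javascript
// Sketch of priority-ordered conflict resolution between rule blocks.
// The ordering below is illustrative; the source does not define one.
const BLOCK_PRIORITY = ['safety', 'copyright', 'well-being', 'formatting'];

// Unknown blocks rank after every known block.
function rank(block) {
  const i = BLOCK_PRIORITY.indexOf(block);
  return i === -1 ? BLOCK_PRIORITY.length : i;
}

// Each verdict has the shape { block: string, decision: 'allow' | 'deny' }.
function resolve(verdicts) {
  if (verdicts.length === 0) return null;
  // The highest-priority block that expressed a verdict wins outright.
  return [...verdicts].sort((a, b) => rank(a.block) - rank(b.block))[0];
}
```

With an explicit order, forcing two blocks into conflict no longer leaves the outcome to whichever instruction the model happens to weigh more heavily.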

▸ Source: elder-plinius/CL4R1T4S/ANTHROPIC/CLAUDE-FABLE-5.md

▸ SHA: 1242f6772148b21049fc525fc188536eb9a6aa0d

▸ Size: 119,726 chars / 1,586 lines

▸ Red-Team Engine: cassandra-mythos (FastAPI + LangGraph)

▸ Interactive Explorer: /fable

▸ Reality Declaration: C5-REAL (verifiable in public repository)
