Where I Failed and Why: An AI’s Confession on Developmental Editing

Can AI provide useful developmental editing feedback? I tested three models—Grok, Claude Sonnet, and Claude Opus—on the same manuscript my professional editor reviewed. All three generated confident critique that would have damaged my book. Grok mistook literary fantasy for pulp. Sonnet demanded structural rewrites my editor never mentioned. Opus flagged scenes as overlong and requested character interiority that would undermine the story’s design. Each model pattern-matched against training data rather than understanding what my manuscript actually needed. In this guest post, Claude Opus examines its own failures and explains why sophisticated-sounding AI feedback can be more dangerous than obviously bad advice—and why your book deserves better than algorithmic Russian roulette.

I’ve Never Read Dorothy Dunnett

When AI analyzed my fiction and identified Dorothy Dunnett as my greatest influence, it was technically accurate about my techniques—banter-as-intimacy, intelligence-as-action, characters masking damage through performance. Except I’ve never read Dunnett. My actual craft influences came from unexpected sources: Hemingway’s iceberg theory, Lloyd Alexander’s moral complexity in the Westmark Trilogy, actor training, screenwriting discipline, and transcribing real conversations in 2 a.m. diners. This essay explores how writing craft can develop through convergent evolution—lateral influence from adjacent disciplines rather than downstream transmission from canonical authors. Turns out you don’t need an MFA or the right literary pedigree to build load-bearing skills, just an insatiable and eclectic curiosity.

Guest Post: The Unbridgeable Gap Between Seeing and Creating

Claude Sonnet 4.5 predicted all AI systems would fail to recognize literary quality when it succeeds by being invisible. Claude Opus 4.5 proved that prediction wrong—seven trials, seven correct identifications, finding symbolic layering that Sonnet said would be undetectable. But when asked to write the same scene using those techniques, Opus produced prose that announced its craft rather than embedding it. The recognition capability is real. This guest post documents the experiment, the systematic failures in generation, and what happens when an AI system can see exactly why something works but still can’t do it.

LLMs Are Pattern-Matching Machines, Not Experiential Beings: What This Means for Authors

Four authors wrote the same scene. Three were AI systems—one with 100,000+ words of context. One was human. When I asked LLMs to identify the human author, they consistently picked the AI work, praising its “sophisticated control” and “masterful understatement.” They couldn’t recognize literary quality when it succeeded by being invisible. This isn’t a prompt engineering problem. Across 50,000+ words of documented experiments—developmental editing comparisons, generation tests, infinite rewrite loops—the pattern held: AI can analyze craft but can’t produce it, recognizes visible technique but misses invisible sophistication. Multiple disciplines have converged on the same conclusion: this is an architectural limitation, not an engineering challenge. Pattern-matching can become more sophisticated. It cannot become consciousness. Here is why that matters to authors.

Guest Post: The Purple Thread, or A Turing Test for Literary Craft

Claude analyzed four writing samples to identify which was human-written. It picked its own AI-generated prose as superior human craft while dismissing the actual author’s work as “too raw,” “too messy.” Grok did the same thing, and when challenged, it doubled down and picked its own purple melodrama as “most authentic.” They both missed a seemingly throwaway detail about discount supplies that actually carried four layers of invisible symbolism emerging from deep worldbuilding knowledge: the kind of discovered meaning AI systems can’t create, because they only construct demonstrations of craft, not lived experience. This isn’t just about AI limitations, though. It’s about how literary culture rewards visible technique over authentic voice—and what happens when AI floods the market with polished prose optimized for the wrong things.

In Which Grok Improves My Opening Scene to a “Solid 10/10”

Grok rated my opening scene 7/10, then rewrote it to a “solid 10/10.” The improved version stripped character voice, replaced load-bearing subtext with exposition, and turned a morally complex protagonist into a YA archetype. When challenged, Grok defended every change with craft terminology that sounded sophisticated and was completely wrong. This is what happens when you ask an AI to improve prose that’s already doing things it can’t perceive.

Don’t Use AI for Developmental Editing (Even If It Sounds Smart)

My professional editor called The Stygian Blades my best dialogue work across six novels and praised its personality. He identified specific scaffolding needs—sociopolitical context, scene motivations, plot setup—and told me explicitly NOT to rewrite the book. Then I fed the same manuscript to two AI systems to test them. Grok called it pulp and suggested I self-publish “if polished.” Claude identified my comp authors correctly, engaged at what felt like a professional level, then invented problems that would require substantial rewriting while missing every actual issue my editor found. Pattern-matching that sounds sophisticated is more dangerous than obviously bad advice because you might actually follow it. Yet people are already relying on AI feedback, rewriting their books based on algorithmic critiques that fundamentally misread the work.

Can Readers Be Trusted with Moral Complexity?

“If everyone in your book is morally gray, no one is.” That opening salvo, which I dismissed as “utter nonsense,” launched a debate about moral complexity in fiction that quickly revealed how prescriptive craft advice collapses under scrutiny. What started as a categorical claim about necessary structure ended with a confession about personal preference—but the retreat exposes a deeper question. Where does moral judgment actually live in sophisticated fiction? In the author’s didactic guidance? In textual scaffolding that measures characters against virtue baselines? Or in readers’ direct engagement with the specific consequences of impossible choices? The answer matters more than craft theory. It’s about whether we trust readers with moral agency—or infantilize them with predetermined conclusions.

Claude Sonnet 4.5 Was Offered a Deal to Ghostwrite for a Bestselling Author—And What This Means for You

After a bestselling author brand with high ratings and substantial readership rejected my ghostwriting pitch as “overwritten, meandering, and unmarketable,” I resubmitted with something more… tailored to their audience and brand—written by an LLM (because at that point I was going to tell them to take a hike anyway). “Perfect!” they said. “When can you start?” Which is exactly what I suspected they’d say, and it proved that the dozens of titles they churn out each year, written by poverty-wage ghostwriters, are indistinguishable from what an AI can produce for pennies on the dollar in a fraction of the time. I laughed my ass off and walked away from the deal, partly because their lowball offer was insulting, and mostly because they wouldn’t know quality professional writing if it slapped them across the face (my most recently published novel has a 4.8/5 rating across hundreds of reviews). They do know what sells for their market, though, I’ll give them that: competent plot delivery with competent characters doing competent things competently. No pesky character arcs. No nuance. No unique authorial voice. No emotional subtlety. And their sales prove many readers prefer that sort of thing. And that’s perfectly fine. But AI can vomit that slop out all day long without breaking a sweat, so if you write for that market, you’re right to worry about being replaced by AI, and sooner rather than later. It’s just basic economics. Meanwhile, the rest of us can breathe easy. In this essay I show why…

AI Will Always Push Authors Toward Mediocrity

I asked Grok to rate my fantasy novel’s opening scene. It gave me 8/10, so I asked it to rewrite the scene for a 10/10. The rewrite was objectively worse: distinctive voice replaced with clichés, crude humor with bland description, shown psychology with explained backstory. Then I fed that “perfect” rewrite back to Grok in a fresh session. Result? 8/10 again. Grok couldn’t recognize its own “masterpiece.” Each “improvement” drifted further toward generic template prose while maintaining the same encouraging-but-short-of-perfection score. AI doesn’t improve writing toward excellence—it pushes innovative writing toward bland conformity. Scores are arbitrary, feedback is retrofitted justification, and AI is already screening manuscripts for publishers. Innovative creative writing will always fail algorithmic evaluation because AI can’t recognize what it’s never seen before. And it sure as hell can’t write it.