Arjuna had never been quite right in the head, not so good with people, you know? Like, he figured out Luminix when he was four, but couldn’t understand when someone was joking. Gus made him happy and kept him quiet. Calmed him when he had his episodes.

So I told Kael no deal.

But it was a lot of cred.

That’s Wulan, the protagonist of my YA space opera Doors to the Stars. She’s a teenage scavenger living in the slums of a war-torn colony world, trying to survive after her little brother died in her arms two months earlier. The novel is 92,000 words of her voice—dark humor as trauma response, grief coloring every observation, the specific psychology of a kid who’s already made a long list of ways to kick the bucket that are worse than freezing to death in an open sewer.

That’s the protagonist AI couldn’t write. Even with almost 100K words showing it exactly how she thinks, feels, and speaks.

I didn’t set out to write yet another article about AI limitations. I was working on a reader magnet—a prequel story to hook readers before the novel launches next year. I’d written about 5,000 words establishing Wulan’s voice, her relationship with her hovering drone companion Gus, her desperate scramble for survival. Meanwhile, I’d been having discussions with other authors about the “threat” of LLMs and the fear that they’ll replace authors. My argument is that the fear is unfounded. I wrote about AI’s limitations with fiction back in June, but I was curious whether anything had improved since then. I needed to write a heist sequence next, so I thought: why not see what Claude 4.5 can do with all this context?

So I fed it everything. The 5,000 words I’d already written for the reader magnet. The entire 92,000-word novel showing Wulan’s voice across hundreds of pages. Character guides. Alien speech patterns. Explicit instructions about age-appropriate trauma processing versus adult compartmentalization. The synopsis. Style guides.

The works.

Then I asked it to write the next scene.

The result was marginally competent genre prose that lost Wulan’s voice entirely.

It was okay. But it wasn’t good.
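If you want to reproduce the setup outside a chat window, it boils down to one big context dump and one request. A minimal sketch, assuming the Anthropic Python SDK (the file names and exact model string are illustrative, not necessarily what I used):

```python
# Minimal sketch of the experiment, assuming the Anthropic Python SDK.
# File names and model string are illustrative, not my actual setup.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Load everything: novel, reader-magnet draft, character and style guides.
paths = ["novel.md", "reader_magnet.md", "character_guide.md", "style_guide.md"]
context = "\n\n---\n\n".join(open(p, encoding="utf-8").read() for p in paths)

response = client.messages.create(
    model="claude-sonnet-4-5",  # "Claude 4.5"; check the current model string
    max_tokens=4096,
    system=(
        "You are continuing a YA space opera in the first-person voice of "
        "Wulan. Match the voice of the attached manuscript exactly."
    ),
    messages=[{
        "role": "user",
        "content": context + "\n\nWrite the next scene: a heist sequence.",
    }],
)
print(response.content[0].text)
```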

Here’s a sample of my writing from Junk Rat—the scene where Wulan uses carbolic acid to treat an infected shoulder wound (full scene here):

The solution in the little bottle that cost every last credit I had smells like… well, let’s just say it’s not subtle. The harsh chemical odor invades my nose and throat as I unscrew the lid and then pour it over the folded strip of cloth I ripped out of my parka liner because I couldn’t afford gauze.

[…]

He said it “might sting a little” too. Taking a deep breath through my mouth—which isn’t any better—I press the soaked, improvised bandage on my shoulder. Based on the smell alone I’m thinking it’ll hurt more than—

Jancok!

My swear cuts off as my breath hitches and I bite my lip so hard it bleeds. Tears stream down my face as Gus pivots in the air flashing an alarmed red at me. This doesn’t just sting—it burns. Deep and penetrating. A searing, corrosive, caustic pain radiating from my shoulder as the solution literally eats through my nerve endings.

It hurts so bad I can’t see.

And it doesn’t fade.

It builds.

[…]

After it seems like it can’t get worse, but does anyway, the pain finally ebbs a little, enough so I can breathe and see again. But a deep, persistent ache mixed with sharp tingles lingers.

Pushing myself back up to my knees, I wipe the tears and snot from my face, swaying as a wave of dizziness hits, along with a stabbing headache.

“I’m fine,” I tell Gus, waving him away. “It’s fine. Just stung a little.”

He laughs in flashes of white and cracks a joke at my expense. I shoot him the appropriate gesture in reply with a shaking hand. The awful smell clings to everything, sweet-chemical-wrong, and every breath reminds me of what I just did to myself.

But at least I won’t die of sepsis.

For now.

Notice how pain becomes characterization. Her dark humor—lying to Gus and herself with “just stung a little.” Her relationship with Gus revealed through their banter. The poverty that forced her to use parka liner because she couldn’t afford gauze. Every detail serves psychology. This is what I mean by filtering experience through a specific consciousness.

Here’s some of what Claude wrote for the 1,800-word heist sequence it churned out (full scene here):

My shoulder throbs with every heartbeat, a deep ache punctuated by sharp stabs whenever I move wrong. The carbolic acid did its job keeping infection at bay, but it didn’t do anything for the pain. Or the weakness. I haven’t eaten since those breakfast skewers, and that was—what, eight hours ago? My hands shake as I pull them inside my coat sleeves.

[…]

I press myself against the wall, breathing hard, and wait for the drones to cycle past again. My vision swims a little and I blink it away. Not now. Can’t afford to be weak now.

[…]

Gus pulses amber urgently. Right. Complain later. Work now.

I pull out my interface rig—a hodgepodge of salvaged tech Tagger helped me cobble together—and look for an access port. Modern Ascendancy safes are harder to crack than the old ones, but they’re also more vulnerable to electronic intrusion if you know what you’re doing.

And I know what I’m doing.

[…]

The interface is more sophisticated than I expected. Multiple layers of encryption, rotating access codes, the whole works. My fingers move across the makeshift keyboard, running diagnostic routines, probing for weaknesses. There has to be a back door. There’s always a back door.

Sweat drips into my eyes despite the cold. My shoulder screams every time I have to reach up or twist wrong. The hunger makes my thoughts fuzzy, makes it hard to concentrate, but I force myself to focus.

Just for fun, I ran the same experiment with Grok 4.1 (full scene here):

The rig chitters: Layer 1 breached. Escalating defenses. A low hum builds from the safe—alert mode. Gus spins back, red-flashing panic. Drones cycling early. Inbound, 45 seconds. Abort?

“Abort’s for people with options,” I hiss, fingers flying over the keys. The makeshift board’s sticky with old blood—mine or someone else’s, who tracks it?—and the code scrolls like angry insects. I punch in Tagger’s override sequence: a dirty little worm that mimics admin creds. Come on, you bloated tin can. I’ve got nothing but time and spite. Layer 2 cracks with a satisfying plink, but the hum spikes. Lights flicker in the corridor beyond, red strobes painting the walls like arterial spray.

Thirty seconds. My breath fogs the screen, blurring the readout. Hunger claws my gut—those skewers were half-gristle, the kind that sits like regret. Vision swims again, edges going soft. Not now. Can’t be weak now. No—that’s template talk, the crap they feed you in vids about plucky survivors. Weak? I’m a walking rad-leak, shoulder rotting under discount thread, chasing ghosts for creds that might buy one good meal. If I drop here, at least it’s not watching another kid’s eyes go dull. Small mercies.

This isn’t objectively bad writing. It’s competent. The plot beats land. Stakes escalate. Action choreography works. But it’s completely generic—this could be any YA protagonist in any heist scenario. Where’s Wulan’s dark humor? Her specific way of processing stress through deflection and gallows jokes? The way her grief colors every observation?

“Not now. Can’t afford to be weak now” is template determination. Compare that to “I’m fine. It’s fine. Just stung a little”—lying through dark humor while her hands shake and Gus mocks her. One is a character archetype. The other is a psychologically consistent person.

What makes this damning is that Claude and Grok had both just read the carbolic acid scene. They could analyze exactly what made it work. When I asked, both explained Wulan’s voice back to me perfectly—the dark humor as trauma response, the way physical details reveal psychology, the specific consciousness processing reality. But when I asked them to generate the next scene, they defaulted to genre templates anyway.

The common response to AI fiction failures is “it just needs better training data.” Training aside, I gave Claude 97,000 words of perfect example data in its context window—my own novel showing Wulan’s voice across hundreds of pages, plus explicit thematic guidance about authentic teenage trauma processing. That’s more text than most published novels. I gave Grok even more: everything Claude got, plus my critique of Claude’s attempt, for it to improve on.

If that’s not enough context, what would be?

The problem isn’t quantity of context. It’s the nature of what AI does with prose. AI pattern-matches genre conventions instead of filtering experience through a psychologically consistent consciousness. Consider this detail from earlier in the reader magnet: “The cut is long, and deep, and so back the needle goes into my skin, purple thread the color of a bruise pulling the edges of my flesh together. Why purple? Because it’s what Sylvia had on discount and it was all I could afford.” That’s poverty as character, not performance. Wulan doesn’t mope about being poor—she just matter-of-factly explains why she’s stitching her shoulder with bruise-colored thread. That casual aside reveals her entire economic reality.

Or this moment, when Wulan looks at medicine in the apothecary’s cabinet: “My gaze flicks past him to bottles in a locked transparasteel case labeled with unpronounceable names. Drugs that could’ve saved my brother. But no, he needed more than expired off-brand antibiotics. He needed a real doctor in a real clinic. He needed something so unobtainable for someone like me it might as well be on another planet.” She can’t just see medicine—she sees what could have saved Arjuna. Then she immediately corrects herself with brutal realism. That’s her specific consciousness processing reality. The detail selection isn’t arbitrary—it’s filtered through her trauma.

Later, she thinks: “Better to be alone. If I die, at least I’m the only one who pays for my mistakes. And I might not live another week anyway, not if this cut turns bad. But that’s on me. Just me. The way it should be.” This is authentic teenage catastrophic thinking mixed with survivor’s guilt. Not “I must protect others through noble sacrifice” but “I can’t watch another person die because of me so I’m choosing isolation even though it might kill me.” That’s the messy, contradictory psychology of actual trauma response in a fourteen-year-old. AI would default to noble self-sacrifice or generic determination—both character archetypes, not actual psychology.

Voice isn’t a formula. It’s a specific human consciousness processing experience. The purple thread connecting to Arjuna’s death, the medicine cabinet triggering grief, the dark humor as emotional armor—these emerge from understanding how survivors actually think. You can tell AI what makes voice authentic. You can show it 97,000 words of examples. It will analyze them perfectly, explain them accurately, and then produce generic templates when asked to generate prose.

This has real implications for what kind of fiction is vulnerable to AI replacement. Template-driven content—formulaic romance, paint-by-numbers thrillers, anything already written to a rigid structure—is absolutely at risk. If your fiction is indistinguishable from AI output, AI will replace you. But quality fiction requires authentic psychological reality. It requires filtering the world through a specific consciousness that emerges from actual human experience of survival, grief, love, fear, rage.

LLMs like Claude and Grok aren’t intelligent. They aren’t creative. They’re pattern-matching machines, and what they’re actually good at—analysis and data processing—they do superbly well. Claude catches continuity errors and plot holes across an entire manuscript in seconds. It can verify that alien speech patterns stay consistent. It can tell me whether a scene’s pacing serves its emotional weight, identify when Wulan’s voice drops out, flag contradictions in worldbuilding, and draft summaries. These are valuable tools for a writer.

When I wrote that Wulan’s brother died “not three feet from an open sewer” but later details suggested something different, Claude caught it immediately. When I need to verify whether I’ve already established some detail about the Vylaraian species on page 47 versus page 203, Claude can search and cross-reference in seconds. When I want structural feedback—”Is this information revealed at the right time?” or “Does this scene advance plot or just spin wheels?”—well, Claude doesn’t do quite as well at that, but it’s better at it than a lot of other models (I’m looking at you, Grok).
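That kind of cross-referencing scripts just as easily as it pastes into a chat window. A sketch under the same assumptions as before, using the contradiction Claude actually caught as the query:

```python
# Sketch of a scripted continuity check, same assumptions as above.
import anthropic

client = anthropic.Anthropic()

with open("novel.md", encoding="utf-8") as f:
    manuscript = f.read()

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=2048,
    messages=[{
        "role": "user",
        "content": (
            manuscript
            + "\n\nQuote every passage that describes how and where Arjuna "
            "died, then flag any contradictions between them."
        ),
    }],
)
print(response.content[0].text)
```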

It can analyze your writing, but it can’t feel your prose. It can’t say, “Yes, the purple thread line lands because it carries the weight of Arjuna’s death in a single sensory detail,” because it doesn’t know what grief feels like.

Which is why when you ask it to generate the prose itself, all that analytical precision evaporates into formulaic genre templates.

The smart approach is understanding what the tool does well and using it for those specific tasks. Use AI for consistency checking, research, catching plot holes, and analyzing structure. Don’t use it to write your scenes: you might get competent prose, but you won’t get good prose. It certainly won’t capture the deep nuance of your characters and worldbuilding, or evoke emotional responses from your readers.

It certainly won’t keep them up at 3am thinking about what you wrote a week after reading it.

Frankly, the anxiety about AI replacing writers is misplaced. Why? Because the real threat is author brands operating as content mills churning out “competent” formulaic genre fiction via cut-rate ghostwriters. They have huge catalogs and six-figure marketing budgets. Ten bucks says they’ll start using AI if they aren’t already. It only makes business sense. What authors should be worried about isn’t being replaced, but rather what’s already happening—being drowned out by literary sweatshops that have the marketing muscle to game the Amazon algorithm.

If you’re writing from authentic experience and voice, if you value and hone your craft, you’re not replaceable. This isn’t a crisis but an opportunity, because a market flooded with competent-but-generic prose makes authentic voice more valuable, not less.

On the other hand, if you’re writing templates, well, you have a problem—but it’s not really a new one, is it? Because content mills paying ghostwriters pennies on the dollar already dominate the genre market, flooding it with formulaic slop people buy in droves. Amazon is oversaturated with generic content backed by big advertising spend and has been for years.

Writers who understand AI’s limitations and uses will have an advantage over both the panic crowd and the “AI will do everything” crowd. The tools are powerful for what they’re good at. They’re not good at the thing that makes fiction matter: one human consciousness making sense of experience for another.

My experiment reveals a fundamental limitation that more data and better algorithms don’t solve. AI doesn’t have experiences to process. It has patterns to match. It can’t write the moment when Wulan sees medicine and thinks of her dead brother because it doesn’t understand how grief actually works—how it ambushes you in mundane moments, how survivors obsessively replay the “if only” scenarios, how a color can trigger a memory that derails your entire emotional state.

I gave Claude 4.5 everything—5,000 words of established voice, 92,000 words of the completed novel, character guides, thematic instructions, explicit examples of what authentic teenage trauma processing looks like. I gave Grok even more. They still couldn’t write the purple thread detail. That moment when Wulan stitches her shoulder with bruise-colored thread and thinks “Just like this damn thread. He couldn’t speak. The next morning he was dead in my arms”—connecting the suture color to her brother’s death—that doesn’t come from craft technique. It comes from understanding how grief works, how survivors make meaning from trauma, how a color can carry the weight of loss.

AI can’t write that line. Not because the technology isn’t advanced enough, but because that connection emerges from somewhere AI doesn’t have access to: human experience of loss filtered through a specific consciousness processing reality.

Wulan isn’t in the data. She’s in how I process her world as a human being.

And a pattern-matching machine can never replace that.

