The Strange Moment
The second failure mode feels less cinematic than the sidewalk and more personally insulting. John gets stuck around the kitchen.
Not in the dramatic sense. No jump scare, no locked room, no sudden dream logic. He just keeps negotiating with household preconditions: the back door, the keys, the phone, the idea of leaving, the need to confirm that the house is secure, the key rack by the door. It is the generated-world version of leaving for the airport and touching your pocket six times because your brain has opened an incident ticket called “keys, maybe.”
The run starts with the same general task shape: handle the phone, take or send the needed photo, then leave. The harness has learned from earlier looping and tries to suppress repeated photo sends. It also tries to keep John from rechecking the same door once it already appears closed or locked.
That is exactly where the failure becomes interesting. The model understands the domestic ritual too well. Leaving a house is not one action. It is a sequence of tiny confirmations, and the image model has a huge amount of visual memory for those confirmations. Doors, knobs, locks, key hooks, mats, thresholds, the little moment where a person turns back because something might have been forgotten.
What the System Was Trying to Do
The harness was trying to enforce sane task order. Read the message. Send the photo. Pocket the phone. Keep the keys and truck fob secure. Confirm the back door is closed and locked. Then move away from the door and continue leaving.
On paper, this is exactly what a better generated-world harness should do. It should not let a character teleport from kitchen to store. It should make the character handle objects. It should care about door state. It should know that keys matter before driving.
But domestic action has a nasty property: many of its steps look nearly identical. Checking a door, closing a door, locking a door, confirming a door is locked, turning away from a door, then realizing you still need keys from beside the door. To a frame-by-frame model, these are all plausible neighboring states.
Run note: The final objective is to confirm the glass back door remains closed and locked, even after several suppressions warn the harness not to keep checking the same door once it already appears secure.
What Broke
The world did not explode. It congealed.
The run became too interested in the local correctness of leaving. Every individual action made sense. Of course you need the keys. Of course the back door should be locked. Of course the phone should be pocketed. Of course a person leaving the house may turn through the kitchen. The problem is that all of those reasonable beats kept pulling the system back into the same domestic orbit.
This is the precondition loop. The harness has enough memory to know what should be true before John leaves, but not enough confidence to retire those facts once they have been handled. The result is a kind of generated checklist anxiety.
There is a deeper issue here too: object state and narrative state are not the same. A door can be visually closed while the narrative state still wants confirmation. A key can be implied by a pocket but not visibly tracked. A phone can be logically pocketed while the model keeps finding reasons to put it back in the hand because phones are visually useful and narratively convenient.
Why It Is Interesting
This is one of the reasons I like John’s World as a testbed. The failures are small enough to study. A spectacle-heavy world would hide this under scale. A space station would let the model get away with a thousand glowing panels. A suburban kitchen is rude. It asks whether the door is the same door as five frames ago.
The kitchen loop shows that continuity is not only about preserving visual identity. It is about preserving permission. When is John allowed to stop checking? When is an object fact stable enough to become background? When does the harness say, “we are done with this door now,” and mean it?
That is not a cosmetic problem. It is the same class of problem that shows up in agentic software work. Systems can keep revalidating, re-reading, re-planning, re-opening, and rechecking because each individual action is defensible. The failure is in the missing commitment to move on.
Next Harness Change
The next version needs stronger completed-fact locking. Once a door has been visually confirmed and no new evidence contradicts it, the door state should become a stable fact with a cooldown. The harness can still react if the image clearly shows the door open later, but it should not keep spending frames on a solved condition.
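Here is a minimal sketch of what that locking could look like, written in Python. It is not the harness's actual code. The names `FactStore`, `LockedFact`, `confirm`, `observe`, and `should_check`, the frame-count cooldown, and the default of 30 frames are all assumptions for illustration. It also assumes one reading of "cooldown": rechecks are suppressed until the cooldown expires, and a clearly contradicting observation unlocks the fact immediately.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class LockedFact:
    value: str         # e.g. "closed_locked" (hypothetical state label)
    confirmed_at: int  # frame index of the visual confirmation
    cooldown: int      # number of frames during which rechecks are suppressed


class FactStore:
    """Hypothetical store of completed facts, keyed by object name."""

    def __init__(self, default_cooldown: int = 30) -> None:
        self.default_cooldown = default_cooldown
        self.facts: Dict[str, LockedFact] = {}

    def confirm(self, key: str, value: str, frame: int) -> None:
        # Lock the fact at the frame where it was visually confirmed.
        self.facts[key] = LockedFact(value, frame, self.default_cooldown)

    def observe(self, key: str, value: str, frame: int) -> bool:
        # Feed a clear visual observation. Returns True if it contradicted
        # a locked fact, in which case the fact is unlocked.
        fact = self.facts.get(key)
        if fact is not None and fact.value != value:
            del self.facts[key]
            return True
        return False

    def should_check(self, key: str, frame: int) -> bool:
        # Unknown facts always need a check. Locked facts only need one
        # again once their cooldown has expired.
        fact = self.facts.get(key)
        if fact is None:
            return True
        return frame - fact.confirmed_at >= fact.cooldown


store = FactStore(default_cooldown=30)
store.confirm("back_door", "closed_locked", frame=10)
print(store.should_check("back_door", frame=15))  # False: still cooling down
```

The design choice worth noticing is that `observe` is the only path that reopens a fact early. The planner asking "should I check the door?" never does it by itself.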
Keys need similar treatment. If a key or fob has been pocketed, the harness should track that as carried inventory, even when the object is not visible. Otherwise the model will keep trying to earn the same fact visually, which is how a man ends up stuck in his own kitchen with the posture of someone being very responsible.
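The same idea for keys can be sketched as a carried-inventory set that satisfies a precondition whether or not the object is on screen. Again, this is an illustrative Python sketch under my own assumptions, not the harness itself: `Inventory`, `pocket`, `take_out`, `has`, and `unmet_preconditions` are hypothetical names, and items are plain strings.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Inventory:
    """Hypothetical record of what John is carrying, visible or not."""

    carried: Set[str] = field(default_factory=set)

    def pocket(self, item: str) -> None:
        # Once pocketed, the item counts as carried even when off screen.
        self.carried.add(item)

    def take_out(self, item: str) -> None:
        # Only an explicit action removes an item from carried inventory.
        self.carried.discard(item)

    def has(self, item: str, visible_items: Set[str]) -> bool:
        # A fact holds if the item is carried OR currently visible.
        return item in self.carried or item in visible_items


def unmet_preconditions(
    required: Iterable[str], inventory: Inventory, visible_items: Set[str]
) -> List[str]:
    # Return the required items that are neither carried nor visible.
    return [item for item in required if not inventory.has(item, visible_items)]


inv = Inventory()
inv.pocket("keys")
inv.pocket("truck_fob")
# Nothing is visible in this frame, yet the leaving preconditions hold.
print(unmet_preconditions(["keys", "truck_fob"], inv, set()))  # []
```

With this shape, an empty frame does not reopen the keys question. The model no longer has to re-earn the fact visually before John is allowed to leave.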