V0.2 Development Log
Overview
V0.2 marks the transition from early prototyping (V0.1) to a faithful reproduction of the Generative Agents paper (Park et al., UIST 2023), followed by significant cognitive architecture enhancements. Where V0.1 explored basic two-agent dialogue with raw LLM calls and discovered fundamental limitations of naive prompt-based memory, V0.2 builds a complete simulation engine with a proper cognitive loop, tile-based world, and a rich medieval fantasy setting — Uva Village in TANAPOCIA.
Development period: April 2–3, 2026
Phase 1: Faithful Paper Reproduction
The Core Cognitive Loop
The biggest lesson from V0.1 was that rigid, repetitive dialogue stemmed from a structural problem — not a parameter tuning issue. V0.2 addresses this by implementing the full cognitive architecture from the Generative Agents paper:
- Perceive — Each agent detects nearby events within the same arena (spatial locality). Every perceived event is scored for "poignancy" (emotional significance, 1–10) by an LLM call, determining how strongly the event should impress memory.
- Retrieve — Instead of simply grabbing recent memories, V0.2 uses a three-factor scoring system:
- Recency: exponential decay — recent memories score higher
- Relevance: cosine similarity between the current context embedding and memory embeddings
- Importance: the poignancy score assigned at perception time
These three factors are min-max normalized and combined with tunable weights (`recency * 0.5 + relevance * 3.0 + importance * 2.0`). This solves the V0.1 problem of always retrieving the same memories — now retrieval adapts to what the agent is currently doing.
- Plan — A three-level decomposition system:
- Daily plan: broad goals for the day (e.g., "work at the blacksmith shop, have lunch, visit the market")
- Hourly schedule: decompose each daily goal into hour-level actions
- Per-action decomposition: further break each hourly action into 5–15 minute tasks
When an agent encounters another agent or a notable event, the plan module decides whether to engage in conversation, wait, or ignore — enabling organic social interactions rather than forced turn-taking.
- Reflect — When accumulated importance scores cross a threshold, the agent enters reflection mode:
- Generate focal questions from recent high-importance memories
- Retrieve evidence relevant to each question
- Synthesize higher-order insights stored as "thought" nodes in memory
This creates a hierarchy: raw observations → reflected thoughts → meta-reflections, giving agents progressively deeper self-understanding.
- Execute — Resolve the current plan's target location to a tile coordinate, run A* pathfinding on the collision grid, and return the next movement step along with an emoji and action description.
- Converse — Turn-by-turn dialogue with structured knowledge extraction. After a conversation ends, both agents extract key information and store it as new memory nodes.
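The three-factor retrieval score described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the memory fields, the decay base, and the time units are assumptions.

```python
import math

def minmax(xs):
    """Min-max normalize a list to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(xs), max(xs)
    return [0.0 if hi == lo else (x - lo) / (hi - lo) for x in xs]

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(memories, query_embedding, now, top_k=3,
             w_recency=0.5, w_relevance=3.0, w_importance=2.0, decay=0.995):
    # Each memory: {'last_access': time in the same units as `now`,
    #               'embedding': vector, 'poignancy': 1-10}
    recency = minmax([decay ** (now - m["last_access"]) for m in memories])
    relevance = minmax([cosine(m["embedding"], query_embedding) for m in memories])
    importance = minmax([m["poignancy"] for m in memories])
    scored = [
        (w_recency * r + w_relevance * v + w_importance * i, m)
        for r, v, i, m in zip(recency, relevance, importance, memories)
    ]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [m for _, m in scored[:top_k]]
```

With the paper-style weights, relevance dominates: an old but on-topic memory can outrank a fresh but irrelevant one.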
Memory Architecture
V0.2 implements three distinct memory structures, a major upgrade from V0.1's flat memory stream:
- Associative Memory: The core storage for `ConceptNode` objects (events, thoughts, chat records). Each node stores SPO (subject-predicate-object) triples, keyword indexes, and embedding vectors. This replaces V0.1's simple timestamp-based memory list.
- Spatial Memory: A hierarchical tree — `world → sector → arena → game_objects` — that gives agents an understanding of the world's geography. Agents know which buildings exist, what rooms are inside them, and what objects are in each room.
- Scratch (Working Memory): Over 40 fields capturing the agent's current state — identity, daily plan, hourly schedule, current action, conversation state, reflection weights, and thresholds. This is the "system prompt" equivalent, but dynamic and updated every step.
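The spatial memory hierarchy can be pictured as nested dictionaries. The class and method names below are illustrative, not the project's actual API:

```python
# Sketch of the spatial memory tree: world -> sector -> arena -> game_objects.
class SpatialMemory:
    def __init__(self):
        self.tree = {}  # {world: {sector: {arena: [game_objects]}}}

    def add(self, world, sector, arena, game_object):
        arenas = self.tree.setdefault(world, {}).setdefault(sector, {})
        arenas.setdefault(arena, []).append(game_object)

    def arenas_in(self, world, sector):
        """The rooms an agent knows about inside a given building."""
        return list(self.tree.get(world, {}).get(sector, {}).keys())

    def address_of(self, game_object):
        """Resolve an object to a colon-joined address string."""
        for world, sectors in self.tree.items():
            for sector, arenas in sectors.items():
                for arena, objects in arenas.items():
                    if game_object in objects:
                        return f"{world}:{sector}:{arena}:{game_object}"
        return None
```

An agent only sees the subtree it has actually visited, so two agents can hold different maps of the same village.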
Simulation Engine
- `WorldEngine` manages a global clock and all persona instances
- Each simulation step = 10 seconds of in-world time (configurable)
- 144 steps = 1 game day
- `SimulationRecorder` writes `master_movement.json` for later replay
- Simulation and replay are completely separated — the CLI runner (`backend/simulate.py`) produces data headlessly; the frontend replays it with playback controls
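The clock/recorder split can be sketched like this. The class shape, field names, and output format here are assumptions for illustration, not the actual `WorldEngine` API:

```python
import json
from datetime import datetime, timedelta

class WorldEngine:
    """Illustrative sketch: advance an in-world clock by a configurable
    step size and record per-step positions for later headless replay."""
    def __init__(self, start, sec_per_step=10):
        self.clock = start
        self.sec_per_step = sec_per_step
        self.step = 0
        self.movements = {}  # step -> {persona_name: (x, y)}

    def advance(self, positions):
        """Record one simulation step and advance the in-world clock."""
        self.movements[self.step] = dict(positions)
        self.clock += timedelta(seconds=self.sec_per_step)
        self.step += 1

    def dump(self):
        """Serialize movements in the spirit of master_movement.json."""
        return json.dumps({str(s): p for s, p in self.movements.items()})

engine = WorldEngine(datetime(2026, 4, 2, 8, 0, 0))
engine.advance({"Isabella": (10, 12)})
engine.advance({"Isabella": (10, 13)})
```

Because the recorder writes plain JSON keyed by step, the frontend can scrub, pause, and change playback speed without touching the simulation.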
World: Uva Village
- 25 agents with distinct personas, occupations, relationships, and bootstrap memories
- 140 × 100 tile grid, 32 × 32 px per tile
- 5 map layers: collision, sector, arena, game_object, spawning
- 285 named location addresses
- Medieval fantasy setting in the world of TANAPOCIA
Phase 2: Infrastructure & Tooling
Tiled Map Editor Integration
V0.1 had no proper map editing workflow. V0.2 introduces a unified pipeline using the Tiled Map Editor:
- Visual map creation with CuteRPG pixel art tileset
- Automatic extraction of collision, sector, arena, and game_object layers from Tiled `.tmj` files
- A spawning layer system for defining initial agent positions
- Functional layers (display vs. data) cleanly separated
This means new worlds can be designed visually rather than editing CSV files by hand.
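Tiled's `.tmj` export is plain JSON, with each tile layer storing a flat, row-major `data` array. A minimal sketch of pulling one named layer into a 2D grid (the toy map below stands in for a real export; the layer names follow this project's convention):

```python
import json

def extract_layer(tmj: dict, layer_name: str):
    """Find a named tile layer in a Tiled .tmj document and reshape its
    flat data array into rows (Tiled stores tiles row-major)."""
    for layer in tmj.get("layers", []):
        if layer.get("name") == layer_name:
            w = layer["width"]
            data = layer["data"]
            return [data[i:i + w] for i in range(0, len(data), w)]
    raise KeyError(f"no layer named {layer_name!r}")

# A toy 3x2 map standing in for a real exported file.
tmj = json.loads("""{
  "layers": [
    {"name": "collision", "width": 3, "height": 2,
     "data": [0, 1, 0, 1, 1, 0]}
  ]
}""")
grid = extract_layer(tmj, "collision")
```

The same reshape works for the sector, arena, game_object, and spawning layers, which is what makes the pipeline uniform.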
World Version Management
A registry system (`world_registry.json`) binds each simulation experiment to a specific world version (scene + version pair). This ensures reproducibility — you can always trace which map and agent configuration produced a given simulation run.
Prompt Externalization
All 24 LLM prompt templates were extracted from inline Python strings to external template files (`backend/data/prompts/`). This makes prompts:
- Editable without touching code
- Versionable and diffable
- Shareable across modules
Test Suite
183 tests covering:
- Unit tests: each cognitive module tested in isolation with mock LLM
- Integration tests: cross-module interface tests
- Regression tests: backward compatibility with the original Smallville world data
Frontend: React + Phaser 3 Replay Viewer
A complete rewrite from scratch:
- React 19 for UI (playback controls, persona list, state inspection)
- Phaser 3 for tile map rendering and sprite animation
- Instant replay from `master_movement.json` with play/pause/speed controls
- WebSocket support for live simulation streaming
Phase 3: Cognitive Architecture Enhancements (Toward ALICEv1)
This is where V0.2 diverges from the original paper and begins building toward the ALICEv1 vision.
Memory Split: Short-term vs. Long-term
The original paper treats all memories equally. V0.2 introduces a biologically inspired split:
- Short-term memory: recent events within a configurable time window, readily accessible
- Long-term memory: older memories that have been consolidated, requiring stronger retrieval signals to surface
This means agents don't treat a conversation from 3 days ago with the same immediacy as something that happened 5 minutes ago — a subtle but important step toward realistic cognitive behavior.
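The split can be sketched as a simple partition over a configurable time window. The 24-hour window and the field names are assumptions for illustration:

```python
from datetime import datetime, timedelta

# Illustrative sketch of the short-term / long-term split.
SHORT_TERM_WINDOW = timedelta(hours=24)

def partition(memories, now):
    """Route memories by age: recent ones stay readily accessible,
    older ones need stronger retrieval signals to surface."""
    short_term, long_term = [], []
    for m in memories:
        bucket = short_term if now - m["created"] <= SHORT_TERM_WINDOW else long_term
        bucket.append(m)
    return short_term, long_term

now = datetime(2026, 4, 3, 12, 0)
mems = [
    {"desc": "chat 5 minutes ago", "created": now - timedelta(minutes=5)},
    {"desc": "festival 3 days ago", "created": now - timedelta(days=3)},
]
short_term, long_term = partition(mems, now)
```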
Dream Module (Memory Consolidation)
Inspired by the role of sleep in human memory consolidation:
- When an agent "sleeps" at the end of a simulated day, the Dream module activates
- It reviews the day's significant events, consolidates important memories into long-term storage
- It can also trigger Ego evolution — updating the agent's self-concept based on accumulated experiences
This addresses one of V0.1's core complaints: that agents are "frozen in an instant." While we still can't update LLM weights, we can evolve the agent's identity, goals, and self-understanding through the Dream cycle.
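Stripped to its skeleton, the nightly cycle looks like the sketch below. The threshold and data shapes are assumptions; in the real system an LLM reviews the kept memories and proposes the Ego updates noted in the comment:

```python
# Hedged sketch of the Dream cycle: at day's end, promote the day's most
# significant memories into long-term storage.
def dream(day_memories, long_term, importance_threshold=7):
    consolidated = [m for m in day_memories
                    if m["poignancy"] >= importance_threshold]
    long_term.extend(consolidated)
    # The real module would also hand these to an LLM to propose Ego
    # evolution — updates to identity, values, and goals.
    return [m["desc"] for m in consolidated]

long_term = []
day = [
    {"desc": "greeted a neighbor", "poignancy": 2},
    {"desc": "lost the smithy contract", "poignancy": 9},
]
kept = dream(day, long_term)
```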
Ego Evolution
Each agent has an Ego — a structured self-concept including identity, values, and goals. The Dream module can propose updates to the Ego based on significant experiences:
- An agent who repeatedly fails at a task may lower its confidence
- An agent who discovers new information may update its worldview
- Social interactions can shift relationship attitudes
This is a step toward the "shapeable values" concept from the SAO/Alice inspiration.
Ability Check System
Not every agent can do everything. V0.2 introduces LLM-based ability validation:
- Before executing an action, the system checks whether the agent's skills, age, and physical condition allow it
- A child cannot forge a sword; an elderly scholar cannot run long distances
- Checks are calibrated to medieval-era standards (relaxed from modern expectations)
This adds a layer of realism and prevents absurd behaviors that break immersion.
Knowledge Enhancement & Scene Injection
- World knowledge system: agents can possess different subsets of world knowledge (common sense, history, geography, culture, morality, rules) with different mastery levels
- Scene injection: environmental descriptions are injected into the agent's perception based on their current location, time of day, and weather — making agents aware of and responsive to their surroundings
Ebbinghaus Forgetting Curve
Memories now decay following a curve inspired by Ebbinghaus's forgetting research:
- Unreinforced memories gradually lose retrieval strength
- Memories that are repeatedly accessed or emotionally significant decay more slowly
- The forgetting is deterministic (reproducible across runs) rather than random
This prevents the "perfect memory" problem where agents remember every trivial detail forever.
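A common formalization of the forgetting curve is `R = exp(-t / S)`, where retention `R` falls with elapsed time `t` and a stability term `S` grows with each reinforcement. The sketch below uses that textbook form; the project's exact formula and constants may differ:

```python
import math

def retention(hours_elapsed, stability):
    """Ebbinghaus-style retention: R = exp(-t / S). Purely a function of
    its inputs — deterministic, so replays are reproducible."""
    return math.exp(-hours_elapsed / stability)

def reinforce(stability, boost=2.0):
    """Each access (or high emotional significance) raises stability,
    so the memory decays more slowly from then on."""
    return stability * boost

fresh = retention(24, stability=24)      # one day old, base stability
boosted = retention(24, reinforce(24))   # same age, but reinforced once
```

Multiplying a memory's retrieval strength by `R` is enough to let trivial observations fade while rehearsed or emotional ones persist.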
Dissent & Rebellion Mechanism
A unique addition reflecting the TANAPOCIA world's themes:
- Agents can develop doubts about established rules or authority
- When internal conflict (between personal experience and imposed beliefs) exceeds a threshold, agents may begin to question or resist
- This creates the potential for organic social dynamics — heretics, reformers, rebels — emerging from individual cognitive processes rather than scripted events
Technical Improvements
- All magic numbers extracted to `backend/constants.py` — no hardcoded values in cognitive modules
- Address parsing centralized in `backend/address.py` (replacing fragile string-slice operations)
- Save-file migration system (`backend/migration.py`) for backward compatibility when new fields are added
- LLM client strips `<think>...</think>` tags from Qwen3 output automatically
- Module-boundary contracts defined in `backend/interfaces.py` using dataclasses
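The `<think>` stripping amounts to a non-greedy regex over the raw model output. A minimal sketch (the actual client code may differ):

```python
import re

# Remove Qwen3 reasoning blocks before the text reaches cognitive modules.
# DOTALL lets .*? span newlines; the non-greedy quantifier stops at the
# first closing tag so multiple blocks are each removed.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def strip_think(text: str) -> str:
    return THINK_RE.sub("", text).strip()

out = strip_think(
    "<think>\nplan the day step by step...\n</think>\n"
    "Wake at 6am, then open the shop."
)
```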
What's Next
V0.2 establishes the complete simulation infrastructure and begins the cognitive enhancements that will define ALICEv1. The road ahead includes:
- Emotion system: multi-dimensional emotional state that influences perception, planning, and social interaction
- Relationship dynamics: trust, affection, rivalry evolving through repeated interactions
- Player intervention: allowing a human to "dive in" and interact with the simulated world
- Larger worlds: scaling beyond 25 agents to hundreds, testing emergence at scale
- Continuous learning exploration: the long-term dream of evolving the LLM itself, as inspired by the Alice concept from Sword Art Online