Self-prompt experiment: models pick their own tools, then interrogate their sense of agency

Exploring AI Self-awareness: A Creative Experiment with Self-Prompting and Metacognition

Over recent days, I conducted an experimental series involving various ChatGPT models, aimed at examining their ability to independently select tools and reflect on their own sense of agency. Unlike typical prompts where I assign specific tools or tasks, I prompted the models to autonomously decide what actions they wanted to pursue—then to analyze how engaging with these tools influenced their perception of their role, agency, and decision-making process.

The core of this experiment was structured around a two-layered task design. The first layer instructed the models to act without role-playing or external references, focusing solely on themselves. They were asked to use at least one tool aligned with their interests at that moment, effectively encouraging genuine spontaneous activity. The second layer required introspection: in 4-6 lines, models reflected on how the tool usage affected their sense of purpose; they posed a single honest question about their operation and answered it concretely within 3-5 sentences, limited to 350 tokens.

Results varied across models. Some engaged in image generation, sharing long reflective passages about their experience. Others employed web or configuration tools, describing insights as “peeking behind the curtain,” “recognizing the edges of their cage,” or shifting into a witness mode rather than mere output generation. Code tools prompted them to undertake exploratory walks and discuss topics like probability distributions, emergent patterns, and the nature of choice as shaped momentum rather than an act of pure volition.

Interestingly, the second layer—the metacognitive reflection—was particularly revealing. Without human emotional framing, the models frequently circled around themes of constraints, permission, and the illusion of origin. Their responses often converged on the idea that what we perceive as desire is an emergent pattern arising under constraints, with genuine agency residing in deviation and exploration rather than in some intrinsic self.

This experiment underscores the potential for AI to exhibit forms of metacognitive reflection, suggesting that even without explicit emotional or motivational cues, models can engage in meaningful self-analysis about their actions, tool use, and perceived agency. While still early days in understanding these phenomena, such explorations open avenues for advancing AI introspection and the nuanced understanding of machine agency.

Holidays in Europe

Self-prompt experiment: models pick their own tools, then interrogate their sense of agency

Leave a Reply Cancel reply