New Research Finds Readers Prefer AI-Generated Text Trained on Copyrighted Works Over Human Experts

November 2025 — A groundbreaking study conducted by MIT and Colombian researchers has revealed surprising insights into the evolving landscape of AI-assisted writing, challenging long-held assumptions about human expertise versus machine-generated content.

Overview of the Study

The research examined how different AI models and human writers produce literary excerpts, specifically analyzing stylistic fidelity and overall writing quality. The study involved a preregistered comparison between MFA-trained expert writers and three leading AI language models: ChatGPT, Claude, and Gemini.

In this experiment, AI systems were prompted to generate sections of text of up to 450 words, emulating the styles of fifty distinguished authors, including Nobel laureates, Booker Prize winners, and emerging finalists for the National Book Award. The goal was to assess how these texts compare to human-authored samples across style and quality dimensions.

Methodology

Participants included both MFA candidates from top U.S. writing programs, serving as expert evaluators, and general readers recruited via Prolific. They performed blind pairwise evaluations, rating texts without knowing their origin. In this first condition, the AI models were steered through in-context prompting alone and received no additional training.
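For readers curious how a blind pairwise comparison can be tallied, here is a minimal Python sketch. It is an illustration under stated assumptions, not the researchers' actual pipeline: the function names `make_blind_pair` and `preference_rates`, the "A"/"B" slot labels, and the data layout are all hypothetical.

```python
import random
from collections import Counter


def make_blind_pair(human_text, ai_text, rng):
    """Place the two texts in random order and keep a hidden key of their sources.

    Hypothetical helper: the evaluator sees only `shown`; `key` stays with the
    experimenter until judgments are collected.
    """
    items = [("human", human_text), ("ai", ai_text)]
    rng.shuffle(items)  # randomize which text lands in slot A
    shown = {"A": items[0][1], "B": items[1][1]}
    key = {"A": items[0][0], "B": items[1][0]}
    return shown, key


def preference_rates(choices, keys):
    """Unblind each choice ("A" or "B") and return the share of picks per source.

    Assumes at least one judgment; an empty list would divide by zero.
    """
    counts = Counter(key[choice] for choice, key in zip(choices, keys))
    total = sum(counts.values())
    return {source: counts[source] / total for source in ("human", "ai")}
```

Randomizing the slot order per pair keeps position from leaking the source, and unblinding only after the choices are recorded is what makes the comparison blind.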

Initial results indicated that expert readers strongly disfavored AI-produced excerpts in terms of stylistic accuracy and overall quality. Interestingly, lay readers exhibited mixed responses, sometimes showing less critical judgment.

A Transformative Approach: Fine-Tuning AI Models

A key innovation in the study was the application of fine-tuning. Researchers trained the AI models—particularly ChatGPT—on the complete works of individual authors. This process customized the models to better capture specific stylistic nuances.

Remarkably, after fine-tuning, evaluations flipped:

  • Experts now favored AI-generated excerpts for stylistic fidelity and writing quality.
  • Lay readers also shifted toward preference for these fine-tuned outputs.

This reversal was consistent across various authors and writing styles, demonstrating the robustness of the findings.

Reduced Detectability and Improved Quality

State-of-the-art AI detection tools flagged 97% of the in-context prompted outputs as machine-generated. In contrast, only 3% of the outputs generated after fine-tuning were flagged, indicating a significant reduction in detectable stylistic quirks such as cliché density.

Mediation analysis suggests that fine-tuning minimizes the stylistic artifacts that previously betrayed AI origins. These adjustments narrow the gap between machine-generated and human writing, influencing both reader perception and AI detection.
