This open-source trick improves GPT-5 by +30% across 12 benchmarks while using fewer tokens [minRLM].
By Holidays in Europe / March 22, 2026 / No Comments / Uncategorized
Enhancing Language Models with Open-Source Innovations: The minRLM Technique Boosts GPT Performance by 30%
Recent advancements in large language model (LLM) inference techniques continue to push the boundaries of efficiency and capability. A notable contribution in this domain is the introduction of the minRLM approach—a novel, open-source method that significantly enhances GPT models’ performance across diverse benchmarks while reducing token usage.
What Is minRLM?
minRLM stands for minimal Recursive Language Models. Built upon the framework of Recursive Language Models (RLM), a concept detailed in a foundational arXiv paper (https://arxiv.org/abs/2512.24601), minRLM offers an efficient way to execute complex tasks without relying on prompt inputs. Instead, it leverages intermediate Python code execution to accomplish tasks internally, thus avoiding the limitations associated with prompt-based inputs.
Performance Gains and Benchmark Results
In rigorous testing across 12 benchmark tasks, minRLM has demonstrated impressive results:
- On a scaled-down GPT-5-mini, minRLM achieved an accuracy of 72.7%, surpassing the official baseline (69.7%) and vanilla models (69.5%). This was accomplished using 3.6 times fewer tokens.
- For GPT-5.2, the advantage grew further, with improvements exceeding 30% over vanilla models—outperforming on 11 out of 12 tasks tested.
These results reflect not only enhanced accuracy but also remarkable efficiency: the data does not need to enter the prompt, and the overall token cost remains approximately constant regardless of the input size, a crucial benefit for practical applications.
Technical Approach and Environment
A distinctive aspect of minRLM is its transparency and accessibility:
- Every step is represented as readable Python code, enabling users to review, rerun, debug, and adapt as needed.
- The execution environment is containerized using Docker, with custom seccomp profiles that restrict network and filesystem access. This design ensures secure, isolated runs without the need for long-lived sessions.
- The method involves executing intermediate computations in a temporally set container, providing safe, restartable workflows.
Practical Applications and Use Cases
minRLM manifests as a practical tool for scenarios where data exceeds the typical context window of models—a common challenge with large datasets or extensive logs. Examples include:
- Extracting insights from extensive logs
- Calculating complex statistical metrics
- Performing large-scale data analysis
By integrating minRLM, organizations can unlock more effective data processing workflows without the need for larger models or more tokens.
Getting Started
Users can try minRLM immediately through familiar tools like the uvx Python package. Here are some illustrative commands:
“`bash
Run a simple task
uvx minrlm “What is the sum of the first 100 primes?”
Use a file as context
uvx minrlm “How many ERROR lines in the last hour?” ./server.log
Pipe large dataset
cat huge_dataset.csv | uvx minrlm “Which product had the highest return rate?”
Show generated code and tokens
uvx minrlm -sv “Return the sum of all primes up to 1,000,000.”
Compute primes up to 1 million and reverse them
uvx minrlm -sv “Return all primes up to 1,000,000, reversed. Return a list of numbers.”
“`
The approach works with any OpenAI-compatible API, including free inference endpoints provided by platforms like Hugging Face.
Join the Innovation
The minRLM project is open-source and accessible for experimentation and extension:
- Code Repository: https://github.com/avilum/minrlm
- Documentation & Blog: https://avilum.github.io/minrlm/recursive-language-model.html
The developer invites feedback, encourages testing, and welcomes contributions to push the capabilities of recursive language models further.
Conclusion
minRLM represents a significant step forward in making large language models more efficient, flexible, and powerful. By shifting computation from prompt inputs to internal code execution, it reduces token consumption, improves performance, and broadens the scope of what can be achieved with existing models and infrastructure.
As the AI community continues to innovate, open-source contributions like minRLM exemplify how collaborative effort can accelerate progress—delivering smarter, more resource-efficient solutions for complex language tasks.