42 Researchers Release 100-Page Report: Code Is the Core of AI Agent Engineering
A 100-page joint survey by 42 researchers from institutions including UIUC, Meta, and Stanford proposes a unified framework called "Code as Agent Harness". The work, for the first time, positions code as the core infrastructure of AI agents, names four properties that future agent systems should satisfy, and maps out the architecture and deployment scenarios from single-agent to multi-agent systems. It is currently one of the most systematic overviews of agent engineering.

On May 18, 2026, a survey paper of more than 100 pages was uploaded to arXiv, co-authored by 42 researchers from institutions including the University of Illinois Urbana-Champaign, Meta, and Stanford University. An AI researcher recommended the paper on social media, noting that it addresses a long-overlooked foundation of the AI agent field: code.
Past discussions of how AI agents evolve have mostly focused on individual components such as planning logic, memory modules, and tool use. Code itself has rarely been treated as the core infrastructure of the whole agent system. The paper surveys what has changed recently: the coding ability of large language models now goes well beyond generating runnable snippets. In today's agent systems, code is no longer just an output product but a unified operational foundation for agent reasoning, action, environment modeling, and execution verification.
The paper defines this new framework as "Code as Agent Harness" and splits the entire system into a three-layer progressive structure:
1. Interface layer: Code is responsible for connecting the complete chain of agent reasoning, action, and environment modeling
2. Mechanism layer: Covers the planning, memory, and tool-use capabilities required for long-horizon execution, together with feedback-driven control and optimization logic that keeps the system reliable and adaptable
3. Multi-agent extension layer: Based on shared code artifacts, it supports collaboration, review, and verification capabilities among multiple agents
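The three layers above can be illustrated with a toy sketch. This is not the paper's implementation; every name here (`Harness`, `execute_with_feedback`, `review`) is hypothetical and chosen only to show one way the layers might relate: snippets of code act on a shared state, a feedback loop retries failed snippets, and a second agent verifies the shared result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Harness:
    """Interface layer (hypothetical): actions are code snippets run against shared state."""
    state: Dict[str, object] = field(default_factory=dict)  # toy environment model
    memory: List[Tuple[str, Tuple[bool, str]]] = field(default_factory=list)  # execution log

    def run(self, snippet: str) -> Tuple[bool, str]:
        """Execute one snippet and return (ok, feedback)."""
        try:
            # Emptying builtins keeps the toy small; it is NOT a security sandbox.
            exec(snippet, {"__builtins__": {}}, self.state)
            result = (True, "ok")
        except Exception as exc:
            result = (False, f"{type(exc).__name__}: {exc}")
        self.memory.append((snippet, result))  # remember what was tried and what happened
        return result


def execute_with_feedback(harness: Harness, candidates: List[str]) -> Optional[str]:
    """Mechanism layer (hypothetical): try candidate snippets until one executes cleanly."""
    for snippet in candidates:
        ok, _feedback = harness.run(snippet)
        if ok:
            return snippet
    return None


def review(harness: Harness, check: Callable[[Dict[str, object]], bool]) -> bool:
    """Multi-agent layer (hypothetical): a second agent verifies the shared artifact."""
    return check(harness.state)


harness = Harness()
chosen = execute_with_feedback(harness, ["total = 1 / 0", "total = 1 + 2"])
approved = review(harness, lambda s: s.get("total") == 3)
```

In this sketch the first candidate fails with a division error, the failure is recorded in `memory`, and the second candidate succeeds; the reviewer then checks the shared state rather than trusting the acting agent's own report.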
The paper also surveys representative methods and deployment scenarios in seven areas: programming assistants, GUI/OS automation, embodied agents, scientific discovery, personalized recommendation, DevOps, and enterprise workflows. On current technical bottlenecks, it lists several open challenges: evaluating agents on more than final task success rate, verifying results under incomplete feedback, iterating on the harness without regressions, keeping state consistent across multiple agents, ensuring human oversight of safety-critical operations, and extending capabilities to multimodal environments.
The paper's central conclusion is that future agent systems must satisfy four core properties: they must be executable, inspectable, stateful, and governable. As the harness around the agent, code is very likely to be a key path toward mature agent engineering.
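To make the four properties concrete, here is a minimal sketch under stated assumptions. The class `GovernedAgent`, its fields, and the two toy tools are all hypothetical and do not come from the paper; they only show one possible reading of the four properties: tools are plain functions (executable), every call is logged (inspectable), state persists across calls (stateful), and sensitive tools are blocked unless approved (governable).

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class GovernedAgent:
    """Hypothetical agent wrapper illustrating the four properties."""
    tools: Dict[str, Callable[..., Any]]
    needs_approval: Set[str]  # tool names treated as safety-critical
    state: Dict[str, Any] = field(default_factory=dict)
    audit_log: List[Dict[str, Any]] = field(default_factory=list)

    def act(self, tool: str, approved: bool = False, **kwargs: Any) -> Any:
        entry = {"tool": tool, "args": kwargs, "status": "pending"}
        self.audit_log.append(entry)  # inspectable: every attempt is recorded
        if tool in self.needs_approval and not approved:
            entry["status"] = "blocked"  # governable: no approval, no execution
            return None
        result = self.tools[tool](self.state, **kwargs)  # executable: tools are code
        entry["status"] = "done"
        return result

    def snapshot(self) -> str:
        """Stateful: state outlives a single call and can be serialized."""
        return json.dumps(self.state, sort_keys=True)


def set_value(state: Dict[str, Any], key: str, value: Any) -> Any:
    state[key] = value
    return value


def delete_value(state: Dict[str, Any], key: str) -> Any:
    return state.pop(key, None)


agent = GovernedAgent(
    tools={"set": set_value, "delete": delete_value},
    needs_approval={"delete"},
)
agent.act("set", key="x", value=1)
agent.act("delete", key="x")  # blocked: "delete" requires approval
```

After these two calls the state still holds `x`, and the audit log shows one completed action and one blocked action. Passing `approved=True` stands in for whatever human sign-off a real system would require.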
Related Resources:
- Original paper: [arXiv:2605.18747](https://arxiv.org/abs/2605.18747)
- Curated repository of related papers: [Awesome-Code-as-Agent-Harness-Papers](https://github.com/YennNing/Awesome-Code-as-Agent-Harness-Papers)
Published: 2026-05-20 08:49