
# Introduction
LangChain, one of today's leading frameworks for building and orchestrating artificial intelligence (AI) applications based on large language models (LLMs) and agent engineering, recently released its State of Agent Engineering Report, which surveyed over 1,300 professionals from diverse roles and business backgrounds to uncover the current state of this remarkable AI phenomenon.
This article highlights three of the report's top insights and elaborates on them in a tone accessible to a wider audience, explaining some of the key terms and terminology related to AI agents along the way. You can learn more about the key concepts behind AI agents in this related article.
Before focusing on the facts, figures, and supporting evidence for each of our top three selected insights, we briefly explain some key terms and definitions to know:
# Large enterprises are pushing startups into production.
Key concepts to know:
- Agent: An AI system that, unlike standard chat-based applications that reactively respond to user interactions, is capable of making decisions and taking actions on its own. In its most common contexts today, an agent uses an LLM as its "brain," fueling decisions on which step to take next — for example, querying a database, sending an email, or searching the web — to accomplish a goal.
- Production (Environment): Although this is a fundamental concept in software engineering, it may seem unfamiliar to readers from other backgrounds. Being "in production" means that a software system is live, and actual customers, users, or employees are using it to perform some task or action. This stage typically follows a prototype or proof of concept (PoC): a test version of the software that has been run in a controlled environment to identify and fix potential problems.
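The agent loop described above — an LLM "brain" repeatedly choosing the next action until a goal is met — can be sketched in a few lines. This is a minimal, hypothetical illustration: `llm_decide` is a stub standing in for a real LLM call, and the tools are placeholders.

```python
def search_web(query: str) -> str:
    """Hypothetical tool: stands in for a real web search."""
    return f"results for '{query}'"

def send_email(to: str) -> str:
    """Hypothetical tool: stands in for sending a real email."""
    return f"email sent to {to}"

TOOLS = {"search_web": search_web, "send_email": send_email}

def llm_decide(goal: str, history: list) -> tuple:
    """Stub for the LLM 'brain': picks the next action toward the goal.
    A real agent would prompt an LLM here and parse its response."""
    if not history:
        return ("search_web", goal)   # first step: gather information
    return ("finish", history[-1])    # then stop and return the last result

def run_agent(goal: str) -> str:
    """The agent loop: ask the 'brain' for a step, execute it, repeat."""
    history = []
    while True:
        action, arg = llm_decide(goal, history)
        if action == "finish":
            return arg
        history.append(TOOLS[action](arg))

print(run_agent("cheap flights to Oslo"))
# → results for 'cheap flights to Oslo'
```

The key design point is the loop itself: the model, not hard-coded program flow, decides which tool to invoke next, which is what distinguishes an agent from a standard chatbot.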
Key facts in the report:
- While there's a common "red tape" misconception that large companies are slow to adopt new technology, the statistics show something different: They're leading the charge in AI agent deployment, with 67% of organizations with more than 10,000 employees having agent-based applications in production, compared to only 50% of smaller organizations with fewer than 100 employees.
- Reasons for this may include the cost of building a trusted agent solution, which requires significant investment in infrastructure.
Similar evidence can be found in Deloitte's 2026 State of AI in the Enterprise and McKinsey's State of AI in 2025 reports.
# The observability versus evaluation distinction
Key concepts to know:
- Observability: AI models, especially advanced ones, are often viewed as opaque "black boxes" with unpredictable results. Observability is the ability to observe and record what the AI system is doing internally and how it arrives at its decisions or outputs.
- Tracing: A specialized aspect of observability that involves recording the AI agent's step-by-step journey — that is, its reasoning path.
- Offline evaluation: Running an AI agent (or other AI system) against a test dataset with known "correct" answers to measure how accurately and efficiently it performs.
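The three concepts above can be sketched together: a trace records the agent's steps as it runs, while offline evaluation scores its answers against a labeled test set. The `answer` function below is a hypothetical stub agent, and the test set is invented for illustration.

```python
import time

def answer(question: str, trace: list) -> str:
    """Stub agent: appends each reasoning step to `trace` (tracing),
    then returns an answer. A real agent would call an LLM and tools."""
    trace.append({"step": "plan", "detail": f"interpret: {question}", "t": time.time()})
    trace.append({"step": "tool", "detail": "look up knowledge base", "t": time.time()})
    return "Paris" if "France" in question else "unknown"

# Offline evaluation: run the agent over a test set with known answers.
test_set = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Atlantis?", "unknown"),
]

def evaluate(dataset) -> float:
    """Return the fraction of questions the agent answers correctly."""
    correct = 0
    for question, expected in dataset:
        trace = []  # one trace per run, available for later inspection
        if answer(question, trace) == expected:
            correct += 1
    return correct / len(dataset)

print(f"accuracy: {evaluate(test_set):.0%}")
# → accuracy: 100%
```

The gap the report identifies is visible here: collecting traces (observability) requires only logging, while evaluation additionally requires curating a dataset of known-good answers, which is why many teams do the former but not the latter.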
Key facts in the report:
- A staggering 89% of respondents from all backgrounds have implemented observability practices, yet only 52.4% are conducting offline evaluations, revealing a significant gap between how teams monitor AI agents and how rigorously they evaluate their performance.
- This indicates a “ship and watch” mentality, in which engineering teams prefer to debug errors after they occur rather than prevent them before deployment to production. Fixing “broken robots” instead of making sure they work properly before they leave the “factory” can have undesirable consequences and costs.
Similar evidence can be found in Gascard's essay on LLM observability versus evaluation.
# Cost is no longer the main constraint: quality is.
Key concepts to know:
- Hallucinations: When an AI model such as an LLM confidently produces false or nonsensical information as if it were true, it is called a hallucination. This is a dangerous problem for AI agents because the issue is not only saying the wrong thing, but also potentially doing the wrong thing — for example, booking a flight based on incorrect or incorrectly retrieved facts.
- Latency: The delay between the user asking a question and receiving an answer, caused by the agent "thinking" or processing logic, often involving the use of tools. This adds extra time compared to standalone LLMs or chatbots.
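Why agents are slower than standalone chatbots can be illustrated with a toy timing sketch: each tool step adds its own round trip on top of the model call. The delays below are simulated with `time.sleep` and are purely illustrative, not real model latencies.

```python
import time

def llm_call() -> str:
    """Stub chatbot: a single simulated model response."""
    time.sleep(0.05)  # simulated model response time
    return "answer"

def agent_call() -> str:
    """Stub agent: model call, tool execution, then a second model call."""
    time.sleep(0.05)  # model decides which tool to use
    time.sleep(0.05)  # tool executes (e.g., a database query)
    time.sleep(0.05)  # model composes the final answer
    return "answer"

def measure(fn) -> float:
    """Return the wall-clock time taken by one call to `fn`."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

print(f"chatbot: {measure(llm_call):.2f}s, agent: {measure(agent_call):.2f}s")
```

Each extra decision-act cycle multiplies the wait, which is why latency weighs more heavily on startups shipping user-facing agent products.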
Key facts in the report:
- The cost of deploying AI agents is no longer a major concern, according to respondents, with 32% citing quality as the biggest barrier to adoption and deployment.
- Quality in this context refers to accuracy, consistency, and the avoidance of hallucinations.
- Meanwhile, there's an interesting catch: The second most important barrier varies by company size, with smaller startups citing latency and enterprises with more than 2,000 employees pointing to security and compliance.
Similar supporting evidence can be found in the previously mentioned Deloitte report on barriers to AI adoption, while key evidence about the top enterprise blockers can be further explored in a Medium essay.
Iván Palomares Carrascosa is a leader, author, speaker, and consultant in AI, machine learning, deep learning, and LLMs. He trains and guides others in harnessing AI in the real world.