AI agents have safety and reliability problems. Although agents would allow enterprises to automate more steps in their workflows, they can take unintended actions while executing a task, are not very flexible and are difficult to control.
Organizations have already sounded the alarm about unreliable agents, worried that agents might forget to follow instructions once deployed.
OpenAI even admitted that ensuring agent reliability would involve working with outside developers, so it opened up its Agents SDK to help solve this issue.
However, Singapore Management University (SMU) researchers have developed a new approach to the agent reliability problem.
AgentSpec is a domain-specific framework that lets users “define structured rules that incorporate triggers, predicates and enforcement mechanisms.” The researchers said AgentSpec will make agents work only within the parameters that users want.
Guiding LLM-based agents with a new approach
AgentSpec is not a new LLM but rather an approach to guiding LLM-based AI agents. The researchers believe AgentSpec can be used not only for agents in enterprise settings but also for self-driving applications.
The first AgentSpec tests were integrated into the LangChain framework, but the researchers said they designed it to be framework-agnostic, meaning it can also run on ecosystems such as AutoGen and Apollo.
Experiments using AgentSpec showed it prevented “over 90% of unsafe code executions, ensures full compliance in autonomous driving law-violation scenarios, eliminates hazardous actions in embodied agent tasks, and operates with millisecond-level overhead.” LLM-generated AgentSpec rules, created with OpenAI’s o1, also performed strongly, enforcing rules on 87% of risky code executions and preventing “law-breaking in 5 out of 8 scenarios.”
Current methods are a little lacking
AgentSpec is not the only method to help developers bring more control and reliability to agents. Some of these approaches include ToolEmu and GuardAgent. The startup Galileo launched Agentic Evaluations, a way to ensure agents work as intended.
The open-source platform H2O.ai uses predictive models to make the agents used by companies in the finance, healthcare, telecommunications and government sectors more accurate.
The AgentSpec researchers acknowledged that current risk-mitigation approaches like ToolEmu effectively identify risks. But they noted that “these methods lack interpretability and offer no mechanism for safety enforcement, making them susceptible to adversarial manipulation.”
Using AgentSpec
AgentSpec works as a runtime enforcement layer for agents. It intercepts the agent’s behavior as it executes tasks and applies safety rules set by humans or generated by prompts.
Since AgentSpec is a custom domain-specific language, users need to define the safety rules. A rule has three components: the trigger, which lays out when to activate the rule; the check, which adds the conditions to evaluate; and the enforce, which specifies the actions to take if the rule is violated.
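The paper defines its own DSL syntax, so the following is only a minimal Python sketch of that trigger-check-enforce structure; the rule, predicate and enforcement names here are hypothetical illustrations, not AgentSpec’s actual API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    trigger: str                     # when to activate: here, a tool name
    check: Callable[[dict], bool]    # predicate over the proposed action
    enforce: Callable[[dict], dict]  # what to do if the check fails

# Hypothetical predicate: block shell commands that delete files recursively.
def no_recursive_delete(action: dict) -> bool:
    return "rm -rf" not in action.get("input", "")

def block(action: dict) -> dict:
    return {"status": "blocked", "reason": "no_recursive_delete violated"}

RULES = [Rule(trigger="shell_tool", check=no_recursive_delete, enforce=block)]

def apply_rules(action: dict) -> dict:
    for rule in RULES:
        # Only rules whose trigger matches the action are evaluated.
        if action.get("tool") == rule.trigger and not rule.check(action):
            return rule.enforce(action)  # intervene before execution
    return action                        # compliant: let it through

print(apply_rules({"tool": "shell_tool", "input": "rm -rf /data"}))
# -> {'status': 'blocked', 'reason': 'no_recursive_delete violated'}
```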
AgentSpec is built on LangChain, though, as previously stated, the researchers said AgentSpec can also be integrated into other frameworks like AutoGen or the autonomous vehicle software stack Apollo.
These frameworks orchestrate the steps an agent needs to take: they take in the user input, create an execution plan, observe the results and then decide whether the action is complete; if not, they plan the next step. AgentSpec adds rule enforcement into this flow.
“Before an action is executed, AgentSpec evaluates predefined constraints to ensure compliance, modifying the agent’s behavior when necessary. Specifically, AgentSpec hooks into three key decision points: before an action is executed (AgentAction), after an action produces an observation (AgentStep), and when the agent completes its task (AgentFinish). These points provide a structured way to intervene without altering the core logic of the agent,” the paper states.
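In LangChain terms, those three hook points map naturally onto the framework’s callback events. Below is a minimal sketch using langchain_core’s BaseCallbackHandler, assuming a current LangChain install; note that a plain callback can only observe or halt a run, whereas AgentSpec’s enforcement layer can also modify the agent’s behavior.

```python
from langchain_core.agents import AgentAction, AgentFinish
from langchain_core.callbacks import BaseCallbackHandler

class RuleViolation(Exception):
    """Raised to halt the agent when a safety rule fails."""

class EnforcementHandler(BaseCallbackHandler):
    FORBIDDEN = ("rm -rf", "DROP TABLE")  # toy predicate for illustration

    def on_agent_action(self, action: AgentAction, **kwargs) -> None:
        # AgentAction hook: evaluate constraints before the tool call runs.
        if any(tok in str(action.tool_input) for tok in self.FORBIDDEN):
            raise RuleViolation(f"Blocked unsafe call to {action.tool}")

    def on_tool_end(self, output, **kwargs) -> None:
        # AgentStep hook: inspect the observation a tool call produced,
        # e.g. scan `output` for leaked credentials.
        pass

    def on_agent_finish(self, finish: AgentFinish, **kwargs) -> None:
        # AgentFinish hook: run a final compliance check on the outcome.
        pass

# Usage: attach the handler when invoking an agent executor, e.g.
# executor.invoke({"input": "..."}, config={"callbacks": [EnforcementHandler()]})
```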
More reliable agents
Approaches like AgentSpec underscore the need for reliable agents in enterprise use. As organizations begin to plan their agentic strategies, technical decision-makers are also looking at ways to ensure reliability.
For many, agents will eventually do tasks for users autonomously and proactively. The idea of ambient agents, where AI agents and apps continuously run in the background and trigger themselves to execute actions, would require agents that do not stray from their path and accidentally introduce unsafe actions.
If ambient agents are where agentic AI will go in the future, expect more methods like AgentSpec to proliferate as companies seek to make AI agents continuously reliable.