[AI Agent News] Jan 11 - Jan 17 2025
A collection of news and announcements from the AI Agent world over the past week (with opining thrown in as an extra bonus): Microsoft, OpenAI, Anthropic, Deloitte.
Microsoft AutoGen has a new version, primarily focussed on making it easier to build, monitor and improve applications. The underlying concepts remain largely the same, but the framework is moving from being a way to experiment with ideas to something that directly addresses the needs of implementation at scale, including a low-code UI.
Anthropic reported it achieved ISO 42001 certification for responsible AI. “What does that mean?”, I hear you wondering. From the ISO website itself:
The ISO/IEC 42001 standard offers organizations the comprehensive guidance they need to use AI responsibly and effectively, even as the technology is rapidly evolving. Designed to cover the various aspects of artificial intelligence and the different applications an organization may be running, it provides an integrated approach to managing AI projects, from risk assessment to effective treatment of these risks.
“What does that still mean?”, I still hear you say. Anthropic is trying to distinguish itself from its competitors by spending more time ensuring that appropriate governance is in place for its systems. The hard truth, however, is that while a better level of governance is probably a good thing, it is not a signal of any underlying change in the fundamental challenges these technologies pose. At least Anthropic is willing to do some of the extra work though. Having gone through ISO processes myself, I can say that none of this is exciting or fulfilling, but it can uncover gaps and it does force an organisation to think through at least some issues.
OpenAI has a beta release of scheduled Tasks in ChatGPT. This is definitely a very early beta and quite underwhelming. A few are hailing it as the start of the “agentic” era at OpenAI, and it certainly is a small first step. To me it feels more like someone decided they had to test the basic “hello world” implementation of task capability at scale, even if it would inevitably get some bad press, before moving on to something more inspiring.
In the interesting reading department we have this paper from Microsoft on “Lessons from Red Teaming 100 Generative AI Products”. It does not disappoint. What does disappoint is this report from Deloitte on agents and multi-agent systems. Aside from introducing intractable terminology such as making “cognitive leaps” and “reinventing processes”, it then presents the most classic IT support triage scenario as a potential implementation of a multi-agent system! Look, I am all in on the benefits of these technologies, but the fact that we seem incapable of tempering expectations in actual big complex organisations is going to hinder sensible adoption. Reinventing processes and making cognitive leaps are not going to come from the current wave of multi-agent technologies. They can come if we use these technologies to free people up to think longer-term so they can actually do some reinventing. OK, rant over.
That wraps up another week in the agent world. Don’t forget to share and subscribe.