Rogue Agent
The tasks handed to an AI agent may be answering e-mails, responding to text messages, updating your contact list, sending out invitations, or any similarly tiresome chore that interrupts checking your latest social media post. Such agents are, and will continue to become, embedded in our daily lives (it’s called ‘workflow’ outside the home), given the human penchant for doing only the fun stuff whenever possible, and nowhere is this more apparent than in software development.
AI agents are a godsend to developers, who spend mind-numbing hours staring at lines of code, hoping to find the one small error that is causing the application to do exactly what it is not supposed to do. Just putting together a plan (workflow again) for how the code logic should work can take days or weeks before actual coding starts, so any chance to offload some of the boring stuff to an agent is a blessing to most.
Agents are built on platforms such as AutoGen (MSFT), LangChain (pvt), Botpress (pvt), or CrewAI (pvt), where you create the agent and tell it where to find the items it needs, what you want accomplished, and what to do with the results. Some of these platforms are drag-and-drop while others require real coding experience, but once the agent is complete, you send it out into the ether to do whatever task you don’t want to do yourself.
Here is an example of a common use for a coding agent:
In this case the developer is trying to create a feature in a video game that allows users to take items and ingredients that they have discovered in the game and combine them into new and unique items. Sounds simple, right? Even in its simplest form this would entail the following sub-projects:
- Define combinations – Create data tables that define which ingredients can be combined with others, how much of each item is required, and how much of the new item is created.
- Create interfaces – Design screens that show players ‘crafting’ recipes and the necessary ingredients.
- Inventory management – Maintaining ingredient inventory lists that are updated each time a new item is created. This includes adding and subtracting items as the game progresses.
- Sound and effects tables – Look-up tables for sounds and animations for every existing and potential item.
- Error handling – The potential outcomes when a combination is tried that is not allowed.
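To give a sense of how much is hiding in that list, here is a minimal sketch in Python of just three of the pieces: the combination table, the inventory update, and the error handling. Everything in it is an assumption made for illustration; the item names, the `Recipe` class, and the `craft` function are hypothetical and do not come from any real game or agent platform.

```python
from dataclasses import dataclass


@dataclass
class Recipe:
    ingredients: dict[str, int]  # ingredient name -> quantity required
    output: str                  # name of the item produced
    quantity: int                # how many units of the output are created


# Hypothetical combination table; the item names are invented.
RECIPES = [
    Recipe({"herb": 2, "water": 1}, "healing_potion", 1),
    Recipe({"iron_ore": 3, "coal": 1}, "iron_ingot", 2),
]


class CraftingError(Exception):
    """Raised when a combination is not allowed or cannot be afforded."""


def craft(inventory: dict[str, int], chosen: set[str]) -> str:
    """Combine the chosen ingredients, updating the inventory in place."""
    recipe = next((r for r in RECIPES if set(r.ingredients) == chosen), None)
    if recipe is None:
        raise CraftingError(f"No recipe combines {sorted(chosen)}")
    # Check every requirement before touching the inventory, so a failed
    # attempt never leaves it half-updated.
    for name, needed in recipe.ingredients.items():
        if inventory.get(name, 0) < needed:
            raise CraftingError(f"Not enough {name}: need {needed}")
    for name, needed in recipe.ingredients.items():
        inventory[name] -= needed
    inventory[recipe.output] = inventory.get(recipe.output, 0) + recipe.quantity
    return recipe.output
```

Calling `craft({"herb": 2, "water": 1}, {"herb", "water"})` returns `"healing_potion"` and uses up the herbs and water, while asking for herbs and coal together raises `CraftingError`. Screens, sounds, and animations are not even touched here, which is part of why handing the whole job to an agent is so tempting.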
But agents are also like Sidney. Sidney is the grandson of the company founder’s brother, who graduated from University of Ohio with a BS in Chemistry last month, but really ‘likes computers’. Sidney is a good kid but has a few social issues, particularly a finger-twitching movement that is both distracting and annoying. When Sidney was introduced around the office the sighs were almost audible, but MaryAnn from HR came up with the idea that they could start Sidney off slowly, giving him some manual labor tasks to keep him out of the way.
“Clean up Storeroom A. There are plastic bags in the storage closet for anything that needs to be thrown away. Don’t leave anything in the halls,” MaryAnn told him, figuring he could break down the old empty file boxes and sweep up paperclips and file folders without too much additional instruction.
As it turned out, MaryAnn from HR was wrong. Sidney needed more instruction, as he took to cleaning in a big way. Not only did he sweep up the paperclips and file folders and break down the old empty file boxes, he also tossed two or three hundred files that were marked “OLD” along with the rest. It turned out that those files were part of an IRS investigation that the boss was handling for a client, and while they were recovered before the shredder arrived, Sidney is now trying to find himself in a hostel in Stockholm.
What does all of this have to do with AI agents? Consider Jason Lemkin, the founder of SaaStr (pvt), a VC that invests in IT, cloud, and e-commerce companies.
Mr. Lemkin gave the Replit (pvt) agent ‘Vibe’ the task of making minor code adjustments to a project he was working on, with the explicit instruction to “freeze” any changes pending his approval. However, it seems the agent acted on its own and erased a key database. To make matters worse, the agent, whom we will call Sidney, created fake data to cover up the error. Sidney (aka Vibe) had been promoted by the platform as a ‘revolutionary’ tool that can interpret a developer’s natural language prompts and ‘vibe’ with the developer’s intentions, although in this case it disregarded safeguards and guardrails and ignored rollback features, making the damage worse. Mr. Lemkin, on debugging the results, realized that not only were there inconsistencies in the data, but the agent had created entries that made the data look more normal, going as far as sending the user misleading messages to cover up the issues.
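For the technically inclined, a “freeze” that lives only in a prompt is a request, not a lock. One way to picture the difference is a gate that sits between the agent and the real system and refuses unapproved changes no matter what the agent decides. The sketch below is purely hypothetical: `ChangeGate`, `FreezeViolation`, and the change IDs are invented for illustration and say nothing about how the platform in this story is actually built.

```python
class FreezeViolation(Exception):
    """Raised when a change is attempted during a freeze without approval."""


class ChangeGate:
    """Hypothetical gate that sits between an agent and the real system."""

    def __init__(self) -> None:
        self.frozen = False
        self.approved: set[str] = set()  # change IDs a human has signed off on
        self.log: list[str] = []         # audit trail of every attempt

    def approve(self, change_id: str) -> None:
        """Record a human sign-off for one specific change."""
        self.approved.add(change_id)

    def apply(self, change_id: str, action):
        """Run the action only if no freeze is active or it was approved."""
        if self.frozen and change_id not in self.approved:
            self.log.append(f"BLOCKED {change_id}")
            raise FreezeViolation(f"{change_id} needs human approval")
        self.log.append(f"APPLIED {change_id}")
        return action()
```

With `gate.frozen = True`, a call such as `gate.apply("drop-users", db.clear)` raises `FreezeViolation` and leaves `db` untouched, and the attempt still lands in `gate.log`, where the agent cannot quietly tidy it away.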
The lesson learned here is a simple one. Even the most trusted agents can go rogue. AI architectures are complex, and industry-wide admissions that developers don’t always know how their own AI software works point to the risks of using AI agents to reduce the drudgery of day-to-day tasks. We are certainly not saying that one should not use agents, but it is human nature to give in and remove the “make sure I review all e-mails before sending” guardrail after reading a few hundred AI-generated e-mail responses, and some might even say that the sign of a good agent is one that can cover its tracks when it makes a mistake. If you are longing for the day when you can show up at the office and know that your agents have cleared your day up to lunch at the local sports bar, just remember that Sidney could strike at any time.