Scale Better with Plan-Code-Execute Agentic Architecture

I’ve spent the better part of a decade fixing broken WooCommerce checkouts and untangling legacy PHP spaghetti. But lately, my desk has been covered in a different kind of mess: Agentic Architecture. If you’ve been following the AI space, you know the standard playbook is to give an agent a “toolbox” of pre-defined functions. You give it a search tool, a calculator, and maybe a database connection. It works fine—until it doesn’t.

The problem is that pre-built toolboxes assume you know every problem your agent will face in advance. In the real world, especially when dealing with complex supply chains or enterprise data, that’s a fantasy. We need to stop thinking about agents as tool users and start designing them as tool creators.

The Fatal Flaw of Static Toolkits

Most architectures today rely on an LLM selecting from a hard-coded list of functions. This creates a bottleneck. If your client asks for a specific spatial analysis of their SKU distribution and you didn’t build a “SpatialAnalysisTool,” the agent is stuck. It either fails or, worse, tries to hallucinate a result.

I’ve seen this happen firsthand. I once built a RAG-based assistant for a logistics firm. We gave it access to their API, but the moment a user asked for a correlation between “production anomalies” and “sales z-scores,” the agent couldn’t handle the math because the API didn’t have a specific endpoint for that. The solution isn’t more APIs; it’s a more flexible Agentic Architecture.

As I discussed in my previous take on Building Robust WordPress AI, architecture is always more important than the individual API calls you make.

The Plan-Code-Execute Pattern

The move toward “Plan–Code–Execute” shifts the burden from the developer to the LLM. Instead of choosing a tool, the agent writes the tool it needs in Python, executes it in a sandbox, and reports the findings. Here is how a production-grade Agentic Architecture breaks down:

  • The Analyst: This is the “grounding” layer. It scans the raw data (CSVs, database schemas) to prevent hallucinations. It doesn’t guess file names; it confirms them.
  • The Planner: The brain. It decomposes a high-level request (e.g., “Why are sales down for SKU-X?”) into a JSON-based dependency graph.
  • The Coder: The hands. It takes the plan and writes standalone, executable Python scripts—handling imports and error catching.
  • The Executor: The sandbox. It runs the code, captures the stdout/stderr, and feeds errors back to the Coder for self-correction.
  • The Reporter: The voice. It reads the artifacts (charts, cleaned data) and explains them in plain English.
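To make that division of labor concrete, here's a toy sketch of the whole loop. The four stub classes are hypothetical stand-ins for LLM-backed agents; in production each method would be a model call, and the `exec()` would live inside a real sandbox rather than the host process:

```python
# Minimal sketch of the Plan-Code-Execute loop with stub agents.
# Every method here is a placeholder for an LLM call.

class Analyst:
    def discover(self, files):
        # Ground the run in what actually exists on disk.
        return {name: f"schema for {name}" for name in files}

class Planner:
    def plan(self, prompt, schema):
        # Decompose the request into typed steps (see the JSON format below).
        return [
            {"step_id": 1, "type": "CODE", "desc": "load and analyze data"},
            {"step_id": 2, "type": "TEXT", "desc": "summarize findings"},
        ]

class Coder:
    def write(self, step, schema):
        # Return a standalone, executable script for this step.
        return "print('analysis complete')"

class Reporter:
    def summarize(self, prompt, artifacts):
        return f"Findings for {prompt!r}: {artifacts}"

def run_pipeline(prompt, files):
    schema = Analyst().discover(files)            # 1. ground
    plan = Planner().plan(prompt, schema)         # 2. plan
    artifacts = []
    for step in plan:
        if step["type"] == "CODE":
            script = Coder().write(step, schema)  # 3. code
            exec(script, {})                      # 4. execute (sandbox in prod!)
            artifacts.append(step["desc"])
    return Reporter().summarize(prompt, artifacts)  # 5. report
```

The point of the sketch is the control flow, not the stubs: every step flows through the same five roles, and no pre-built tool ever appears.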

Implementing the Planner Logic

If you’re building this in Python, your Planner needs to be strict. You aren’t asking for a chat response; you are asking for a software specification. Here is a simplified look at how I structure these prompts to ensure the output is actually useful for the next agent in the chain.

import json

def create_plan(user_prompt, data_schema):
    system_prompt = f"""
    You are a Senior Architect. Break the user request into a JSON list of steps.
    Each step must be either "CODE" or "TEXT".

    DATASET AVAILABLE:
    {data_schema}

    Output format:
    [
        {{"step_id": 1, "name": "Load Data", "type": "CODE", "desc": "Load nodes.csv..."}},
        {{"step_id": 2, "name": "Report", "type": "TEXT", "desc": "Summarize findings..."}}
    ]
    """
    # Call your LLM of choice (Gemini, GPT-4o) with the system prompt
    # plus the user's request, then parse the JSON it returns.
    response = call_llm(system_prompt, user_prompt)  # your model wrapper
    return json.loads(response)
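One thing that snippet glosses over: the Planner's raw response still needs to be parsed and validated before the Coder ever sees it. A malformed plan fails loudly here instead of three agents downstream. A defensive sketch (the markdown-fence stripping is an assumption about how your model formats its output):

```python
import json

# Keys every plan step must carry for the Coder to consume it.
REQUIRED_KEYS = {"step_id", "name", "type", "desc"}

def parse_plan(raw_response: str) -> list:
    """Parse the Planner's JSON and reject anything the Coder can't use."""
    # Models often wrap JSON in markdown fences; strip them defensively.
    cleaned = raw_response.strip().removeprefix("```json").removesuffix("```").strip()
    plan = json.loads(cleaned)
    for step in plan:
        missing = REQUIRED_KEYS - step.keys()
        if missing:
            raise ValueError(f"Step {step.get('step_id')} missing keys: {missing}")
        if step["type"] not in ("CODE", "TEXT"):
            raise ValueError(f"Unknown step type: {step['type']}")
    return plan
```

Treat this the way you'd treat user input in a WordPress plugin: never trust it until it's been sanitized.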

In a real-world scenario, like the SKU health analysis explored in this deep dive, this architecture identified production anomalies that were directly correlated with a 29% drop in sales—all without a single pre-built “Analysis Tool.”
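The Executor's run-and-self-correct loop from the breakdown above is the part people underestimate, so here's a minimal sketch. `fix_fn` is a hypothetical hook where your Coder agent would repair the script from the captured traceback; in my setups it's another LLM call:

```python
import subprocess
import sys
import tempfile

def execute_script(script: str, fix_fn, max_retries: int = 2) -> str:
    """Run a generated script in a subprocess; on failure, hand the
    traceback to fix_fn (e.g. a Coder-agent call) and retry."""
    for _ in range(max_retries + 1):
        # Write the candidate script to a throwaway file.
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(script)
            path = f.name
        result = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=60
        )
        if result.returncode == 0:
            return result.stdout
        # Self-correction: feed stderr back and try the repaired script.
        script = fix_fn(script, result.stderr)
    raise RuntimeError(f"Script failed after {max_retries + 1} attempts")
```

A subprocess is the bare minimum isolation here; for anything client-facing you'd want a container or a locked-down interpreter, not just a temp file on the host.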

War Story: When the Coder Hallucinates

Early in my testing of this pattern, the Coder agent started referencing a sales_data.csv that didn’t exist. The Planner had assumed the file naming convention, and the Coder just followed suit. The script crashed immediately.

This is why the Analyst Agent is mandatory. By adding a step where the agent first “discovers” the environment and creates a dynamic schema, you ground the entire Agentic Architecture in reality. It’s the difference between a junior dev guessing at a database table name and a senior dev actually checking the wp_options table before writing a query.
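Here's roughly what that discovery step looks like: scan the actual data directory, record the real file names and real headers, and hand that to the Planner so it never has to guess. A minimal sketch, assuming your inputs are flat CSVs:

```python
import csv
from pathlib import Path

def discover_schema(data_dir: str) -> dict:
    """Scan a directory for CSVs and record real file names and headers,
    so downstream agents can't invent a sales_data.csv that isn't there."""
    schema = {}
    for path in sorted(Path(data_dir).glob("*.csv")):
        with open(path, newline="") as f:
            # The first row is assumed to be the header.
            header = next(csv.reader(f), [])
        schema[path.name] = header
    return schema
```

Dump the returned dict straight into the Planner's `data_schema` slot and the "assumed file name" class of crash disappears.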

For more on how to automate these workflows safely, check out my guide on Pragmatic AI Workflow Automation.

Pragmatic Takeaway

The future of agentic systems isn’t larger tool catalogs. It’s agents that can decide what needs to exist in the first place. When you treat code as a disposable artifact, your system becomes infinitely flexible. You stop maintaining a massive library of brittle tools and start maintaining a single, robust code-generation pipeline.

Look, if this Agentic Architecture stuff is eating up your dev hours, let me handle it. I’ve been wrestling with WordPress since the 4.x days.

Ship Better Code

If you are building complex data integrations, stop hard-coding your functions. Implement a Plan-Code-Execute loop. It’s harder to debug initially, but it’s the only way to scale without your codebase becoming a graveyard of “one-off” tools. If you’re using LLMs like Gemini 2.0, the reasoning capabilities are finally good enough to make this reliable.

Ahmad Wael
I'm a WordPress and WooCommerce developer with 15+ years of experience building custom e-commerce solutions and plugins. I specialize in PHP development, following WordPress coding standards to deliver clean, maintainable code. Currently, I'm exploring AI and e-commerce by building multi-agent systems and SaaS products that integrate technologies like Google Gemini API with WordPress platforms, approaching every project with a commitment to performance, security, and exceptional user experience.
