Home » Article » A Test-Driven Approach to Building Better Agents

< Back to all posts

A Test-Driven Approach to Building Better Agents

By Manjeet Singh | March 17, 2025

When you first create an agent—like a service agent for support queries—it can be tempting to deploy it right away. However, by following a test-driven development (TDD) approach, you write tests first, set clear expectations, and then build your agent to pass these tests. This process, combined with creating a robust test dataset and running evals, ensures your agent handles real-world scenarios reliably, making life easier for Salesforce Admins.

Benefits of test-driven agent development

Accuracy and quality assurance: Write tests first to define what success looks like, then build the agent to meet these criteria.
Consistent updates: Validate each new feature with tests (evals) before it goes live, ensuring that every update maintains quality.
Maintainability: A well-defined test dataset and clear eval criteria make it easier to add or update features without unexpected errors.

What does this mean for admins?

Salesforce Admins often rely on quick fixes and minimal manual testing. However, using the Agentforce Testing Center with a TDD approach helps you catch issues early and iterate quickly without breaking existing features. This means smoother operations and fewer disruptions in your daily workflows.

A 4-step approach to building a better test dataset and evals

Before you begin testing, create a baseline version of your agent—a “first draft.” For example, a service agent that handles shipping questions and returns, or escalates complex cases to a human support team.

Step 1: Understand Agentforce Testing Center

Agentforce Testing Center is your hub for creating and running tests. It supports group and batch testing, as well as artificial intelligence (AI)-assisted test case generation, and provides both prebuilt and custom evals.

It links directly to your agent configuration and gives you real-time feedback (Pass/Fail reports) to know exactly where improvements are needed.

Step 2: Generate a high-quality, diverse test dataset

What is a test dataset? It’s a collection of simulated interactions that your agent is expected to handle, including common queries and edge cases.

A robust test dataset is essential for evaluating your agent’s performance in realistic conditions.

How to build it:

Identify key topics (for example, Case Creation, Order Tracking).
Identify what data sources Agents can use through Agenforce Data Library (Knowledge articles, files, web search)
List realistic scenarios (for example, multiple orders, invalid data)
Think about different user persona and see which scenarios are applicable to them
Guardrails and edge cases: Test cases for negative testing scenarios.
Use the AI-assisted tool in Agentforce Testing Center to generate draft test cases, then manually add cases to cover any gaps.
Example: For a service query like, “Where’s my order?”, ensure you include variations such as multiple orders or typos.
Think about scenarios where you have ground truth available or not available.

Build test data using AI:

The good news is that Agentforce Testing Center provides an AI-assisted way to create your draft dataset that you can review and refine. Here are the steps.

If you’re in Agent Builder, click Batch Test. If you’re in the Agentforce Testing Center, click New Test.
Select Generate Test Cases – provide a name and use “describe the test cases and provide examples” to create a diverse set of test cases.
Enter number of test cases you want to create.
Select topics that you want to test and click Generate Test Cases. This will take few minutes and will auto create a diverse set of test cases using AI/LLM.

Add new test cases in the dataset to cover all important topics and scenarios

With your test dataset ready, the next step is to add new test cases that are not covered by automated test case generation. In TDD, you actually start with a failing test—this tells you exactly what to fix. Then you improve your agent until the test passes. You should also think about adding context variables or conversation history as state injection to simulate different scenarios.

Step 3: Define evals and run tests

Evals (evaluations) are automated checks that compare your agent’s response against the expected outcome.

In Agentforce Testing Center:

Import your test cases or generate a new test case.
Select the eval (outcome evals, coherence, faithfulness, instruction following).
Click Save & Run to execute the entire suite.
Review the Pass/Fail Report: Check the agent conversation logs to see how it responded to each test.

Pass/Fail Analysis

Pass: The agent responded exactly as expected—no updates needed.
Fail: The agent gave a wrong or incomplete response.
- Adjust your agent instruction, topics, or Knowledge base references.
- Re-run the test until it passes.

Step 4: Iterate and optimize

Optimize your agent’s instructions, topics, and actions based on test results. For example, if a test for “I want to return my order” fails because the agent doesn’t request a valid order number, update the flow to ask, “Could you share the order number you want to return?”
Re-run tests after making changes to confirm improvements.
Establish a routine for regular test reviews to ensure ongoing reliability and performance.
Continuous improvement: Regularly re-run tests and update your dataset as your agent evolves, ensuring long-term reliability.

Build better agents today

By following a test-driven approach—writing tests first, generating a comprehensive test dataset, and using evals to measure success—you build a robust Agentforce Agent. This process not only enhances accuracy and efficiency but also aligns with the everyday needs of Salesforce Admins, ensuring smoother operations and better customer interactions.

Thank you to Senior Director of Product Management Deepak Mukunthu for collaborating on this article.

Resources

Salesforce Admins Blog: Ensuring AI Accuracy: 5 Steps To Test Agentforce
Salesforce site: Customer Success with Agentforce
Salesforce site: Agentforce ROI Calculator
Salesforce site: Agentforce Testing and Deployment
Trailhead: Agentblazer Community Group
Trailhead: Datablazer Community Group

Share this story!

Employee Agents: Your User Support Advantage

Employee Agents and Foundations: The Admin Advantage for User Support

By Carl Gayle | June 23, 2025

If you’re a Salesforce Admin looking for ways to scale support, reduce repetitive work, and make Salesforce easier for your users, employee agents are for you. These built-in artificial intelligence (AI) assistants are now available with Salesforce Foundations (Enterprise Edition or above), and you can get started without extra cost. They’re designed to help your […]

READ MORE

Demo to Deployment: Engaging Stakeholders With Agentforce

By Kate Lessard | June 18, 2025

As Salesforce Admins, we’ve dived headfirst into Agentforce. We’re building agents in Trailhead, attending workshops like Agentforce NOW, doing hands-on sessions with our Community Groups and at conferences like TDX, and coming up with ideas of how to put agents to work for our individual business need, even bringing that to fruition through events like […]

READ MORE

Introduction to Agentforce for Salesforce Admins

By Kate Lessard | November 12, 2024

What is Agentforce? We are living in the artificial intelligence (AI) era, currently in the third wave of the AI revolution focused on contextual and generative AI and characterized by prompt-based generative AI, real-time AI applications, and autonomous agents. Agentforce is the suite of both assistive and autonomous agents built on the Salesforce platform. Agents […]

READ MORE

A Test-Driven Approach to Building Better Agents

Benefits of test-driven agent development

What does this mean for admins?

A 4-step approach to building a better test dataset and evals

Step 1: Understand Agentforce Testing Center

Step 2: Generate a high-quality, diverse test dataset

Add new test cases in the dataset to cover all important topics and scenarios

Step 3: Define evals and run tests

Pass/Fail Analysis

Step 4: Iterate and optimize

Build better agents today

Resources

Manjeet Singh

Related Posts

Employee Agents and Foundations: The Admin Advantage for User Support

Demo to Deployment: Engaging Stakeholders With Agentforce

Introduction to Agentforce for Salesforce Admins