Field note

The Salesforce Data-Readiness Checklist

Akshit Kandi
#Salesforce #Data Cloud #Agentforce #data readiness #AI agents

SkySync

Most Salesforce data-readiness checklists grade your hygiene. This one grades whether an agent can safely act on your data without a human in the loop, which is a different and harder bar.


Here is the part the readiness slides skip: a dashboard tolerates bad data, and an agent does not. A stale phone number on a contact record makes a report slightly wrong, and a human skims past it. The same stale number makes an agent dial a dead line, log a failed touch, flip the lead to 'unreachable,' and suppress it from the next nurture wave. One wrong field, four downstream actions, no one watching. An agent treats every field as a fact it is allowed to act on.

So the usual data-readiness checklist, the one about deduplication, required fields, and naming conventions, is necessary, but it grades for the wrong test. It asks whether your data is tidy. The real question is whether your data is safe to act on without a human in the loop. Those are not the same standard, and the gap between them is where most Agentforce pilots quietly stall: not in a dramatic failure, but in a slow loss of trust as the agent does plausible-looking wrong things.

This checklist grades for the harder bar. It is organized around what an agent actually does in sequence: resolve the right record, trust the fields it reads, stay inside what it is allowed to do, and leave a trail you can reconstruct. Run a candidate use case through it before you scope the agent, not after it misbehaves in front of a customer.

1. Can the agent find the one true record?

An agent does not browse. It resolves. When a customer emails in, the agent has to map that inbound signal to a single correct entity before it can do anything useful. If that same person lives in your org as three Leads, a Contact, and a Data Cloud profile, the agent will pick one. You do not control which, and it may pick a different one next time, so the behavior is not even consistently wrong. This is why identity resolution sits at the front of Data Cloud rather than somewhere in the middle. Before you let an agent act, confirm:

  • You match on a stable key (email, normalized phone, a durable external ID), not a hopeful fuzzy match on name. Fuzzy matching is fine for a marketer reviewing a list and dangerous for a system that auto-acts on the result.
  • Your Data Cloud unification rules actually reconcile your real source objects, and you have spot-checked the match decisions on hard cases (shared household emails, role accounts like info@, recycled phone numbers), not just the clean demo data.
  • Duplicate Leads and Contacts have a deterministic survivorship rule, so 'which record wins' is a decision you made and can explain, not a coin flip the platform makes at runtime.
  • Person-level and account-level identity are kept distinct. An agent that acts at the wrong hierarchy level, updating a parent account when it meant a single contact, is a common and expensive mistake to unwind.

If you cannot answer 'which record is the source of truth for this customer' in one sentence, your agent cannot answer it either.
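As a minimal sketch of the first and third checks, the snippet below matches on a normalized email and applies a fixed survivorship order. It is plain Python for illustration, not Data Cloud configuration, and the `Record` shape, the `KIND_RANK` precedence, and the sample IDs are all assumptions made up for this example.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    record_id: str
    kind: str  # "Contact" or "Lead" (hypothetical record shape)
    email: str
    last_verified: datetime


# Hypothetical precedence: a Contact outranks a Lead. Unknown kinds rank last.
KIND_RANK = {"Contact": 0, "Lead": 1}


def normalize_email(raw: str) -> str:
    """Trim and lowercase so the same address always yields the same key."""
    return raw.strip().lower()


def normalize_phone(raw: str) -> str:
    """Keep digits only. Real normalization (country codes) needs more."""
    return re.sub(r"\D", "", raw)


def resolve(records: List[Record], inbound_email: str) -> Optional[Record]:
    """Return the surviving record for an inbound email, or None."""
    key = normalize_email(inbound_email)
    matches = [r for r in records if normalize_email(r.email) == key]
    if not matches:
        return None
    # Deterministic survivorship: kind rank, then most recently verified,
    # then record_id as a final tie-breaker so input order never matters.
    return min(
        matches,
        key=lambda r: (
            KIND_RANK.get(r.kind, len(KIND_RANK)),
            -r.last_verified.timestamp(),
            r.record_id,
        ),
    )


utc = timezone.utc
records = [
    Record("L-1", "Lead", "Ana@Example.com ", datetime(2024, 5, 1, tzinfo=utc)),
    Record("C-2", "Contact", "ana@example.com", datetime(2024, 1, 1, tzinfo=utc)),
    Record("C-1", "Contact", "ana@example.com", datetime(2024, 1, 1, tzinfo=utc)),
]
winner = resolve(records, " ANA@example.com")
```

The point of the explicit sort key is that 'which record wins' is written down and gives the same answer regardless of the order in which records arrive.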

2. Is the field the agent reads actually true?

Completeness is the metric everyone reports because it is trivial to compute. It is also close to useless on its own. A field that is 98% populated can still be 40% wrong, and the agent has no way to separate a confident value from a garbage one, because to the agent they are both just a value. Picklists drift. Free-text fields rot. Your team may have silently stopped trusting a 'Status' field two years ago, but that history is invisible to the agent, which will read the field literally and route off it. So for every field an agent will read or write, stop asking 'is it filled in' and start asking 'is it currently true and kept current by a real process.'

  • The field has a named owner: a system, integration, or role that keeps it current. A field nobody owns is a field nobody can vouch for, and the agent inherits that uncertainty silently.
  • Recency beats coverage. A freshness signal you can read at runtime (last-modified, last-verified timestamp) is worth more than a raw fill rate for any field that drives an action, because the agent can gate on it.
  • Picklist values map to a controlled set, not seventeen spellings of the same stage. The agent reasons over the literal string, not your team's shared understanding that 'Closed' and 'Closed - lost' mean the same thing.
  • You know which fields are decorative. Half-abandoned custom fields are landmines; document them so the agent's prompts, flows, and retrieval steps route around them instead of treating them as signal.
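The freshness and picklist checks above can be sketched as two small gates that run before the agent acts on a value. This is an illustrative sketch only: the stage names, the alias map, and the 30-day window are hypothetical, and the 'last verified' timestamp is assumed to be stored alongside the field.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical controlled set and alias map for a stage picklist.
CANONICAL_STAGES = {"Open", "Qualified", "Closed Won", "Closed Lost"}
STAGE_ALIASES = {
    "closed": "Closed Lost",
    "closed - lost": "Closed Lost",
    "closed lost": "Closed Lost",
}


def canonical_stage(raw):
    """Map a raw picklist string to the controlled set, or None if unknown."""
    cleaned = " ".join(raw.split())  # collapse stray whitespace
    if cleaned in CANONICAL_STAGES:
        return cleaned
    return STAGE_ALIASES.get(cleaned.lower())


def is_fresh(last_verified, max_age, now):
    """True only when a verification timestamp exists and is recent enough."""
    return last_verified is not None and now - last_verified <= max_age


def usable_value(value, last_verified, max_age, now):
    """Return the value if the agent may act on it, else None (escalate)."""
    if value in (None, ""):
        return None
    return value if is_fresh(last_verified, max_age, now) else None


now = datetime(2025, 6, 1, tzinfo=timezone.utc)
window = timedelta(days=30)
recent = usable_value(
    "+15550102000", datetime(2025, 5, 20, tzinfo=timezone.utc), window, now
)
stale = usable_value(
    "+15550102000", datetime(2023, 1, 1, tzinfo=timezone.utc), window, now
)
```

Returning `None` instead of a best guess is deliberate: an unknown spelling or a stale value becomes a reason to hand off to a human rather than a fact to act on.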

3. Does the agent know what it is allowed to do?

This is the section that separates a build posture from a build-and-run one. An agent runs as an identity, and that identity carries a permission set, a sharing model, and a defined set of invocable actions. Most teams discover that their sharing rules were designed for humans clicking through records, never for a non-human actor that reads thousands of records a minute, never gets tired, and never hesitates before writing. Onboard the agent the way you would onboard a new hire into a regulated role, where access is granted on purpose and written down. Before launch:

  • The agent's running user has least-privilege access: exactly what the job needs and nothing else. Field-level security is the real guardrail here, not a sentence in the prompt asking it to be careful.
  • Write actions are explicit and bounded. 'Update opportunity stage, validated against the allowed transitions' is an action. 'Modify any field' is not an action, it is an incident waiting for a trigger.
  • Sensitive fields (PII, financials, anything regulated) are deliberately in or out of scope, and that call is documented, not implied by whatever the permission set happened to allow on the day it was cloned from someone else's profile.
  • Guardrails live at the data and platform layer, not only in the prompt. A prompt is a request the model can be talked out of. A permission is a wall. Build the wall, then write the prompt.
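To make 'explicit and bounded' concrete, here is a minimal sketch of a write action that can only change one field and only along allowed transitions. The transition table, the `ActionDenied` exception, and the record ID are hypothetical. A check like this complements platform permissions and field-level security; it does not replace them.

```python
# Hypothetical transition table; a real one comes from your sales process.
ALLOWED_TRANSITIONS = {
    "Prospecting": {"Qualification", "Closed Lost"},
    "Qualification": {"Proposal", "Closed Lost"},
    "Proposal": {"Closed Won", "Closed Lost"},
}

# The only field this action is allowed to write.
WRITABLE_FIELDS = frozenset({"StageName"})


class ActionDenied(Exception):
    """Raised when a requested write falls outside the action's bounds."""


def check_writable(changes: dict) -> None:
    """Reject any change set that touches fields outside the allowlist."""
    extra = set(changes) - WRITABLE_FIELDS
    if extra:
        raise ActionDenied(f"fields outside the action's scope: {sorted(extra)}")


def update_opportunity_stage(opportunity: dict, new_stage: str) -> dict:
    """Return an updated copy, or raise if the transition is not allowed."""
    changes = {"StageName": new_stage}
    check_writable(changes)
    current = opportunity["StageName"]
    if new_stage not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ActionDenied(f"{current!r} -> {new_stage!r} is not allowed")
    return {**opportunity, **changes}


opp = {"Id": "006-hypothetical", "StageName": "Qualification"}
updated = update_opportunity_stage(opp, "Proposal")

try:
    update_opportunity_stage(opp, "Closed Won")  # skips the Proposal stage
    skipped_stage_denied = False
except ActionDenied:
    skipped_stage_denied = True
```

Stages with no entry in the table, such as a closed stage, have no outgoing transitions, so the action refuses to reopen them by default.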

4. When the agent acts, can you reconstruct why?

An agent that does the right thing for a reason you cannot reconstruct is not production-ready, it is lucky so far. The day a regulator, a customer, or your own CFO asks 'why did the system tell this lead they qualified,' you need the answer in minutes from a log, not a forensic project across five systems. Auditability is not paperwork; it is the difference between a contained incident and a headline.

  • Every agent action writes a record: what it read, what it decided, what it changed, and under which version of its instructions, since the prompt and the data both move over time.
  • You can replay a decision against the data that existed at that moment, not the data as it looks today after fifty more updates have buried the evidence.
  • There is a clear, fast path for a human to override or reverse an action, and the override is itself logged so your audit trail does not go dark exactly when something went wrong.
  • Outcome data flows back. You can connect a specific agent action to whether it actually moved the number it was meant to move, at the grain of the individual decision.

That last point is where readiness meets accountability. If you cannot trace an action to an outcome, you cannot tell whether the agent is earning its keep or just generating motion, and you certainly cannot tie anyone's fee to a result you have no way to measure. We tie ours to the outcome, which is exactly why we are this strict about it upstream.
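One way to picture the audit record and the replay requirement from the list above is the sketch below. It assumes an in-memory list as the log and a hypothetical `decide` function with a score threshold of 70; a real system would write to durable storage and version its instructions however your platform allows.

```python
import copy
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    action: str
    record_id: str
    instructions_version: str  # which version of the prompt/instructions ran
    inputs_snapshot: dict      # what the agent read, copied at decision time
    decision: str
    changes: dict              # what the agent wrote
    at: str


def record_action(log, *, action, record_id, instructions_version,
                  inputs, decision, changes, now):
    """Append one entry per agent action and return it."""
    entry = AuditEntry(
        action=action,
        record_id=record_id,
        instructions_version=instructions_version,
        inputs_snapshot=copy.deepcopy(inputs),  # later edits cannot alter it
        decision=decision,
        changes=dict(changes),
        at=now.isoformat(),
    )
    log.append(entry)
    return entry


def replay(entry, decide):
    """Re-run a decision on the data as it was, not as it looks today."""
    return decide(entry.inputs_snapshot)


def decide(inputs):
    # Hypothetical qualification rule, for illustration only.
    return "qualified" if inputs["score"] >= 70 else "not qualified"


log = []
lead = {"Id": "00Q-hypothetical", "score": 82}
entry = record_action(
    log,
    action="qualify_lead",
    record_id=lead["Id"],
    instructions_version="v3",
    inputs=lead,
    decision=decide(lead),
    changes={"Status": "Qualified"},
    now=datetime(2025, 6, 1, tzinfo=timezone.utc),
)

lead["score"] = 40                 # the live record moves on
replayed = replay(entry, decide)   # the snapshot does not
```

Because the snapshot is a deep copy, the replay reproduces the original decision even after the live record has changed, which is what lets you answer 'why did the system do that' from the log alone.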

5. Will it still be ready next quarter?

Data readiness is not a milestone you pass once. The same org that scored clean in March drifts by June: a new integration starts writing dates in a different format, a sales team invents a spreadsheet workaround that bypasses the field the agent depends on, an admin adds a picklist value and tells no one. Readiness is a property you maintain, the way you maintain uptime, not a certificate you hang on the wall.

The teams that keep agents healthy in production do three unglamorous things. They monitor the freshness and shape of the specific fields their agents depend on, so drift shows up on a graph instead of in a customer complaint. They alert when a source system changes its behavior, before the bad data reaches the agent. And they treat the agent's data dependencies as a contract that any upstream change has to respect, the same way you would not let a team silently change an API every other service calls. This is the Care half of the work, and it decides whether an agent quietly degrades or quietly compounds.
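Treating the agent's data dependencies as a contract can be sketched as a small check that runs over a sample of records and reports the share that break each rule. The field names, the ISO date pattern, the allowed lead sources, and the 10% alert threshold are all assumptions for illustration.

```python
import re
from collections import Counter

# Hypothetical contract for the fields one agent depends on.
CONTRACT = {
    "CreatedOn": {"pattern": r"\d{4}-\d{2}-\d{2}"},              # ISO date only
    "LeadSource": {"allowed": {"Web", "Referral", "Partner"}},  # controlled set
}


def violations(record, contract):
    """Return the names of contract fields this record violates."""
    bad = []
    for name, rule in contract.items():
        value = record.get(name)
        if value is None:
            bad.append(name)
        elif "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            bad.append(name)
        elif "allowed" in rule and value not in rule["allowed"]:
            bad.append(name)
    return bad


def drift_report(records, contract):
    """Share of records violating each contract field (0.0 to 1.0)."""
    if not records:
        return {}
    counts = Counter(name for r in records for name in violations(r, contract))
    return {name: counts[name] / len(records) for name in contract}


sample = [
    {"CreatedOn": "2025-03-01", "LeadSource": "Web"},
    {"CreatedOn": "2025-03-02", "LeadSource": "Referral"},
    {"CreatedOn": "03/04/2025", "LeadSource": "Web"},            # new date format
    {"CreatedOn": "2025-03-05", "LeadSource": "Webinar (new)"},  # unannounced value
]
report = drift_report(sample, CONTRACT)
alerts = sorted(name for name, rate in report.items() if rate > 0.10)
```

Run on a schedule against recent records, a report like this turns a silent format change or an unannounced picklist value into a number that crosses a threshold before the agent acts on it.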

The shortcut for a first agent

Boiling the ocean is the wrong move, and you do not need a fully governed org to start. You need one narrow, high-value slice that is genuinely clean end to end. In our Green Subsidy solar engagement the agent's job was speed-to-lead, so it only ever depended on a small set of fields being correct, fresh, and resolvable. We made that slice bulletproof and left the rest of the org alone. The agent never needed to know the org was imperfect everywhere else, because it never touched those parts.

That is the practical reading of 'data before agents.' It does not mean 'fix everything first.' It means pick the use case, trace the exact data path it depends on, end to end, and make only that path trustworthy before an agent runs on it. A readiness checklist earns its keep precisely when it forces you to scope down to what the agent will actually touch, and to ignore the rest on purpose.

An agent is only as accountable as the data underneath it. Readiness is not hygiene for its own sake. It is the precondition for letting software act on your behalf and standing behind the result.

Run your candidate use case through these five questions before you build anything. Where an answer is 'we are not sure,' you have found the work that has to happen first, and it is far cheaper to learn it now than from an agent doing the wrong thing confidently, in production, in front of a customer.

Want a second set of eyes on whether your data is ready for an agent to act on it? Book a working session and we will pressure-test your highest-value use case against this checklist.