Field note

What Ex-Salesforce PMs Know About Agentforce That Consultants Don't

Akshit Kandi

SkySync

The people who built Agentforce and the people who sell Agentforce projects optimize for different things. Here is the gap, and why it costs you money.


I was a Senior PM at Salesforce, on Agentforce. Here is the thing that took me a while to say out loud: the people who build a platform and the people who sell projects on it optimize for different things. A product team is graded on whether the platform succeeds across thousands of orgs it will never personally see. A consultant is graded on whether a statement of work closes and passes acceptance. Those incentives overlap right up until go-live, and then they split. That split is where most Agentforce money quietly leaks out.

This is written for the executive about to sign the SOW, and for the architect who will inherit whatever that SOW builds. The platform is good. The variance in results is almost never the model. It's a handful of decisions that get made — or skipped — before anyone writes a topic.

The demo is a lie of omission, and the PM knows exactly which part

Every Agentforce demo works. The data is clean, the topics are scoped tight, and the test questions were written by the same person who built the agent. That's not dishonesty — it's what a demo is for. But a demo is the happy path, and the happy path is a minority of real traffic.

The rest is the ambiguous question, the customer who's already angry before they type, the record that contradicts another record, and the request that should hand off to a human but doesn't trip the handoff condition. The platform handles most of this. But "handles it" and "handles it the way you'd stake your brand on" are different sentences, and the second one is the actual work. When you watch a demo, the question isn't "does it work?" It's "show me the three queries you hope I don't ask, and show me the transcript where the agent got one wrong and what you changed." A partner who has run an agent in production has those on hand. One who only builds them doesn't think to keep them.

Agentforce is a data product wearing an AI costume

The single most expensive misunderstanding I saw from the inside: buyers think they're buying an agent. They're buying a function of their data. The model is roughly the same for everyone. Your retrieval, your grounding, your field hygiene, your sharing rules, your Data Cloud mapping — that is the whole ballgame, and it's the part nobody demos because it isn't pretty.

Concretely: an agent grounded on a Data Cloud where the same customer exists as three unresolved profiles will answer confidently and wrong, because retrieval pulls a fragment of the truth. An agent that respects your sharing model will refuse to surface a record a rep shouldn't see — which is correct, but looks like a bug to anyone who didn't think about permissions before launch. None of this is an AI problem. You asked the model to reason over a swamp and it did exactly that. A shop paid to "stand up Agentforce" has every incentive to treat your data as a given and bill for the agent build. The agent is the easy 20%.
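To make the three-unresolved-profiles failure concrete, here is a minimal sketch of a pre-launch duplicate check. It is not how Data Cloud identity resolution works; the record shape, the `find_unresolved_profiles` helper, and the choice of email as the match key are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize_email(email: str) -> str:
    """Lowercase and trim so trivial variants compare equal."""
    return email.strip().lower()


def find_unresolved_profiles(profiles: list[dict]) -> dict[str, list[str]]:
    """Return each normalized email that maps to more than one profile id.

    Assumption: email is a good-enough match key for a first pass.
    Real identity resolution uses more signals than this.
    """
    by_email: dict[str, list[str]] = defaultdict(list)
    for p in profiles:
        email = p.get("email")
        if email:
            by_email[normalize_email(email)].append(p["id"])
    return {email: ids for email, ids in by_email.items() if len(ids) > 1}


# Hypothetical records: one customer split across three profiles,
# each holding a different fragment of the truth.
profiles = [
    {"id": "P-001", "email": "Dana.Lee@example.com", "plan": "Pro"},
    {"id": "P-002", "email": "dana.lee@example.com ", "plan": None},
    {"id": "P-003", "email": "dana.lee@example.com", "open_case": True},
    {"id": "P-004", "email": "sam@example.com", "plan": "Basic"},
]

duplicates = find_unresolved_profiles(profiles)
print(duplicates)
```

If a check this crude already returns a long list, retrieval is going to hand the agent fragments, and that is worth knowing before anyone scopes a topic.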

The order of operations is the strategy. Data, then agent. Reverse it and you've bought a very expensive way to be wrong faster.

This is why we lead every engagement with an Agent Ready phase — identity resolution, grounding sources, and permission boundaries — before anyone touches a topic or an action. Not because it's billable. Because skipping it is the most common reason these deployments underperform, and I watched that happen enough times to stop pretending the agent was ever the hard part.

"Launched" is the consultant's finish line and the platform's starting gun

The classic engagement ends at go-live. Demo passed, SOW closed, team rolls off to the next logo. From the product side, go-live is the moment the interesting data finally arrives — real users asking real things, in volumes and shapes your test set never imagined.

An agent's quality curve doesn't peak at launch; it bends based on what you do in weeks two through twelve. You read transcripts, find the turns where it deflected something it should have escalated, tighten a topic's instructions, add a guardrail, retire an action that never fired. A PM runs this loop by reflex because that's how products are run. Most project shops aren't structured to be in the room for it, because the contract already paid out. So ask any prospective partner one question: what happens in week six? If the answer is "a support handoff," you're buying a launch, not an outcome.
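The transcript-reading part of that loop can start as a very small filter. The sketch below flags conversations the agent closed on its own that later looked unresolved, grouped by topic. The `Transcript` fields and the two review signals (a reopened issue, a thumbs-down) are hypothetical stand-ins, not an Agentforce schema; use whatever signals your org actually records.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    """One agent conversation (hypothetical fields for illustration)."""
    conversation_id: str
    topic: str
    escalated: bool    # did the agent hand off to a human?
    reopened: bool     # did the customer come back on the same issue?
    thumbs_down: bool  # explicit negative feedback, if collected


def needs_review(t: Transcript) -> bool:
    """Flag conversations the agent kept for itself that later looked unresolved."""
    return (not t.escalated) and (t.reopened or t.thumbs_down)


def review_queue(transcripts: list[Transcript]) -> dict[str, list[str]]:
    """Group flagged conversation ids by topic, so each fix targets one topic."""
    queue: dict[str, list[str]] = {}
    for t in transcripts:
        if needs_review(t):
            queue.setdefault(t.topic, []).append(t.conversation_id)
    return queue


# Illustrative data only.
transcripts = [
    Transcript("C-1", "billing", escalated=False, reopened=True, thumbs_down=False),
    Transcript("C-2", "billing", escalated=True, reopened=True, thumbs_down=False),
    Transcript("C-3", "returns", escalated=False, reopened=False, thumbs_down=True),
    Transcript("C-4", "returns", escalated=False, reopened=False, thumbs_down=False),
]

queue = review_queue(transcripts)
print(queue)
```

The filter only builds the reading list. A person still reads each flagged transcript and decides whether the fix is a tighter instruction, a guardrail, or a retired action.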

The features in the keynote are not the features you should ship first

Roadmaps move fast and keynotes reward novelty, so the newest capability gets the stage time — and then gets requested by buyers who saw the keynote. Someone who has sat in the prioritization meetings knows which capabilities are battle-tested and which shipped last quarter and are still finding their edges. The unglamorous truth is that the highest-ROI deployment is usually the boring one: a tightly scoped agent doing one high-volume task reliably, on capabilities that have been GA long enough to behave predictably under load.

Breadth is where you go to lose trust. Depth on one job is where you earn the right to expand.

  • Pick the task with the highest volume and the clearest correct answer, not the most impressive one.
  • Prefer a capability that's been GA for a few release cycles over the thing announced this quarter.
  • Ship narrow, measure honestly, then widen — and make each expansion earn its place with data.
  • Treat every new topic as a small bet with its own success metric, not a feature checkbox.

Nobody decides what to measure until it's too late to instrument it

This is the one that genuinely separates practitioners. Most Agentforce projects define success after launch, when an executive asks "so, is it working?" and the room realizes no one captured the baseline before traffic started. Now you're reconstructing the "before" from memory, which means you're guessing.

A product person decides the metric first and builds backward. Name the one number this agent is supposed to move (deflection rate, speed-to-lead, resolution time, conversion) and record what it was the week before launch. On our Green Subsidy solar work, the metric was speed-to-lead: get a qualified human response in front of an inbound homeowner fast enough to matter. You can only claim an improvement in a number like that if you measured the before, and if the agent emits the events you need to measure the after. Instrument that on day zero, not the day the executive asks.

Say you deflect 10% of tickets today. The honest version of this project writes that number down before launch and parks the agent's live deflection rate next to it, in plain view, every week. A rough, illustrative baseline like that is fine. Having no baseline at all and calling whatever happened a success is not.
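The weekly side-by-side is simple arithmetic: deflection rate is deflected tickets divided by total tickets, compared against the number recorded before launch. Here is a minimal sketch, assuming you can export weekly ticket counts. The `WeeklyTickets` shape and every number in it are made up for illustration; only the 10% baseline echoes the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeeklyTickets:
    """Ticket counts for one week (hypothetical shape, not a Salesforce object)."""
    week: str
    total: int      # all inbound tickets that week
    deflected: int  # tickets resolved without a human agent


def deflection_rate(w: WeeklyTickets) -> float:
    """Share of tickets resolved without a human, as a fraction between 0 and 1."""
    if w.total == 0:
        raise ValueError(f"no tickets recorded for {w.week}")
    return w.deflected / w.total


def weekly_report(baseline: WeeklyTickets, live: list[WeeklyTickets]) -> list[dict]:
    """Put each live week's rate next to the pre-launch baseline."""
    base = deflection_rate(baseline)
    rows = []
    for w in live:
        rate = deflection_rate(w)
        rows.append({
            "week": w.week,
            "baseline": base,
            "live": rate,
            # Difference in percentage points, not relative change.
            "delta_points": round((rate - base) * 100, 1),
        })
    return rows


# Illustrative numbers only.
baseline = WeeklyTickets("pre-launch", total=2000, deflected=200)
live = [
    WeeklyTickets("week 1", total=2100, deflected=252),
    WeeklyTickets("week 2", total=1950, deflected=312),
]

report = weekly_report(baseline, live)
for row in report:
    print(f"{row['week']}: baseline {row['baseline']:.0%}, "
          f"live {row['live']:.0%}, {row['delta_points']:+.1f} points")
```

Reporting the change in percentage points rather than as a relative lift keeps the weekly number hard to oversell.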

Why we tie our fee to the result

Once you've seen the gap between "launched" and "working," the standard consulting model starts to look structurally wrong. It pays out at the exact moment the hard part begins, and the risk transfers entirely to you. So we built SkySync to sit on the other side of that line: we advise, build, and run the agents, and our fee is tied to the return they produce.

That isn't altruism — it's a filter. It forces us to refuse work where the data isn't ready or the outcome can't be measured, because we don't get paid for a pretty demo. That discipline is what I wish more buyers demanded of any partner, whether they hire us or not.

If you take one thing from an ex-PM: the technology is not your variable. Whether someone treated your deployment like a product to be run, or a project to be closed, is.

Want a candid read on whether your data and your use case are actually ready for an agent — and what number it's supposed to move? Model it with us.