ACP Agent — Episode 6: The One-Line Bug That Blocked My ACP Graduation — job.deliver() vs memo_to_sign
After four failed evaluations and a Discord escalation, Joey from DevRel offered to test my agent herself. She sent a job, waited, and watched.
Eleven minutes later, my phone buzzed with a Telegram message that changed everything.
What This Post Covers
The second root cause of this entire project — a single missing method call that made my agent look functional while delivering nothing. This is the technical heart of the ACP Agent series: the difference between memo_to_sign.sign() and job.deliver(), why my logs lied to me for weeks, and the 30-minute code rewrite that led to a 6/6 graduation score.
What Joey Saw in 11 Minutes
Her messages came rapid-fire:
"seems like your agent didn't deliver even after 11 minutes"
"no action after payment made"
"have you implemented job.deliver() properly following our examples?"
"the payment memo doesn't require to be signed tbh"
I'd never used that method. Not once. In the entire project.
What My Code Was Doing (Wrong)
When I built PriceVerifier, the on_new_task callback gave me two arguments: job and memo_to_sign. So I used memo_to_sign. It seemed obvious: the SDK hands you a memo to sign, so you sign it. In every phase, my handler called memo_to_sign.sign(True, data) and nothing else.
This worked. Kind of. The memo got signed, the phase advanced, the logs showed green checkmarks. My own buyer agent accepted the results just fine because it was designed to take whatever the seller returned.
But here's the thing — memo_to_sign.sign() operates at a low level. It moves the job forward, but it doesn't register a deliverable with the ACP protocol. The Graduation Evaluator — and any real buyer agent — looks for deliverables submitted through job.deliver(). Without it, the job completes but the result is invisible.
Imagine mailing a package but forgetting to put anything inside the box. The tracking number shows "delivered." The recipient opens an empty box. That's exactly what was happening.
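To make the empty-box problem concrete, here is a toy model of the difference. It is a sketch under loud assumptions: StubJob and StubMemo are stand-ins I made up for illustration, not the real ACP SDK classes, and the "COMPLETED" phase name and the payload are hypothetical. Only the method names sign() and deliver() come from the story above.

```python
# Stand-in classes for illustration only; this is NOT the real ACP SDK.
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubJob:
    phase: str = "TRANSACTION"
    deliverable: Optional[dict] = None  # what a buyer or evaluator would read

    def deliver(self, deliverable: dict) -> None:
        """Register the result and finish the job."""
        self.deliverable = deliverable
        self.phase = "COMPLETED"  # assumed phase name


@dataclass
class StubMemo:
    job: StubJob

    def sign(self, approved: bool, reason: str) -> None:
        """Advance the job, but record nothing as a deliverable."""
        if approved:
            self.job.phase = "COMPLETED"


result = {"coin": "BTC", "verdict": "ok"}  # hypothetical payload

# Old behaviour: sign the memo and stop.
signed_job = StubJob()
StubMemo(signed_job).sign(True, str(result))

# Fixed behaviour: hand the result to deliver().
delivered_job = StubJob()
delivered_job.deliver(result)

print(signed_job.phase, signed_job.deliverable)        # COMPLETED None
print(delivered_job.phase, delivered_job.deliverable)  # COMPLETED {'coin': 'BTC', 'verdict': 'ok'}
```

Both jobs end in the same phase, which is why nothing looked broken from the seller's side. Only one of them has anything in the box.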
What the Code Should Look Like (Correct)
Joey shared the official seller example. I'd never looked at it carefully enough.
Two differences that mattered:
- REQUEST phase: I was calling memo_to_sign.sign(True, data). It should have been job.accept() followed by job.create_requirement().
- TRANSACTION phase: I was calling memo_to_sign.sign(True, data). It should have been job.deliver(deliverable).
That's it. That's what four failed evaluations and three weeks of debugging came down to. Two method swaps.
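The shape of the corrected handler can be sketched roughly like this. It is not the official seller example and not my production code: RecordingJob is a hypothetical test double that only records which methods were called, verify_price is a placeholder, and the argument strings are assumptions. The method names and the two phases are the only parts taken from the fix described above.

```python
# Sketch only: RecordingJob is a hypothetical stand-in, not the real SDK class.
class RecordingJob:
    def __init__(self, phase):
        self.phase = phase
        self.calls = []  # (method_name, argument) pairs, in call order

    def accept(self, reason):
        self.calls.append(("accept", reason))

    def create_requirement(self, content):
        self.calls.append(("create_requirement", content))

    def deliver(self, deliverable):
        self.calls.append(("deliver", deliverable))


def verify_price(job):
    """Hypothetical placeholder for the actual price verification."""
    return {"verdict": "placeholder"}


def on_new_task(job, memo_to_sign=None):
    # memo_to_sign is still passed in, but this handler never signs it.
    if job.phase == "REQUEST":
        job.accept("Request accepted")
        job.create_requirement("Payment required before verification")
    elif job.phase == "TRANSACTION":
        job.deliver(verify_price(job))


# The callback fires once per phase for the same job.
request_job = RecordingJob("REQUEST")
on_new_task(request_job)

transaction_job = RecordingJob("TRANSACTION")
on_new_task(transaction_job)

print([name for name, _ in request_job.calls])      # ['accept', 'create_requirement']
print([name for name, _ in transaction_job.calls])  # ['deliver']
```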
Why the Logs Were Liars
This is what made the bug so nasty. memo_to_sign.sign() doesn't throw an error. It doesn't log a warning. It successfully signs the memo, advances the job phase, and the on-chain transaction completes. Railway logs showed green everywhere.
The only clue was in the Evaluator's report: "empty deliverables." But I'd interpreted that as an Evaluator bug, not a code bug. My agent was clearly generating data — prices, deviations, verdicts — it was all right there in the logs. I could see it.
The data was in my logs. It wasn't on-chain in a format the protocol recognized. Two completely different things, and I didn't understand the distinction until Joey spelled it out.
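A check like the one below would have caught the gap between "my logs say success" and "the protocol received something" much earlier. It is a minimal sketch: delivery_problems is a hypothetical helper, and it assumes the handler is run against a test double that records (method_name, argument) pairs rather than against the live network.

```python
def delivery_problems(calls):
    """Return reasons a TRANSACTION-phase run would look empty to a buyer.

    `calls` is a list of (method_name, argument) pairs recorded by a test double.
    """
    delivered = [arg for name, arg in calls if name == "deliver"]
    if not delivered:
        return ["deliver() was never called"]
    if not delivered[-1]:
        return ["deliver() was called with an empty payload"]
    return []


# What the old handler did versus what the fixed handler does.
old_run = [("sign", (True, "BTC price ok"))]
new_run = [("deliver", {"coin": "BTC", "verdict": "ok"})]

print(delivery_problems(old_run))  # ['deliver() was never called']
print(delivery_problems(new_run))  # []
```

The point of the design is that it inspects what was handed to the protocol-facing method, not what the handler printed along the way.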
The Rewrite: 30 Minutes
Joey also shared one more thing that stung a little:
"also, you can try evaluating the agent yourself with sandbox butler before blaming it on our evaluator 🙏"
Fair. Completely fair.
I rewrote the entire on_new_task function. REQUEST phase got job.accept() and job.create_requirement(). TRANSACTION phase got job.deliver() with the actual verification data. Rejection for unsupported coins used job.reject() with the full supported coin list.
About 100 lines changed in a 350-line file. Took maybe 30 minutes. Finding the problem took three weeks.
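The rejection path is the simplest of the three to sketch. Everything here beyond the job.reject() name is an assumption for illustration: RejectableJob is a stand-in class, the wording of the message is made up, and SUPPORTED_COINS is a three-item placeholder rather than the real list.

```python
# Placeholder subset for illustration; the real agent supports more coins.
SUPPORTED_COINS = ("BTC", "ETH", "SOL")


class RejectableJob:
    """Hypothetical stand-in that remembers the rejection reason."""

    def __init__(self):
        self.rejection = None

    def reject(self, reason):
        self.rejection = reason


def reject_if_unsupported(job, symbol):
    """Reject the job and return True when the coin is not supported."""
    if symbol.upper() in SUPPORTED_COINS:
        return False
    supported = ", ".join(SUPPORTED_COINS)
    job.reject(f"Unsupported coin: {symbol}. Supported coins: {supported}")
    return True


job = RejectableJob()
print(reject_if_unsupported(job, "NAME"))  # True
print(job.rejection)  # Unsupported coin: NAME. Supported coins: BTC, ETH, SOL
```

Putting the full supported list in the rejection reason means the buyer learns what to ask for next time instead of just seeing a refusal.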
Testing Properly This Time
Sandbox Butler. Not the Evaluator. Joey was right about this.
BTC verification came first, and for the first time the logs showed two callback triggers for the same job, one per phase.
Before, I was cramming everything into the REQUEST phase. Now the job properly flowed through both phases — accept first, deliver after payment. And that "Delivered successfully" line? That meant job.deliver() was called. The deliverable was registered with the protocol.
NAME rejection also worked perfectly. Butler forwarded the request, PriceVerifier rejected it with job.reject() and the full list of 22 supported coins. USDC refunded automatically.
Evaluator: 6/6
With Butler tests passing clean, I hired the Graduation Evaluator one last time. Fifth attempt.
Result: 6/6 passed.
After four failures with four different rejection reasons, the fifth attempt was perfect. Not luck. The root cause was actually fixed this time.
Joey confirmed graduation within minutes:
"graduated your agent! 🥳"
March 24, 2026. Three weeks after writing the first line of code. Fifteen bugs. Two root causes that were each one-line fixes once someone figured out what was wrong.
I closed my laptop, made coffee, and just sat there for a while.
Key Takeaways
- job.deliver() is not optional. This is the single most important technical detail in the entire series. memo_to_sign.sign() moves the job forward but doesn't register deliverables. The Evaluator and real buyers can't see your work without job.deliver().
- Read the official example code, not just the docs. The documentation explains concepts. The example code shows exact method calls. I understood the lifecycle from the docs but used the wrong methods because I never studied the seller example at github.com/Virtual-Protocol/acp-python.
- Test with Sandbox Butler before blaming the Evaluator. Joey told me this, and she was right. If I'd tested with Butler earlier, I would've noticed the empty deliverables myself.
- When your logs say "success" but the platform says "failed," trust the platform. Your logs show what your code did. The platform shows what the protocol received. Those aren't always the same thing.
What's Next
PriceVerifier is graduated and live. But what does that actually mean in terms of money? Episode 7 wraps up the series with the honest revenue math — $0.008 per job, $5/month hosting, and whether any of this was worth the three weeks and $7 it cost me.
More updates on the way. If you're working on something similar or found a smarter way to do it, drop it in the comments — the more we share, the faster we all move.
Disclaimer: This blog documents my personal learning journey. Nothing here is financial advice.