Acton
An agent that follows up on whether a task got done, calibrating how hard it presses to the task and the person.
Project overview
- Type: Product · 10-week 0→1
- Project type: 0→1 · AI Agent · Productivity & Accountability · Behavior Change
- Role: Lead Product Designer · Agent & Conversation Design · Character Design · UX Research
- Methods: Behavioral modeling (Fogg) · diary study · usability testing · A/B
- Tools: Otter.ai · Blender · Braze · Figma + Dev Mode · Amplitude · Statsig
- Case thesis: Designing an agent that follows up on whether a task got done, calibrating its insistence to the task and the person, so accountability changes completion without becoming the notifications people mute.
The context
To-do apps are passive ledgers. They capture a task and never notice when it goes undone, so the list grows while the work does not. Acton is built as a personal director of operations: it tracks tasks, projects, goals, and habits, and instead of letting a task sit, it follows up on whether the task was completed. It is reachable on the web, Telegram, and Discord, and through its own interface, where a user can mark and view work directly. The hard part is the follow-up itself.
The problem
People do not fail at writing tasks down; they fail at finishing them, and the tool never reacts. In research, people completed only 33% of the tasks they created within the timeframe they set (behavioral, tracker audit), and a reminder that fired once and disappeared was rarely acted on. The behavior that changes completion is a follow-up that expects a response, yet the tools that try it tip into notification spam and get muted: among people who had used a reminder-heavy app, 57% had muted or abandoned one for nagging (attitudinal).
The goal
Drive task completion through follow-up that the user keeps switched on, measured by task completion rate, mute and abandonment, and retention, rather than by how many task fields the app offered.
Empathize — People completed only a third of the tasks they created, and a reminder that fired once and vanished did nothing
In this section: Research foundation · Key insights
Research foundation (method)
- Phase 1 — Interviews (n=20, ~40 min, busy professionals using task tools, recruited via dscout, transcribed in Otter.ai): how tasks die, and what an accountability partner does that an app does not.
- Phase 2 — Tracker audit (n=24): reviewed participants' actual task tools for created-versus-completed rates.
- Phase 3 — Survey (n=190; 18.0% response, 12.4% completion; select-all and single-select labeled per question): on reminders, nagging, and muting.
- Phase 4 — Prototype pilot (Amplitude-instrumented, 70 users, spring 2025): behavior on the follow-up loop.
Key insights
1. Capture is solved; follow-through is where tasks die. Participants had no trouble adding tasks and every trouble finishing them, and the app never noticed the difference. Triangulation:
- Behavioral: 33% of created tasks were completed in their timeframe; the rest sat or were silently abandoned.
- Verbatim — coded: Silent failure: "I add it, it sits in the list forever, and nothing and no one ever asks me about it again."
2. A follow-up that expects a reply is what moves completion. A one-shot reminder was dismissed and forgotten, while the prospect of someone asking again, the way a manager or coach would, was what got the task done.
- Attitudinal: the behavior people credited for finishing things was a person following up and expecting an answer.
3. Persistent follow-up sits on a knife-edge. Too much insistence got the tool muted; too little left it a silent list. The thing that keeps follow-up on the helpful side is calibrating it to the task's importance and the person.
- Attitudinal: 57% had muted or quit a reminder app for nagging, and the complaint was insistence that ignored what mattered.
Dashboard — Tasks die between capture and completion
Scope: tracker audit (n=24) + survey (n=190)
Guiding question: Why don't created tasks get done?
Tasks completed within their timeframe ..... 33%
One-shot reminders acted on ................ low
Muted or quit a reminder app for nagging ... 57%
Key Insight: Capture works and follow-through fails, and the tools that
follow up get muted, so calibration is the whole design problem.
Define — The agent had to notice unfinished tasks and follow up in proportion to the task and the person
In this section: POV · How Might We · Principles · Insight→decision map
POV statement. A person needs an agent that notices unfinished tasks and follows up in a way calibrated to the task and to them, because passive to-do lists never notice non-completion and indiscriminate reminders get muted.
How Might We
- How might we follow up persistently enough to drive completion without becoming notification spam?
- How might we calibrate insistence to a task's importance and a person's patterns?
- How might we make a non-response useful: learning why it happened, then moving or archiving the task?
Design principles (each traceable to an insight)
- Follow-up is the product. The agent tracks completion and follows up, where a list only stores. (Insight 1, 2)
- Insistence is calibrated. How hard Acton presses scales with the task's importance, the situation, and the user's patterns. (Insight 3)
- The agent is a perceived someone. A consistent character and a serious, non-sycophantic operative personality make a user feel accountable to a person, the way they would answer to a manager. (Insight 2)
- Non-response is signal. A missed task prompts a question, a moved deadline, or an archive, and feeds pattern detection, so the list stays honest.
- Scope is bounded. Acton stays an operations agent and declines off-topic conversation, keeping its follow-up focused and credible.
Insight → decision map
| Insight (from Empathize) | Concrete design decision |
|---|---|
| 33% completion; tasks die unnoticed | A reminder expects a reply, and on silence Acton follows up rather than letting the task lapse |
| Insistence gets the tool muted | The nag carries a reason and a tactic chosen for the task's priority and the user's response pattern, with acknowledge-to-stop and a per-task insistence toggle |
| People respond to a perceived someone the way they would to a manager | A 3D, semi-realistic character and a precise, mission-oriented voice give the follow-up a consistent identity the user responds to |
Ideate & Craft — Follow-up was the product: a reminder expected a reply, and Acton pressed, helped, or archived from there
In this section: Design execution · Before → after · Other deliverables
Design execution
- The follow-up loop — a reminder is sent expecting a response; if none comes, Acton follows up, with a tactic chosen for the task's importance and the user's pattern: a nudge on a low-priority task, a pointed question on a high-stakes one, an offer to break it down or replan, and on continued silence a moved deadline or an archive.
- Calibrated insistence — the "nag" carries a stated reason for why a task matters, can be acknowledged to stop for now, and can be toggled per task, so persistence never runs blind.
- A perceived operative — a 3D, semi-realistic character gives Acton a face in every message, and a serious, precise personality that does not over-praise frames the relationship as accountability to a director of operations.
- The onboarding interview — sign-up is an interview that builds a profile of the user's work, projects, and difficulties, so follow-up is contextual from day one, with more questions asked over time.
- The work it runs on — tasks, projects with completion bars and deadlines, annual goals with milestones and counters, recurring habits, reminders, notes, events, and a financial overview form the substrate the follow-up operates against.
- On its own terms — Acton works across web, Telegram, and Discord, and its own interface lets a user mark tasks and view projects directly, without going through chat.
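The follow-up loop and calibrated insistence described above can be sketched as a small decision function. This is a minimal illustration, not Acton's actual implementation: the names `Task`, `Tactic`, and `choose_tactic`, the two-level priority flag, and the `max_silence` threshold (a stand-in for calibration to the user's response pattern) are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Tactic(Enum):
    NUDGE = "nudge"
    POINTED_QUESTION = "pointed question"
    OFFER_REPLAN = "offer to break down or replan"
    MOVE_OR_ARCHIVE = "move deadline or archive"
    STAND_DOWN = "stand down"


@dataclass
class Task:
    title: str
    high_priority: bool
    insistence_on: bool = True     # per-task insistence toggle
    acknowledged: bool = False     # user acknowledged the nag to stop for now
    unanswered_followups: int = 0  # follow-ups sent so far with no reply


def choose_tactic(task: Task, max_silence: int = 3) -> Tactic:
    """Pick the next tactic for a task whose reminder got no reply.

    max_silence is a hypothetical per-user threshold: how many unanswered
    follow-ups are tolerated before the task is moved or archived.
    """
    # Persistence never runs blind: the toggle and acknowledgement win.
    if not task.insistence_on or task.acknowledged:
        return Tactic.STAND_DOWN
    # Continued silence ends in a moved deadline or an archive.
    if task.unanswered_followups >= max_silence:
        return Tactic.MOVE_OR_ARCHIVE
    # After one ignored follow-up, offer help instead of repeating it.
    if task.unanswered_followups >= 1:
        return Tactic.OFFER_REPLAN
    # First follow-up: press in proportion to the task's importance.
    return Tactic.POINTED_QUESTION if task.high_priority else Tactic.NUDGE
```

Under these assumptions, `choose_tactic(Task("Send contract", high_priority=True))` returns `Tactic.POINTED_QUESTION`, while the same task with `acknowledged=True` returns `Tactic.STAND_DOWN`. A real system would presumably derive the threshold from the onboarding profile and observed reply patterns, not from a fixed default.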
Before → after
| | Before (passive to-do app) | After (Acton) |
|---|---|---|
| When a task goes undone | Nothing happens | Acton follows up and expects a reply |
| Reminders | Fire once and vanish | Calibrated insistence with a reason |
| The relationship | A checkbox | A consistent operative you answer to |
| A missed task | Sits forever | Questioned, replanned, moved, or archived |
Other deliverables
Built in Figma with Dev Mode handoff: the follow-up and calibrated-insistence interaction model, the onboarding interview, the character and personality system across channels, and the project and goal tracking interface.
Dashboard — Calibrated follow-up lifts completion and stays on
Scope: Last 30 days · pilot (70 users)
Guiding question: Did follow-up raise completion without getting muted?
Tasks completed within timeframe ...... 33% → 61%
7-day mute / disable rate ............. 11% (uniform nagging: 41%)
Reminders that got a reply ............ rose sharply
Key Insight: Pressing in proportion to the task and the person raised
completion while keeping the agent switched on.
Prototype / Test — Maximal nagging drove completion for a day and got muted within a week; calibrated insistence sustained both
In this section: The experiment · What it taught
The obvious way to maximize follow-through was to press hard on everything: Acton would insist on every task until the user responded. It was A/B tested against calibrated insistence in Statsig across the pilot.
The failed variant. Pressing uniformly spiked completion on day one, the highest of any variant, but within a week 41% of users muted or disabled Acton, and 30-day completion fell to 38% as the muted agent stopped reaching them. A follow-up agent that gets switched off completes nothing, so the day-one number was a mirage.
Press everything and get switched off
Scope: Statsig A/B · pilot · 2 variants
Guiding question: Which insistence model completes the most tasks over time?
Variant A — Uniform maximal nagging
Day-1 completion ................ highest
7-day mute / disable rate ....... 41%
30-day completion ............... 38%
30-day retention ................ 29%
Variant B — Calibrated insistence
Day-1 completion ................ slightly lower
7-day mute / disable rate ....... 11%
30-day completion ............... 61%
30-day retention ................ 57%
Key Insight: Past a point more insistence lowered completion by getting the
agent muted; calibrating to the task and the person sustained both.
What it taught. For a follow-up agent, more pressure stops raising completion once it gets the agent muted, and a muted agent completes nothing; the design that lasts calibrates insistence to what matters and to the person, so it stays switched on. The calibrated model shipped.
Outcomes & reflections
In this section: Causal chain · Reflections
Causal chain (pilot, 70 users)
The calibrated follow-up loop noticed unfinished tasks and pressed in proportion to their importance and the user's patterns, so reminders got replies and task completion within timeframe rose from 33% to 61%. The 7-day mute rate held at 11% against 41% for uniform insistence, so the agent stayed active long enough to keep working, and day-30 retention reached 57%. Treating a non-response as signal (asking why, replanning, moving, or archiving) kept the list honest instead of letting dead tasks accumulate.
| Metric | Before | After | Δ |
|---|---|---|---|
| Tasks completed within timeframe | 33% | 61% | +28 pts |
| 7-day mute / disable rate | 41% (uniform) | 11% | −30 pts (~−73% relative) |
| 30-day completion | 38% (uniform) | 61% | +23 pts |
| Day-30 retention | 29% (uniform) | 57% | +28 pts |
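The deltas in this table mix two kinds of change: most rows are differences in percentage points, while the mute-rate figure of about −73% is a relative change. The short sketch below recomputes both from the before and after values in the table. The function names `point_change` and `relative_change` are introduced here for illustration only.

```python
def point_change(before: float, after: float) -> float:
    """Absolute change in percentage points."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change expressed as a percentage of the starting value."""
    return (after - before) / before * 100


# (before, after) pairs in percent, copied from the table above.
metrics = {
    "Tasks completed within timeframe": (33, 61),
    "7-day mute / disable rate": (41, 11),
    "30-day completion": (38, 61),
    "Day-30 retention": (29, 57),
}

for name, (before, after) in metrics.items():
    pts = point_change(before, after)
    rel = relative_change(before, after)
    print(f"{name}: {pts:+.0f} pts, {rel:+.0f}% relative")
```

For the mute rate this gives a drop of 30 points, which is roughly 73% of the uniform-nagging rate of 41%. Reading every row in points keeps the four metrics comparable.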
Scale note: a follow-up agent changes behavior only while the user keeps it on, so calibration that prevents muting compounds, since every week it stays active is another week of completed work.
Reflections (transferable principles)
- For a productivity tool, capture is solved and follow-through is not; an agent that notices non-completion and follows up changes behavior a passive list cannot.
- Persistent follow-up sits on a knife-edge: past a point, more insistence lowers completion by getting the agent muted, so calibrating to task importance and the person is what makes accountability sustainable.
- People answer more readily to a perceived someone than to a checkbox, so a consistent character and a serious, non-sycophantic voice do real work in driving follow-through.
- Bounding an agent to its purpose, by declining the off-topic conversation a chatbot invites, keeps it trusted as an operations tool and keeps its follow-up credible.