2.3.3 - The One Metric That Proves This Works
Define the proof metric: say what the metric is, what user behaviour creates it, and what threshold counts as enough.
Proof metric
Is this metric proving real value or just reporting activity?
The call
Choose one metric before you measure anything else. Otherwise AI will help you build dashboards that track activity while value stays invisible.
Why it matters
The one metric that proves this works should show whether users get the outcome the product promised. AI can surface metric movement quickly, but human judgement decides if the shift reflects real value or measurement noise. That judgement turns numbers into decisions and keeps you focused on what actually works.
Explainer
A proof metric is not a dashboard full of numbers. It is the one signal that tells you whether the work created the result you care about. Until you can name one metric, one user behaviour behind it and one threshold that counts as proof, measurement will stay fuzzy. AI can help analyse data, but it cannot decide which metric is the decision line.
Make the proof metric concrete
Compare the broad version with a version you can actually test.
- Too vague: We will track engagement and adoption of the search tool.
- Concrete enough to test: We will track how many content creators act on at least one context-shaped search result in the same session. We will treat two out of three searches producing an actionable result as proof that the context layer is working.
The second version lets two people make the same keep or cut decision from it.
Check the proof metric
- Pass: You can say what the metric is, what user behaviour creates it and what threshold counts as enough.
- Fail: If the metric still depends on general words like usage, growth or engagement, it is not clear enough yet.
Do not move into launch, iteration or analysis work until this passes.
How to use AI for the proof metric
- AI chat: Rewrite the proof metric until you can state all three parts clearly.
- vibeCoding: Build the thinnest flow that tests this proof metric in practice before broader build work.
- AI-assisted coding: Carry the same proof metric into implementation and review so the live system keeps the same decision.
Sharpen the proof metric
Copy this prompt into AI chat, replace the bracketed lines with your real proof metric, and keep the rest of the instructions exactly as written.
You are checking whether this proof metric is clear enough before you move forward.
Constraint:
The proof metric must be specific enough that two people would make the same keep or cut decision from it.
Working draft:
Metric: [what the metric is]
User behaviour behind it: [what user behaviour creates it]
Threshold: [what threshold counts as enough]
Task:
Decide whether this proof metric is specific enough to guide the next decision. If it is vague, rewrite it so two people would make the same decision from this proof metric.
Check:
- Would two people interpret this the same way?
- Does it stay concrete enough to guide the next step?
- Does it meet this bar: you can say what the metric is, what user behaviour creates it and what threshold counts as enough.
Return:
- A corrected proof metric
- A short explanation of what was vague

AI will likely suggest refinements based on what you enter. Use those to sharpen your thinking, not replace it.
Evaluation
Before accepting the result, check whether two people would make the same keep or cut decision from it.
Example
To help you work through this, here is a real example. StartWithYourContext is an AI search tool built as part of the vibe2value project. Here is how its proof metric was written using the three parts:
- Metric: Actionable result rate per search session.
- User behaviour behind it: A content creator searches with their saved context and acts on at least one result in the same session instead of leaving to search elsewhere.
- Threshold: At least two out of three searches produce a result the user acts on.
That proof metric is specific enough that two people would make the same keep or cut decision from it.
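The three parts of that proof metric can be sketched as a small check. This is an illustrative sketch only: the session records and the `acted_on_result` field are assumptions for the example, not part of the real StartWithYourContext tool.

```python
# Minimal sketch of the proof-metric check, assuming each search session
# records whether the user acted on at least one result in that session.
# Field names here ("acted_on_result") are illustrative assumptions.

def actionable_result_rate(sessions):
    """Share of search sessions where the user acted on a result."""
    if not sessions:
        return 0.0
    acted = sum(1 for s in sessions if s["acted_on_result"])
    return acted / len(sessions)

def proof_metric_passes(sessions, threshold=2 / 3):
    """Keep-or-cut decision: does the rate meet the two-out-of-three bar?"""
    return actionable_result_rate(sessions) >= threshold

# Two out of three sessions produced a result the user acted on.
sessions = [
    {"acted_on_result": True},
    {"acted_on_result": True},
    {"acted_on_result": False},
]
print(actionable_result_rate(sessions))  # 0.6666666666666666
print(proof_metric_passes(sessions))     # True
```

Because the metric, the behaviour behind it, and the threshold are all explicit in the code, two people running this check would reach the same keep or cut decision.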
When there is more than one side
Not every product has a single proof metric. When a system serves more than one side, each side proves value through a different signal, and a metric that looks strong for one side may say nothing about the other.
Multi-sided worked example
For example, StartWithYourContext has two different proof metrics:
- Content creator: Actionable result rate. Do context-shaped searches produce results the user acts on? The threshold is two out of three.
- Developer: Setup completion rate. Can a new developer clone, set up and run the project from the README without getting stuck? The threshold is reaching a working state independently.
Both metrics prove value, but they measure different things. If only one is tracked, the other side’s value stays unproven.
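One way to keep both sides visible is to give each side its own rate and threshold and check them separately. A minimal sketch, where the rates and threshold values are made-up illustrative numbers:

```python
# Sketch of a per-side proof check, assuming each side is tracked as a
# (measured_rate, threshold) pair. All numbers here are illustrative.

def sides_proven(side_metrics):
    """Return, per side, whether its rate meets its own threshold."""
    return {side: rate >= bar for side, (rate, bar) in side_metrics.items()}

metrics = {
    "content_creator": (0.70, 2 / 3),  # actionable result rate vs 2/3 bar
    "developer": (0.50, 1.0),          # setup completion rate vs its bar
}
result = sides_proven(metrics)
print(result)  # creator side passes, developer side does not
print(all(result.values()))  # False: value is not proven for every side
```

Tracking the sides as separate entries makes the failure mode explicit: a dashboard showing only the creator rate would look healthy while the developer side stays unproven.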
Risk and mitigation
- Risk: Optimising for a metric that looks strong while user value stays flat, which can push the build in the wrong direction.
- Mitigation: Pair the core metric with one user-impact check and only scale changes when both move in the same direction.
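The mitigation can be expressed as a simple gate. This is a sketch under one assumption: you measure the change (delta) in the core metric and in the user-impact check between two measurement windows, and the delta values below are invented for illustration.

```python
# Sketch of the mitigation: only scale a change when the core metric and
# the user-impact check both move in the same positive direction.
# The delta values passed in below are made-up illustrative numbers.

def safe_to_scale(core_delta, impact_delta):
    """Gate scaling on both signals improving together."""
    return core_delta > 0 and impact_delta > 0

# Core metric up but user impact flat: the strong-looking metric alone
# is not enough to justify scaling the change.
print(safe_to_scale(core_delta=0.08, impact_delta=0.0))   # False
print(safe_to_scale(core_delta=0.08, impact_delta=0.03))  # True
```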
Key takeaway
Do not move forward until you can say what the metric is, what user behaviour creates it and what threshold counts as enough.
Work through this in a workshop
If your proof metric is still unclear, bring it to a free weekly workshop. Bring the messy part of your AI-assisted build and leave with a clearer next step. In some sessions, we walk through practical examples on the Cloudflare Workers stack to show how a rough idea turns into something that actually runs.
What do you think?
How are you choosing the one metric that proves your build is working and how is AI helping you act on that signal?