zaidazmi/AI-PM-PLAYBOOK

AI PM Playbook · Ship AI features you can put your name on

ai-pm-playbook.com


Ship AI features you can put your name on.
The templates, evals, and launch gates AI PMs use to turn working demos into production calls they own.

Vibe-coding gets you to a demo fast. This playbook helps you make the harder call: whether that demo should become a product, where humans need to stay in the loop, what evidence would make it safe to ship, and when to say "not yet."

Built for PMs working with LLMs, agents, copilots, RAG, and workflow automation.

If useful, a GitHub star helps me know this is worth maintaining.

New here? Start with A week with the AI PM Playbook — a walkthrough of one PM using these artifacts on an actual product, from opportunity brief to roadmap review.

Choose your path

I have an idea

Before you vibe code -> Opportunity brief -> AI PRD -> Eval plan -> Launch gate

Use this path when the problem is still fuzzy and you need to decide whether AI is worth building at all.

I already built a prototype and now I'm nervous

Error analysis -> Eval plan -> PRD risk table -> Observability plan -> Launch gate

Use this path when the demo works, but you do not yet know whether the product is safe, measurable, affordable, or ready for users.

Quick start

  1. Before you vibe code — answer 8 questions before building
  2. AI opportunity brief — decide if AI is worth pursuing and align user, AI job, human control, evals, risk, and cost
  3. AI PRD — define what the AI does, its quality bar, risks, and what happens when it fails
  4. Eval plan — define "good" before trusting model output
  5. Human review workflow — decide who validates, corrects, escalates, or blocks AI output before it matters
  6. Launch gate checklist — make a go/no-go call for pilot, production, or scale
  7. Healthcare intake example — see what a "do not launch" recommendation looks like

The full playbook has the operating model, evidence hierarchy, readiness scoring, and decision framework.

When this playbook tells you to stop

"Do not launch" is not a failure state. It is a product decision when the evidence says the blast radius is larger than the team's ability to measure, review, roll back, or operate the AI safely.

Stop or hold when evals are missing, human review is undefined, agent rollback is impossible, data permissioning is unclear, cost exceeds the business case, or legal/security review has not happened for a high-risk workflow. A convincing LLM demo is not evidence that the product can act safely in the real workflow. Use the Launch Gates guide to make that call with evidence.
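The stop-or-hold conditions above can be sketched as a simple check. This is a minimal illustration, not the playbook's own checklist or schema: the `LaunchEvidence` fields and the function names are hypothetical, chosen only to mirror the six conditions listed in the paragraph.

```python
from dataclasses import dataclass


@dataclass
class LaunchEvidence:
    """Hypothetical evidence flags, one per stop/hold condition named above."""
    evals_defined: bool
    human_review_defined: bool
    agent_rollback_possible: bool
    data_permissioning_clear: bool
    cost_within_business_case: bool
    high_risk_workflow: bool
    legal_security_reviewed: bool


def launch_blockers(e: LaunchEvidence) -> list[str]:
    """Return every stop/hold condition that currently applies."""
    blockers = []
    if not e.evals_defined:
        blockers.append("evals are missing")
    if not e.human_review_defined:
        blockers.append("human review is undefined")
    if not e.agent_rollback_possible:
        blockers.append("agent rollback is impossible")
    if not e.data_permissioning_clear:
        blockers.append("data permissioning is unclear")
    if not e.cost_within_business_case:
        blockers.append("cost exceeds the business case")
    # Legal/security review only blocks when the workflow is high-risk.
    if e.high_risk_workflow and not e.legal_security_reviewed:
        blockers.append("legal/security review has not happened")
    return blockers


def gate_decision(e: LaunchEvidence) -> tuple[str, list[str]]:
    """'hold' if any blocker applies, otherwise move on to the launch gate review."""
    blockers = launch_blockers(e)
    return ("hold" if blockers else "proceed to launch gate review", blockers)
```

An empty blocker list does not mean "launch". It only means none of the named hold conditions apply, so the go/no-go call can move to the Launch Gates guide.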

Who this is for

  • PMs shipping AI features from prototype to production
  • Founders deciding which AI workflows are worth building
  • Product leaders reviewing whether an AI roadmap is credible
  • Engineering, design, and legal partners who want clearer AI product artifacts

This is not a prompt pack or a strategy deck. There are no starter apps.

What an AI PM actually does

Most of these jobs didn't exist three years ago. Each one has a template.

| Skill | What it means | Artifact |
| --- | --- | --- |
| Opportunity assessment | Decide whether AI is worth pursuing and align user, AI job, human control, evals, risk, and cost | Opportunity Brief |
| AI job definition | Specify what the AI does, its constraints, and its fallback behavior | AI PRD |
| Eval design | Define "good" before trusting model output | Eval Plan |
| Risk management | What can go wrong, how bad is it, what do we do about it | PRD risk table + Launch Gate |
| Human-in-the-loop design | Decide who validates, corrects, escalates, or blocks AI output before it matters | Review Workflow |
| Unit economics | Cost per workflow and margin impact at scale | Cost Model |
| Launch gating | Go/no-go calls using evidence | Launch Gate Checklist |
| Observability | Monitor quality, drift, and cost in production | Observability Plan |
| Post-launch review | What actually happened vs. what we expected | Observability Plan |
| Optional handoff and operations | Build handoff, meeting review, and prompt change control | Optional templates |
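
The "Unit economics" row above comes down to a small amount of arithmetic. The sketch below is one hedged way to frame it, not the Cost Model template itself: the function names, the cost components (model tokens plus human review time), and every number in the usage example are assumptions made up for illustration.

```python
def cost_per_workflow(
    llm_calls: int,
    tokens_per_call: int,
    price_per_1k_tokens: float,
    review_minutes: float,
    reviewer_hourly_rate: float,
) -> float:
    """Cost of one completed workflow: model usage plus human review time."""
    model_cost = llm_calls * tokens_per_call / 1000 * price_per_1k_tokens
    review_cost = review_minutes / 60 * reviewer_hourly_rate
    return model_cost + review_cost


def gross_margin(revenue_per_workflow: float, cost: float) -> float:
    """Fraction of revenue left after the per-workflow cost."""
    return (revenue_per_workflow - cost) / revenue_per_workflow


# Illustrative numbers only (all hypothetical):
# 4 model calls of 2,000 tokens at $0.01 per 1k tokens, plus 1.5 minutes
# of review at $40/hour, against $3.00 of revenue per workflow.
cost = cost_per_workflow(4, 2000, 0.01, 1.5, 40.0)
margin = gross_margin(3.00, cost)
print(f"cost per workflow: ${cost:.2f}, gross margin: {margin:.0%}")
```

With these made-up inputs, human review costs far more than the model does. A per-workflow breakdown exists to surface exactly that kind of result before scale.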

Guides

Twelve guides on the parts of AI product management where most teams get stuck.

| Guide | What it covers |
| --- | --- |
| Before You Vibe Code | Eight questions to answer before turning an AI idea into a demo |
| Walkthrough | A week with the playbook: one PM, one product, five artifacts |
| Eval Design | Building evals that catch real failures, including the ones you miss in demos |
| Agentic Products | How to spec agents vs. chatbots vs. copilots |
| Operating AI Products | Human review, safety, observability, and cost discipline after the demo works |
| Launch Gates | How to say "do not launch" with evidence |
| Prompt Craft | Treating prompts as product surfaces |
| Bad to Good AI PRD | Turning a vague AI assistant brief into a buildable PRD slice |
| Error Analysis | Reading traces, labeling failures, and deciding which evals are worth automating |
| Artifact Flow Map | What artifact comes when, who owns it, and what decision it unlocks |
| Agent PM Starter Pack | Tool boundaries, autonomy, rollback, trajectory evals, cost ceilings, and handoff |
| AI-Native PM Loop | Build small PM agents, trace behavior, create evals from traces, and improve safely |

Case studies

Three worked examples. Each one includes an opportunity brief, PRD, eval plan, launch gate assessment, and a scored readiness recommendation. The customer support example also includes a week-2 post-launch review to show the operating loop after pilot launch.

| Case study | Risk | Recommendation |
| --- | --- | --- |
| Customer Support Copilot | Medium | Pilot after blockers resolved |
| Sales Call CRM Assistant | Medium | Pilot after blockers resolved |
| Healthcare Intake Assistant | High | Prototype only |

The examples are synthetic but realistic. They show how the artifacts reason through tradeoffs rather than filling in blanks.

Portfolio interview prompts

Use these artifacts to answer common AI PM interview questions with concrete examples.

| Interview question | Where to point |
| --- | --- |
| How do you decide if an AI feature is worth building? | Opportunity Brief + Healthcare Intake opportunity |
| How do you define quality for LLM output? | Eval Plan + Customer Support eval |
| How do you handle hallucination risk? | AI PRD risk table + Customer Support launch gate |
| How do you decide not to launch? | Launch Gates guide + Healthcare Intake launch gate |
| How do you operate after launch? | Observability Plan + Week-2 post-launch review |

Repo structure

ai-pm-playbook.md          # Full playbook: operating model, scoring, gates
templates/                  # 7 core PM artifacts plus 3 optional templates
docs/                       # 12 reference guides (including walkthrough)
examples/                   # 3 scored case studies, plus one post-launch review example
schema/                     # JSON schema for readiness assessments

Companion framework

GRIT covers the engineering side: how AI-assisted code gets specified, tested, and reviewed. This playbook covers the product side: what gets built, why, and when it is ready.

License

MIT
