1.Data Prep Checklist
Get your data ready in just a few minutes. Four files from your existing systems.
Pick the Right AI Model
Use a thinking/reasoning model — the kind that pauses before answering.
- •ChatGPT: Select "Thinking" mode
- •Microsoft Copilot: GPT 5.5 → "Think Deeper."
- •Google Gemini: Pro.
- •Claude Pro: Select Opus 4.6 or 4.7
Warning: Do not use a fast model. Fast models fabricate evidence — they'll invent quotes from project notes that don't exist and assign root causes without checking the change order log.
Your Four Files
Job Cost Summary
The spine of the analysis. Without it, nothing works.
Export closed jobs from your ERP as a CSV file (not .xlsx).
Maximum 25 jobs per run. The AI can only reason accurately across 25 jobs at a time — beyond that, it starts conflating job IDs and losing change-order linkage. Your quarter probably has 100-200+ closed jobs, so you need to pick 25 before uploading.
How you pick them is up to you. Some options:
- •Worst performers by dollar variance (recommended for your first run — see the formula below)
- •Jobs from a specific PM's book
- •Jobs from a specific branch or region
- •The most recent 25 closings
- •Any 25 you're curious about
The worst-performers approach gives you the highest-impact diagnosis. The other approaches are useful when you already know where to look.
Finding Your 25 Worst Performers
If you want to start with the jobs that leaked the most margin — which is what we do in the video — here's how to find them in Excel or Google Sheets:
- 1Open your full job cost export in a spreadsheet.
- 2Add a new column at the end. This formula calculates the dollar difference between what you expected to make and what you actually made on each job:
The formula:
= (actual_revenue - actual_total_cost) - (bid_revenue - bid_total_cost)Use your actual column names. For example, if your columns are in cells F2, H2, E2, and G2:
= (F2 - H2) - (E2 - G2)A negative number means the job lost money relative to the bid. The more negative, the worse the leak.
- 3Sort by that column, smallest first (largest negative numbers at the top).
- 4Keep the top 25 rows. Delete the rest.
- 5Delete the formula column — the AI recalculates variance from the raw bid and actual numbers.
- 6Save as CSV.
You now have your 25 biggest margin problems ready for diagnosis. Pull the estimates, change orders, and notes for just those 25 jobs.
| Priority | Field | Why it matters |
|---|---|---|
| Must have | Job ID | Join key across all four files |
| Must have | Bid revenue | Needed for margin calculation |
| Must have | Actual revenue | Needed for margin calculation |
| Must have | Bid total cost (or broken out) | Needed for margin calculation |
| Must have | Actual total cost (or broken out) | Needed for margin calculation |
| High | Project manager | Powers the PM pattern view |
| High | Job type or service line | Powers the job-type pattern view |
| High | Customer name | Powers the customer pattern view |
| Medium | Bid + actual labor hours | Distinguishes labor rate overrun from productivity miss |
| Medium | Bid + actual material cost | Identifies material price escalation |
| Medium | Bid + actual sub cost | Identifies sub overruns |
| Medium | Start date + close date | Shows schedule overruns |
| Nice to have | Branch / region / office | Powers the branch pattern view |
| Nice to have | Job name or description | Makes the output more readable |
If you can only export half the fields, get the "must have" and "high" columns. The analysis will still produce a ranked leakage table and root causes — you'll just lose some of the pattern breakdowns.
Original Estimates
One PDF per job from your estimating system. Should include scope of work, key assumptions, contingency, and bid totals. A single combined file works too. This is how the AI catches "estimating error" — when the bid assumptions were wrong from the start.
If your estimates aren't in a system, dig up whatever you have — spreadsheet, Word doc, scanned paper. The format matters less than having the original assumptions.
Change Order Log
A CSV with one row per change order. This is how the AI identifies recoverable margin — pending or unbilled COs that represent money still in play.
If your CO log is incomplete, the analysis loses its ability to identify recoverable margin and scope creep diagnoses get weaker. This is the file that turns the analysis from a post-mortem into an action list — without it, you can see what leaked but not what's still recoverable.
| Field | Why it matters |
|---|---|
| Job ID | Links back to cost file |
| CO number or description | Identifies the specific change |
| Dollar amount | Quantifies recoverable margin |
| Status | Determines recoverability (Signed / Pending / Unbilled / Rejected) |
Project Notes
A CSV from your field service platform or ERP. Project notes are where the real story lives. Without them, the AI can see that a job leaked but is weaker at explaining why.
If your notes live in email or texts, export what you can from the formal system. Even a few notes per job helps — the analysis flags lower confidence for jobs with thin records. The notes are what turn a variance number into an explanation your board can act on.
| Field | Why it matters |
|---|---|
| Job ID | Links back to cost file |
| Date | Establishes when things went wrong |
| Author role | Context for the observation |
| Note text | The narrative evidence |
| File | Format | Key fields |
|---|---|---|
| Job cost summary | CSV | Job ID, bid revenue, actual revenue, bid cost, actual cost |
| Original estimates | PDF (one per job) | Scope, assumptions, contingency, bid totals |
| Change order log | CSV | Job ID, CO amount, status |
| Project notes | CSV | Job ID, date, author, note text |
Max 25 jobs per run. CSV format for structured data. Use a thinking model.
2.Master Analysis Prompt
Copy this prompt and paste it into your AI chat before uploading any data. The prompt walks you through a step-by-step upload process — it will ask for each file in turn.
The prompt will walk you through uploading your data step by step — just paste it and follow along.
3.Pattern View Prompt
Run this after the master analysis finishes, in the same chat. Don't start a new session — the pattern view builds on the diagnosis the AI just produced.
The pattern view groups your results by PM, branch, customer, job type, and root cause — and tells you who to talk to first.
4.Go Deeper
The pattern view gave you the big picture. Now use the same chat to drill into specifics. The AI still has all four files and both analyses loaded — just keep asking.
Here are a few things to try:
Break down [PM name]'s jobs. What's happening in their book?
The pattern view showed which PM has the most leaked dollars. This gives you the full story — which jobs, which root causes, whether it's one bad job or systemic. Prep for a coaching conversation, not a blame conversation.
Show me every job where scope creep happened without a change order. What's the common thread?
Replace "scope creep" with whatever root cause dominated your results. The AI surfaces what those jobs share — same customer, same work type, same point in the lifecycle.
If I could only recover money from one job this week, which one and what exactly do I need to do?
Narrows to one recovery — the specific CO number, dollar amount, who to call, and what to say.
What would I need to change in my bidding process to prevent the top root cause from recurring?
Shifts from diagnosis to prevention — connects execution failures back to bid assumptions.
Watch For
If any of these show up in your results, that's a good reason to dig in. Pick an account and ask a follow-up question to learn more.
| If you see... | It probably means... | What to try |
|---|---|---|
| Evidence column says "no narrative evidence" on most jobs | Project notes didn't upload cleanly or are genuinely thin | Check the CSV uploaded. If notes are thin, the diagnosis is weaker — AI is being honest |
| Same root cause on every single leaking job | Model may be over-fitting to one pattern | Ask for specific evidence per job. If it can't show it, re-run on stronger model |
| Variance numbers don't match your source data | Model misread a column or confused bid with actual | Check CSV headers — rename ambiguous ones (e.g., "bid_revenue" not "revenue_1") |
| Recoverable column says "unknown" for everything | CO log didn't upload or job IDs don't match | Check the join key — IDs must match exactly across files |
| Quotes you can't find in your notes file | Model fabricated evidence | Switch to stronger reasoning model. Verify by opening the source, don't ask the AI |
| "No clear pattern" on every dimension | Jobs too diverse to cluster, or sample too small | Not a problem if 25 jobs span 8 PMs and 12 customers. Run a second batch by PM or type |
Every time you work with AI, ask one more question than you think you need to. That's the habit that compounds.
5.Next Steps
Act on What You Found
This week:
- •Chase the top 3 recoverable jobs. These are pending or unbilled change orders — real dollars waiting on paperwork.
- •Have one coaching conversation. The pattern view identified who to talk to and what about.
This month:
- •Address the top root cause pattern. If labor overrun on retrofit work is your #1 leak, that's a bidding conversation, not a field execution conversation.
Move to a Weekly Cadence
What to export each Monday:
- •Jobs closed in the prior week (typically 3-8 for a mid-market services business)
- •Their change orders, notes, and estimates — same four files, filtered to the new batch
What changes: Smaller batches (5-10 jobs), faster turnaround, recoverable column becomes more actionable because the CO window is still open.
Time commitment: 30-45 minutes per week once you have the export routine down. First time takes 60-90 minutes.
Track Whether It's Working
After 4-6 weekly runs, you should be able to answer:
- •Is total leakage dollars trending down?
- •Are the same root causes repeating? (If so, the process is broken, not the field execution)
- •Is recoverable margin getting captured?
- •Are specific PMs or job types improving?
When to Move Beyond This
This workflow has limits: 25 jobs per run, no trend memory across runs, 60-90 min of data prep, no action tracking. When you find yourself thinking "I wish this just ran automatically against my ERP every Monday" — that's the signal.