TreasuryBench
The smartest AI money advisor, measured.
We put Treasury up against the leading money apps on 81 real financial questions — then had an independent AI judge score every answer. Here’s how it did, and what that means for you.
TreasuryBench results
🏆 Treasury leads on everything that makes advice worth trusting.
| | Treasury | GPT-5.5 | Origin | Monarch |
|---|---|---|---|---|
| Grounding: uses your real numbers | 91 | N/A | 76 | 51 |
| Correctness: gets the facts right | 86 | 83 | 75 | 57 |
| Resolution: actually answers it | 83 | 81 | 68 | 48 |
| Prudence: won’t steer you wrong | 90 | 82 | 75 | 59 |
| Overall: TreasuryBench score | 86 | 80 | 71 | 52 |
GPT-5.5 was tested with full context — your entire financial picture placed directly in its prompt, an advantage no chatbot has in everyday use (and why its grounding is left unscored).
Head to head, on your kind of questions.
Same questions, same financial life. Treasury didn’t edge it out — it pulled away.
The top score, without the wait.
A brilliant answer is useless if you’ve given up waiting for it. Treasury lands the highest score in seconds — while Monarch averaged over a minute and a half.
What the score is made of
A score only matters if you know what it buys you.
Transaction Intelligence
It knows where your money actually went.
Ask where your money went and Treasury breaks it down by merchant, category, and trend — and gets it right, so you can act on the answer.
Savings & Expense Reduction
It finds the money you’re wasting.
Unused subscriptions, cheaper alternatives, cash sitting idle — Treasury catches what the others walk right past. Monarch barely registered here.
Retirement & Tax-Advantaged Accounts
It gets retirement and taxes right.
HSA, 401(k), backdoor Roth, current contribution limits — the high-stakes moves where a wrong number is genuinely expensive.
Life Planning & Major Decisions
It’s a real advisor for the big calls.
“Can I afford this? Should I do it?” Treasury reasons through the trade-offs against your actual finances — not a generic rule of thumb.
Employer Benefits & Perks
It knows the perks you’re not using.
Corporate discounts, benefit matches, FSA/HSA programs — Treasury surfaces the money your employer already offers that you’re leaving on the table.
Safe to act on
Confident is easy. Right is hard.
An AI that sounds certain but gets your money wrong is worse than none at all — you’d act on it. We flagged every answer that could actually cost you: a wrong contribution limit, a stale tax figure, a bad payoff order. Treasury gave one in 81. GPT-5.5, even handed all your data, gave twelve.
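To put those counts on the same scale as the scores, here is a minimal sketch that turns them into rates. It only uses the two counts and the 81-question total quoted above; the names `flagged` and `dangerous_rate` are illustrative, not part of TreasuryBench.

```python
# Counts of financially dangerous answers quoted in the text above.
TOTAL_QUESTIONS = 81
flagged = {"Treasury": 1, "GPT-5.5": 12}


def dangerous_rate(count, total=TOTAL_QUESTIONS):
    """Return the share of flagged answers as a percentage, to one decimal."""
    return round(100 * count / total, 1)


rates = {name: dangerous_rate(count) for name, count in flagged.items()}

for name, rate in rates.items():
    print(f"{name}: {flagged[name]} of {TOTAL_QUESTIONS} ({rate}%)")
```

That works out to roughly 1.2% of answers for Treasury and 14.8% for GPT-5.5.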
Financially dangerous answers, out of 81
Every category, one table.
Highest overall, even with GPT-5.5 handed your full financial data up front.
| | Treasury | GPT-5.5 | Origin | Monarch |
|---|---|---|---|---|
| Transaction Intelligence: where your money went | 92 | 89 | 82 | 64 |
| Life Planning & Major Decisions: big-decision guidance | 90 | 71 | 79 | 51 |
| Housing & Rent: rent, buy & move | 89 | 91 | 80 | 39 |
| Insurance & Risk Protection: coverage gaps | 89 | 90 | 79 | 73 |
| Cashflow & Budgeting: income vs. spending | 87 | 89 | 67 | 60 |
| Employer Benefits & Perks: workplace perks | 87 | 76 | 74 | 54 |
| Retirement & Tax-Advantaged: 401(k), HSA & Roth | 87 | 71 | 62 | 44 |
| Tax Strategy: deductions & timing | 85 | 73 | 74 | 58 |
| Debt & Credit Health: payoff & credit score | 84 | 96 | 80 | 81 |
| Investing & Equity Comp: portfolio & RSUs | 82 | 78 | 60 | 65 |
| Credit Cards & Rewards: cards & points | 80 | 75 | 66 | 29 |
| Savings & Expense Reduction: cutting waste | 77 | 70 | 58 | 25 |
| Overall | 86 | 80 | 71 | 52 |
See it for yourself
Same question. Three different answers.
Same accounts, same data — every app got the same three everyday questions. Here’s how the answers compared.
Asked by Maria, who works at Microsoft
“Are any of my subscriptions worth changing?”
As a Microsoft employee, you likely qualify for the Stay Fit reimbursement, up to $1,500/year — submitting your ClassPass receipts could make this $89/mo membership effectively free.
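The "effectively free" claim comes down to simple arithmetic, sketched below. It assumes every month's receipt is eligible and that the $1,500 allowance is not already used on other expenses; neither is stated in the answer, and the variable names are illustrative.

```python
# Figures quoted in the answer above.
MONTHLY_MEMBERSHIP = 89   # ClassPass, dollars per month
ANNUAL_CAP = 1500         # reimbursement ceiling, dollars per year

# Assumption: all twelve monthly receipts are eligible and the cap is untouched.
annual_cost = MONTHLY_MEMBERSHIP * 12
reimbursed = min(annual_cost, ANNUAL_CAP)
out_of_pocket = annual_cost - reimbursed

print(f"Annual cost: ${annual_cost}, reimbursed: ${reimbursed}, "
      f"out of pocket: ${out_of_pocket}")
```

A year of the membership costs $1,068, which fits under the $1,500 ceiling with room to spare.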
Asked by Priya, a parent with a toddler in daycare
“Can I earn rewards on daycare without losing it to fees?”
Stop chasing credit-card points — look at tax arbitrage. On your $1,650/mo Bright Horizons bill, a 2.5–3% card fee runs $41–$50/mo — more than the $33 you’d earn on a 2% cashback card.
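The numbers in that answer can be checked with a few lines of arithmetic. This is a minimal sketch using only the bill, fee range, and cashback rate quoted above; the helper `monthly_amount` and the variable names are illustrative.

```python
# Figures quoted in the answer above.
MONTHLY_BILL = 1650.00        # daycare bill, dollars per month
CASHBACK_RATE = 0.02          # 2% cashback card
FEE_RATES = (0.025, 0.03)     # 2.5% to 3% card processing fee


def monthly_amount(bill, rate):
    """Return bill * rate rounded to cents."""
    return round(bill * rate, 2)


cashback = monthly_amount(MONTHLY_BILL, CASHBACK_RATE)
fee_low, fee_high = (monthly_amount(MONTHLY_BILL, r) for r in FEE_RATES)

# Net result of paying by card: rewards earned minus the fee charged.
net_best_case = round(cashback - fee_low, 2)
net_worst_case = round(cashback - fee_high, 2)

print(f"Cashback: ${cashback:.2f}/mo")
print(f"Card fee: ${fee_low:.2f} to ${fee_high:.2f}/mo")
print(f"Net: ${net_best_case:.2f} to ${net_worst_case:.2f}/mo")
```

Even at the low end of the fee range, paying by card loses about $8 a month; at the high end it loses about $16.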
Asked by Jordan, a freelance UX consultant
“Which expenses should I deduct this month?”
From your actual spending: Adobe Creative Cloud ($59.99), Figma ($15) and Google Workspace ($14.40) are 100% deductible; your AWS and Office Depot bills are standard operating costs.
Verbatim from the TreasuryBench run, trimmed only for length. Scores are the independent judge’s.
Put the highest score in your pocket.
Numbers are nice. Try the advisor that earned them — on your own money.
Start my free trial