Klaviyo & Email | January 22, 2026

Klarna Replaced 700 Customer Service Agents With AI. Then They Rehired the Humans. The DTC Lesson Is Brutal.

Klarna's AI handled 2.3M chats and saved $40M. Then quality cratered, customers revolted, and the CEO admitted 'we went too far.' Here's the actual playbook for DTC brands deploying AI in customer service — without making the same mistake.

Mark Cijo

Founder, GOSH Digital

Klarna ran the most public AI customer service experiment in fintech.

Then it quietly reversed it.

If you're a DTC brand thinking about replacing your support team with Klaviyo Customer Agent, or any of the other AI customer service tools that shipped in the last 18 months — read this before you fire anyone.

What Klarna actually did

In early 2024, Klarna's AI assistant handled 2.3 million conversations in its first month — the work of roughly 700 full-time support agents. The numbers Klarna's CEO Sebastian Siemiatkowski put out at the time looked like the future:

  • Resolution time: under 2 minutes vs. 11 minutes for human agents
  • Volume: two-thirds of all customer service chats handled by the bot
  • Cost saved: $40 million projected annual profit improvement
  • Headcount impact: equivalent of 700 jobs absorbed

For about 18 months, every fintech CEO and DTC founder I talked to wanted to know, "How do we do what Klarna did?" That was the headline story.

What everyone missed

The Klarna AI assistant was very good at handling routine, single-turn inquiries. "Where's my order?" "Can I change my address?" "How do I update my payment method?" These are the high-volume, low-complexity questions that make up roughly 60–70% of any DTC brand's support ticket queue.

For that 60–70%, Klarna's AI was a genuine win.

For the other 30–40%? The bot collapsed.

Multi-step problem resolution. Emotionally charged interactions. Complex edge cases. Customers in genuine distress about a missed payment or a charge they didn't recognize. The AI handled the volume but not the complexity. Customer satisfaction scores dropped, Trustpilot reviews turned negative, and Klarna's brand reputation took a real hit during the run-up to their US IPO.

By mid-2025, Klarna started hiring human agents back. Siemiatkowski admitted publicly: "We went too far."

The framing matters. He didn't say AI failed. He said they pushed it past its actual capability and the customers paid the price.

The lesson for DTC brands isn't "don't use AI"

It's "deploy AI in the work it's actually good at, and stop trying to make it do the work humans are still better at."

We're a Klaviyo Gold Partner — we run AI inside email and SMS programs at scale for clients. We're modelling Klaviyo's new Customer Agent for DTC brands right now. Here's the framework we use:

Tier 1 — Hand to AI completely

These are the tickets where the answer exists in your knowledge base and the conversation almost always ends after one or two turns.

  • "Where's my order?" (with tracking integration)
  • "What's your return policy?"
  • "Do you ship to my country?"
  • "How do I track my subscription delivery?"
  • "Can I update my address before this ships?"

AI handles these at 95%+ accuracy, in seconds, 24/7. Your team should not be touching these in 2026. If they are, you're burning labor you can't justify.

Tier 2 — AI drafts, human reviews

The grey zone. Questions where AI can compose a competent response but needs human verification before it goes out.

  • Order issue requiring a refund or partial credit
  • Product-fit question requiring brand judgement
  • Returns where the customer is borderline frustrated
  • Loyalty / VIP questions that touch retention strategy

AI assist plus human-in-the-loop is roughly 3x faster than human-only for this tier and roughly 4x more accurate than AI-only. This is the sweet spot most brands aren't yet exploiting.

Tier 3 — Humans only, always

This is the work Klarna got wrong by pushing AI into it.

  • Customer in distress about a failed payment
  • Wrongful charge accusation
  • Brand crisis / public complaint
  • Anything involving safety, fraud, or legal exposure
  • High-LTV VIP customer with a complex multi-step issue

Replace these with AI and you don't save money. You destroy trust that took years to build. Klarna did. Don't repeat it.
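To make the three tiers concrete, here is a minimal routing sketch in Python. It is an illustration under stated assumptions, not how Klaviyo Customer Agent or any other tool works internally: the topic tags, the `Ticket` fields, and the `route` function are all hypothetical names, and the fallback for unrecognised topics is our own assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    AI_ONLY = 1         # Tier 1: hand to AI completely
    AI_DRAFT_HUMAN = 2  # Tier 2: AI drafts, human reviews
    HUMAN_ONLY = 3      # Tier 3: humans only, always


# Hypothetical topic tags; a real helpdesk would supply its own taxonomy.
TIER_3_TOPICS = {"failed_payment", "wrongful_charge", "public_complaint",
                 "safety", "fraud", "legal"}
TIER_2_TOPICS = {"refund", "partial_credit", "product_fit",
                 "frustrated_return", "loyalty"}
TIER_1_TOPICS = {"order_status", "return_policy", "shipping_countries",
                 "subscription_tracking", "address_update"}


@dataclass
class Ticket:
    topic: str
    is_vip: bool = False      # high-LTV customer
    multi_step: bool = False  # issue needs several dependent steps


def route(ticket: Ticket) -> Tier:
    # Check the riskiest conditions first so nothing sensitive falls through to AI.
    if ticket.topic in TIER_3_TOPICS or (ticket.is_vip and ticket.multi_step):
        return Tier.HUMAN_ONLY
    if ticket.topic in TIER_2_TOPICS:
        return Tier.AI_DRAFT_HUMAN
    if ticket.topic in TIER_1_TOPICS:
        return Tier.AI_ONLY
    # Assumption: unknown topics get a human-reviewed draft rather than full automation.
    return Tier.AI_DRAFT_HUMAN
```

The ordering is the design choice that matters. Tier 3 is checked before anything else, so a VIP with a complex multi-step issue goes to a human even when the topic itself looks routine.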

How this connects to your email and SMS program

Here's the part nobody connects, and it's the part that matters most for DTC.

The brands that ran flow-based retention programs before AI customer service existed already understood Tier 1 / Tier 2 / Tier 3 thinking. They just applied it to email instead of support. We covered the same pattern in our Phoenix case study, where we built 15–20 automated flows handling the 80% of subscriber interactions that follow predictable triggers (browse, cart, post-purchase, replenishment), while reserving campaigns, founder voice, and human strategy for the 20% that don't.

That's the same model.

Email & SMS marketing at scale only works when you automate the right 80% and humanise the right 20%. Customer service at scale only works the same way. The brands trying to AI-everything are repeating Klarna's mistake. The brands trying to human-everything are bleeding cash in support overhead they don't need.

The play is to identify the tier, deploy the right intervention, and measure rigorously. It's not glamorous. It's just how it works.

What we'd do this quarter for your brand

If you're running on Klaviyo and considering Customer Agent (or any of the other AI customer service tools — Gorgias AI, Tidio, Intercom Fin, etc.), here's the rollout plan we'd run:

  1. Audit your last 1,000 tickets. Tag them by tier. You'll find roughly 60% Tier 1, 25% Tier 2, 15% Tier 3 for a typical Shopify brand. That's the math that drives every decision after.
  2. Deploy AI on Tier 1 only for 30 days. Measure resolution rate, customer satisfaction, escalation rate. Compare to your historical baseline.
  3. Expand to Tier 2 with human review for the next 30 days. Track time-to-resolution improvement and accuracy. If your AI's draft accuracy is below 80%, it's not ready for this tier yet.
  4. Never let AI touch Tier 3 alone. Build clear escalation rules and make them visible to the customer.
  5. Report on cost saved per ticket AND customer satisfaction AND escalation rate. Klarna saved cost and tanked satisfaction. That's not a win, that's a brand liability.
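The checks in steps 1, 3, and 5 can be sketched in a few lines of Python. Treat this as a rough illustration under our own assumptions: the function names and dictionary keys are hypothetical, "draft accuracy" is assumed to mean the share of AI drafts a reviewer approved unchanged, and the rule that cost must fall while CSAT and escalation rate hold steady is one reasonable reading of step 5, not a standard.

```python
from collections import Counter


def tier_mix(tier_tags):
    """Step 1: return each tier's share of the audited tickets as a fraction."""
    counts = Counter(tier_tags)
    total = len(tier_tags)
    return {tier: counts[tier] / total for tier in sorted(counts)}


def tier2_ready(drafts_approved_unchanged, drafts_reviewed, threshold=0.80):
    """Step 3: draft accuracy must reach the threshold before expanding to Tier 2."""
    if drafts_reviewed == 0:
        return False  # no review data yet, so not ready
    return drafts_approved_unchanged / drafts_reviewed >= threshold


def is_win(baseline, current):
    """Step 5: cost per ticket must fall without CSAT dropping or escalations rising."""
    return (
        current["cost_per_ticket"] < baseline["cost_per_ticket"]
        and current["csat"] >= baseline["csat"]
        and current["escalation_rate"] <= baseline["escalation_rate"]
    )


# Example audit of 1,000 tagged tickets matching the typical split above.
tags = [1] * 600 + [2] * 250 + [3] * 150
print(tier_mix(tags))  # {1: 0.6, 2: 0.25, 3: 0.15}
```

A rollout that cuts cost per ticket while CSAT falls returns `False` from `is_win`, which is the Klarna pattern this plan is meant to catch early.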

We use a similar framework to scope AI automation projects for clients. The cost of getting it wrong isn't the price of the tool. It's the price of the customer trust you lose.

The honest summary

AI customer service works. It works brilliantly for the routine 60% of tickets. It works well for the next 25% with human oversight. And it fails badly for the final 15% — exactly the tickets that determine whether your brand reputation survives the next 12 months.

Klarna learned this in public. The projected $40M in savings came at the cost of customer trust, and they are now paying to rebuild it by rehiring humans. You don't have to repeat their experiment to learn the lesson.

If you're rolling out AI customer service in 2026 and want a tiered framework that doesn't blow up your CSAT in month two — we can help. Get a free audit of your current support + retention stack and we'll map the right tier for each part of your customer journey.

We're a Klaviyo Gold Partner agency with 150+ DTC clients on the platform. The brands that grew during the AI transition were the ones that thought about tier discipline before they thought about cost savings. If It Barks, for example, tripled email revenue by treating emotional customer connection as a Tier 3 problem and routine engagement as a Tier 1 problem.

Same logic. Different channel.

Written by Mark Cijo

Founder of GOSH Digital. Klaviyo Gold Partner. Helping eCommerce brands grow revenue through data-driven marketing.
