Walmart, the world’s largest retailer, ran an experiment that should give every company chasing AI integration a reason to pause. The company tested a ChatGPT-powered checkout experience — and it performed worse than the standard flow it was supposed to improve.
Not marginally worse. Meaningfully worse.
The revelation came from Meredith Wollman, Walmart’s senior product manager, who shared the results at a recent industry event covered by Search Engine Land. The AI-assisted checkout was designed to help customers complete purchases more efficiently by answering questions, reducing friction, and guiding them through the buying process. Instead, it introduced confusion, added steps, and drove down conversion rates. Customers who encountered the ChatGPT-powered experience were less likely to finish their purchases than those who used the existing, familiar checkout process.
The failure is instructive — not because AI doesn’t work, but because the application was wrong.
When Intelligence Becomes an Obstacle
Walmart’s checkout test highlights a tension that’s becoming increasingly common across retail and e-commerce: the gap between what AI can do and what customers actually want it to do. Checkout is a moment of commitment. The customer has already made their decision. They’ve browsed, compared, added to cart. What they need at that point is speed and simplicity. What they got was a conversational AI interface that, by its very nature, introduced new decision points and interaction layers into a process that should have fewer of both.
Think about it this way. You’re standing in a checkout line at a physical Walmart store, credit card in hand. A helpful employee steps in front of you and starts asking questions. Can I help you find anything else? Did you know about this promotion? Would you like to hear about our return policy? Most people would find that annoying, not helpful. That’s essentially what happened in the digital version.
Wollman’s candor about the results is notable. Large companies rarely discuss failed experiments publicly, especially ones involving high-profile technology partnerships. OpenAI’s ChatGPT has become the default reference point for generative AI in enterprise applications, and admitting that it hurt performance in a core business metric like checkout conversion takes a certain organizational honesty.
But Walmart didn’t abandon AI altogether. Far from it.
The company redirected its efforts toward areas where conversational AI actually adds value: earlier in the shopping funnel, where customers are still exploring options, comparing products, and seeking information. That’s where an AI assistant can reduce friction rather than create it. Helping someone choose between two similar televisions is a fundamentally different task from processing a payment, and the technology’s strengths align much better with the former.
The Broader Lesson for Retail Tech
Walmart’s experience isn’t happening in isolation. Across the retail industry, companies are grappling with where AI fits and where it doesn’t. The pressure to integrate generative AI is enormous — from boards, investors, and the technology vendors themselves. But the Walmart checkout case is a reminder that deployment without clear customer benefit is just expensive experimentation.
Amazon has been relatively cautious with generative AI in its core shopping experience, preferring to use machine learning models that have been refined over years for product recommendations and search ranking. Google, meanwhile, has been pushing AI-generated shopping summaries through its Search Generative Experience, though early reports suggest mixed results in terms of click-through rates to merchant sites. The common thread: AI works best when it’s invisible, operating behind the scenes to improve outcomes rather than inserting itself as a visible new step in an established workflow.
There’s a growing body of evidence that consumers don’t want to chat with AI during transactional moments. A 2024 survey from Salesforce found that while consumers are increasingly comfortable with AI-powered product recommendations, their tolerance drops sharply when AI is introduced into payment and checkout flows. Trust is the issue. People don’t yet trust AI with their money the way they might trust it with a product suggestion.
And trust, once lost at checkout, is expensive to rebuild.
Walmart’s test also raises questions about measurement. Conversion rate is the obvious metric, and it told a clear story here. But what about customer satisfaction scores? Return rates? Repeat purchase behavior? Wollman didn’t share those numbers publicly, but they matter. A checkout experience that converts at a slightly lower rate but produces higher satisfaction and fewer returns might still be a net positive. The fact that Walmart pulled back suggests the broader data picture wasn’t favorable either, but it’s worth considering the full range of metrics before declaring any AI experiment a failure.
What Walmart got right was the willingness to test, measure, and act on the results. Too many companies are deploying AI in production environments without rigorous A/B testing, letting enthusiasm for the technology override evidence about its impact. Walmart ran the test, saw the numbers, and made a decision. That’s how it should work.
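For readers who want to see what "ran the test, saw the numbers" can look like in practice, here is a minimal sketch of one common way to compare two checkout flows: a two-proportion z-test on conversion rates. It is not Walmart's method, which was not disclosed. The function name and every figure in it are hypothetical, chosen only to illustrate the arithmetic.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare conversion rates of flow A (control) and flow B (variant).

    Returns (rate_a, rate_b, z, p_value), where p_value is two-sided.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both flows convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF via the error function, then a two-sided p-value.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return rate_a, rate_b, z, p_value


# Hypothetical figures, NOT Walmart's data: (completed purchases, sessions).
control = (1200, 10000)  # existing checkout flow
variant = (1080, 10000)  # AI-assisted checkout flow

rate_a, rate_b, z, p_value = two_proportion_z_test(*control, *variant)
print(f"control: {rate_a:.1%}  variant: {rate_b:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

A negative z with a small p-value would indicate that the variant's lower conversion rate is unlikely to be noise. A real evaluation would also set the sample size in advance and look at the other metrics mentioned above, such as satisfaction and return rates.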
The temptation for retailers right now is to sprinkle AI across every customer touchpoint. Product pages. Search. Checkout. Customer service. Returns. The Walmart case suggests a more disciplined approach: identify the specific moments where AI solves a real customer problem, test rigorously, and be willing to pull back when the data says no.
Not every problem needs a chatbot. Some problems just need a faster page load.
What Comes Next
Walmart continues to invest heavily in AI across its operations. The company has been building out AI capabilities in supply chain management, inventory optimization, and employee scheduling — areas where the technology’s ability to process vast amounts of data and identify patterns translates directly into cost savings and efficiency gains. These back-end applications don’t face the same trust barriers as customer-facing ones, and the ROI is often clearer and faster to materialize.
On the customer-facing side, Walmart has been testing AI-powered search improvements and product discovery tools that show more promise than the checkout experiment. The company’s size gives it an advantage here: with hundreds of millions of transactions to learn from, its AI models can be trained on a dataset that few competitors can match. The key is applying that advantage in the right places.
For the broader retail industry, Walmart’s checkout failure should serve as a case study in restraint. The companies that will win with AI aren’t necessarily the ones that deploy it most aggressively. They’re the ones that deploy it most thoughtfully — matching the technology’s capabilities to genuine customer needs rather than forcing it into workflows where it doesn’t belong.
Sometimes the smartest thing you can do with a powerful new tool is put it down.
As someone who’s watched technology hype cycles come and go — from the early days of tinkering with computers in the Midwest to covering enterprise tech today — I’ve seen this pattern before. The technology that endures is never the one that’s most visible. It’s the one that works so well you forget it’s there. Walmart learned that lesson the hard way with checkout. The question is whether the rest of the industry is paying attention.


WebProNews is an iEntry Publication