Coles Group is using generative AI to distil about 40,000 comments from customers each week into a handful of key themes and insights that can be actioned at a store level.
(L-R) Silvio Giorgio and Richard Walker from Coles.
The comments are taken from a weekly survey sent to a sample of customers after they interact with the retailer.
It typically nets about 20,000 responses, which include numerical scores that get converted into a single net promoter score (NPS) measure, as well as unstructured comments – “customer verbatims” in marketing speak – about the customer’s individual experience.
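Coles did not detail how it converts survey scores into NPS. As a rough illustration only, the sketch below uses the conventional definition: on a 0-10 scale, the percentage of promoters (9 or 10) minus the percentage of detractors (0 to 6). The function name and sample scores are made up for this example.

```python
def net_promoter_score(scores):
    """Return NPS in percentage points (-100 to 100) using the conventional cut-offs."""
    if not scores:
        raise ValueError("need at least one score")
    promoters = sum(1 for s in scores if s >= 9)   # scores of 9 or 10
    detractors = sum(1 for s in scores if s <= 6)  # scores of 0 to 6
    return 100.0 * (promoters - detractors) / len(scores)


# Hypothetical survey scores, not Coles data.
sample_scores = [10, 9, 8, 7, 6, 3, 10, 9]
print(net_promoter_score(sample_scores))  # 4 promoters, 2 detractors, 8 responses
```

A single figure like this hides what sits behind each score, which is the gap the comments are meant to fill.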
While NPS had become a “common currency” internally at Coles as a measure of customer satisfaction, it often did not provide a full or accurate picture.
“The challenge is that customers give us a score, and we think all is good, but if you look at the comments, actually the world isn’t wonderful,” data and intelligence general manager Silvio Giorgio told a Microsoft AI Summit in Melbourne.
“The richness is in the commentary.”
A comment might note, for example, that the carpark was busy and that the store had run out of an item the customer wanted.
But operations and sustainability general manager Richard Walker noted that the retailer typically had a hard time finding actionable insights from within the comments.
“By its inherent nature and complexity, it was always incredibly difficult to do anything with that [data] quickly,” he said.
“You’d need an army of people on a Monday morning to try and discern anything meaningful from the feedback, so we didn’t.
“When we did decide to try it, we went down the traditional route of word clouds and we invested three-to-four months in there, [but] whilst they look great and it was a different lens on performance for the week, it didn’t give us anything that was particularly actionable.”
The job is now a production use case for generative AI, which Coles runs out of its Azure cloud tenancy.
“This is a wonderful use case for GenAI,” Giorgio said.
“[We] feed all the customer verbatims and the contextual scores into a GenAI engine, and the idea is to have the GenAI summarise the top three things that each of the stores should focus on.”
That is a slight over-simplification of the process – or at least belies the effort behind-the-scenes in training the GenAI model in the operational structure and workings of Coles.
“Putting something into a GenAI engine and waiting for a response [involves] the quickest amount of time, effort and energy, [but] what we found was we were getting these summaries back that didn’t make a lot of sense,” Giorgio said.
“The challenge with … GenAI is that it’s like a small child. You don’t know exactly what they’re going to say and sometimes it lacks a lot of context and isn’t appropriate at the time, and there’s a little bit of that.
“So, it’s not as simple as feeding data into a GenAI engine. You actually have to train it and architect the data in an appropriate way, so [it knows] the component parts of the different elements of the supermarket: what a store and product hierarchy is, [that] not all stores have the same products, [and] how teams are structured.
“That teaching is actually in the structure of the data, and that actually takes the most amount of time. I think we’ve broken the back of that to a large degree, but there’s still more to go.”
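Coles did not describe its data model or prompts, so the following is only a sketch of the general idea Giorgio outlines: attaching store context to the feedback before it reaches the model. Every name here (`Store`, `Response`, `build_store_prompts`, the store ID, the departments) is hypothetical, and no GenAI service is called; the code only assembles the text that would be sent.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Store:
    store_id: str
    store_type: str
    departments: tuple  # not all stores carry the same departments


@dataclass
class Response:
    store_id: str
    score: int
    comment: str


def build_store_prompts(stores, responses):
    """Group responses by store and prepend that store's context to each prompt."""
    by_store = defaultdict(list)
    for r in responses:
        by_store[r.store_id].append(r)

    prompts = {}
    for store in stores:
        rows = by_store.get(store.store_id, [])
        if not rows:
            continue  # no feedback for this store this week
        lines = [
            f"Store {store.store_id} ({store.store_type}).",
            "Departments in this store: " + ", ".join(store.departments) + ".",
            "Summarise the top three things this store should focus on.",
            "Customer feedback (score, comment):",
        ]
        lines += [f"- {r.score}: {r.comment}" for r in rows]
        prompts[store.store_id] = "\n".join(lines)
    return prompts


# Hypothetical example data.
stores = [
    Store("S001", "metro", ("bakery", "deli")),
    Store("S002", "regional", ("bakery",)),
]
responses = [
    Response("S001", 9, "Carpark was busy and the store was out of my usual bread."),
    Response("S001", 4, "Only two checkouts open at 5pm."),
]
prompts = build_store_prompts(stores, responses)
print(prompts["S001"])
```

Keeping the context in structured fields rather than free text means the same records can later be filtered by store type or department without re-parsing the comments.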
For now, the scores and comments from the customer surveys are uploaded manually to the GenAI service, but this will move to an automated process down the track.
The goal is to be alerted to patterns in customer feedback that can be easily fixed by tweaking operational settings such as staff rostering.
“What we’re working with Richard on, and Richard is advocating, is if we get some negative feedback around customer service, say the number of checkouts that are open, how do we marry that up to the roster of how many people we had on at that time so we can determine well actually does that mean we need more people, and is the feedback consistent enough that we do need to change the roster?” Giorgio said.
“That’s the reasonable example of how we need to evolve this type of technology.
“It’s exciting but there’s a lot of ‘grunt work’ that needs to go into it, although I expect that in the future it will become easier.”
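The roster comparison Giorgio describes is still being worked on, and no implementation details were given. One way to picture it, under assumed field names and an arbitrary threshold, is to count checkout complaints per store and hour, then look up how many staff were rostered in that slot. The topic label `checkout_wait`, the record layout and the `min_complaints` cut-off are all illustrative assumptions.

```python
from collections import Counter


def flag_roster_gaps(feedback, roster, min_complaints=3):
    """Return (store_id, hour, complaints, staff_on) for slots with repeated checkout complaints."""
    complaints = Counter(
        (f["store_id"], f["hour"])
        for f in feedback
        if f["topic"] == "checkout_wait"
    )
    flagged = []
    for (store_id, hour), count in sorted(complaints.items()):
        if count >= min_complaints:  # only act on feedback that is consistent
            staff_on = roster.get((store_id, hour))  # None if no roster record exists
            flagged.append((store_id, hour, count, staff_on))
    return flagged


# Hypothetical feedback already tagged with a topic and the hour of the visit.
feedback = [
    {"store_id": "S001", "hour": 17, "topic": "checkout_wait"},
    {"store_id": "S001", "hour": 17, "topic": "checkout_wait"},
    {"store_id": "S001", "hour": 17, "topic": "checkout_wait"},
    {"store_id": "S001", "hour": 10, "topic": "checkout_wait"},
    {"store_id": "S002", "hour": 17, "topic": "parking"},
]
# Hypothetical roster: staff on checkouts per (store, hour).
roster = {("S001", 17): 2, ("S001", 10): 4}

print(flag_roster_gaps(feedback, roster))
```

The threshold stands in for Giorgio's question of whether the feedback is "consistent enough" to justify a roster change; a one-off complaint is not flagged.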
Walker said applying GenAI to the problem was a “game-changer”, not just because it solved an otherwise personnel-intensive analysis challenge, but also because it exposed the full richness of data that was being collected but not fully acted upon.
“We physically could not do what we’re now doing. We [would have] needed an army of people to digest and synthesise the comments,” he said.
“We’ve taken these 40,000 comments each week, and for those that we can gain utility from, we are inferring about twice as many pieces of sentiment.
“So, in a whole sentence you can infer two-to-three different pieces of sentiment around the shopping experience.”
The comments are then classified according to “30-to-40 categories of sentiment” to help identify which parts of the operation need the most attention.
“What we can then do is … see what customers talk about most, so you can see the relative participation of that sentiment topic within a given week, and you can then trend that sentiment over time to see an amplification based on things that we do,” Walker said.
“Where it gets really amazing is you can then overlay attributes outside of the sentiment in terms of store type, location, hour of day.
“You [can then] get very granular and focused outputs which from a finance perspective then allows the action that comes on the back of it to be hugely targeted.
“That ability to be much more focused is a game changer.”
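The "relative participation" Walker mentions can be read as each topic's share of all sentiment mentions within a group, whether that group is a week, a store type or an hour of day. The sketch below assumes each comment has already been split into one or more tagged mentions; the field names, topics and week labels are invented for illustration and are not Coles' categories.

```python
from collections import Counter, defaultdict


def topic_share(mentions, key):
    """For each group (week, store type, ...), return each topic's share of mentions."""
    counts = defaultdict(Counter)
    for m in mentions:
        counts[m[key]][m["topic"]] += 1
    return {
        group: {topic: n / sum(c.values()) for topic, n in c.items()}
        for group, c in counts.items()
    }


# Hypothetical mentions; one comment can produce more than one mention.
mentions = [
    {"week": "W1", "store_type": "metro", "topic": "parking"},
    {"week": "W1", "store_type": "metro", "topic": "availability"},
    {"week": "W1", "store_type": "regional", "topic": "availability"},
    {"week": "W1", "store_type": "regional", "topic": "availability"},
    {"week": "W2", "store_type": "metro", "topic": "parking"},
    {"week": "W2", "store_type": "regional", "topic": "availability"},
]

by_week = topic_share(mentions, "week")        # trend a topic's share over time
by_type = topic_share(mentions, "store_type")  # overlay a store attribute
print(by_week)
print(by_type)
```

Because the grouping key is a parameter, the same function covers both the week-on-week trend and the overlays by store type, location or hour that Walker describes.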