How may I help you?
Conversation Design in the time of COVID-19

“Your call may be recorded for quality assurance purposes.”
My fingers tap lightly on the keyboard. I’ve never been so nervous to speak to a customer service representative. After Hilary* introduces herself, I ask her,
“Hi Hilary. I’m not calling about a Nike product. I’m actually just really curious — how has your day to day changed since COVID?”
Silence.
“Oh!” nervous laughter. “Um, that’s the first time someone has asked me that. Well, I guess this is 3 months staying at home now. I’ve realized all the small things I took for granted…
Before, I’d tap my manager’s shoulder if I had a question. Now I have to chat her, and she’s usually on another call.”
I nod, recognizing the contact center is not just a physical space but serves as a place of support. I ask her, “Has it been busier than before?”
She laughs. “Oh, you don’t even want to know…”
“Really?” we laugh together.
As she speaks, her voice grows in surety. “The second we hang up I’m going straight to another call. The wait time right now is over 30 minutes… I used to be able to take breathers in between my calls, but now it’s back to back. But this has shown me how to type faster, how to enter things faster, and how to be more kind.”
Across the world, COVID has forced us all to readjust our lives without warning, and the response has been a testament to the creativity and adaptability of the human spirit. Corporations such as GM have shifted to manufacturing ventilators, and new industries such as contact tracing have cropped up. The industries bearing the brunt of the surge, including on-demand groceries, remote learning, and financial services, have seen support tickets increase by up to 133% since late February.

The faces of their frontline, the customer service reps, are caught between a growing demand and shrinking supply.
Incoming calls are off the charts. COVID has forced the closure of all call centers, but many customer service reps don’t have the computers or broadband internet access they need to take calls from their corporate system. And it’s not simply the volume of calls that has spiked; a recent study surveying 1 million customer-service calls found that the distribution of call types has dramatically shifted. In the two weeks following March 11, the percentage of calls scored as “difficult” doubled from 10% to 20%. These conversations centered on financial hardship (e.g. furloughs, insurance coverage, and mortgage payment extensions).
I remember Hilary, one of the 3 million service agents who can no longer take breathers in between calls. “Now, it’s back to back calls every day.”
If we categorize service calls, the two ends of the bell curve are:
- Easy calls. These may include “Where is my order?”, “Is the grocery store still open?”, “When is United closing its borders to Europe?”, or “Does In-n-Out do deliveries?” (they do not).
- Difficult calls. These are, “I can no longer afford my mortgage payments. What can I do?”
Easy calls include repeated common inquiries or programmatic tasks such as password resets (which make up 20–50% of service calls). The natural hypothesis here is: if we offloaded such tasks to a conversational voice agent, human agents would have time to breathe between calls. They’d have the bandwidth and energy to empathize with customers during these times of anxiety. Our collective wait time in the queue would diminish.
In fact, this is already happening. Many firms are turning toward bots to handle common inquiries. The business value is obvious.
But does greater efficiency equate to a better world?
In a time of monolithic uncertainty and an overwhelming sense of unease, are we further thinning the string of human contact by introducing bots?

To answer this question, imagine the attributes that fill you with annoyance when you interact with a non-human on the phone. They speak in long, monotonous increments. When you say, “hey, no, not that,” they ignore you, continuing with their long, monotonous increments. Rather than intuiting your needs, bots force you to (re)articulate your problem. Painstakingly. Slowly. And even then they might not understand. You fear that the system will not be able to solve your problem. That not only will your time be wasted, but so will your breath, when you have to repeat the same information to a human when you are, finally, transferred.
The reason we hate interacting with bots over the phone is not that they fail to provide human connection. It is because they can’t get the job done.
What we need is a greater focus on conversational design
There are many pieces involved in designing great conversational systems, including ASR/TTS technology, language models, intent and entity extraction, and dialogue managers, but the missing focus today is not the technology. It is a deep understanding of the English language and how humans talk.
To begin designing remotely human-centric AI voice experiences, we need to break down the structure of a conversation. To constrain the problem set, let us scope it to customer service conversations. A service conversation is centered around the fulfillment of a user’s goal. Three things need to be in place:
1. Conversational flow: The system’s first job is to identify the goal of the conversation and move towards resolution in as few turns as possible. Throughout the course of the conversation, the system must identify the type of dialog the user is presenting and respond with its adjacency pair. For example, a question warrants an answer; a greeting warrants a greeting; a complaint warrants an acknowledgment. When the system and user each respond with the appropriate pair, a sense of conversational flow arises. The challenge occurs when the dialog is polysemous or has layered meanings. For example, “It’s so hot in here” can be both a statement and an implied request to turn on the AC.

2. Conversational repair: Humans course-correct throughout a conversation so innately that it mostly goes unnoticed. For example, “I’m calling to check in on what my … whatchamacallit … debt-to-income ratio needs to be to qualify for this loan.” In this case, the user initiates a self-repair, avoiding trouble by filling in a forgotten word. This is in contrast to other-repair (“I think you probably meant…”). Conversations are a dance of constant repair in which both parties work together, often subconsciously, to handle any trouble that arises and move the conversation forward. Throughout a conversation, we ask questions, rephrase, and clarify, each serving as a checkpoint to gauge the need for repair. Nearly every day, we have conversations where the other person misunderstands us at some point. The particularly loquacious may keep talking before we have a chance to initiate repair. The longer the conversation goes without the repair, the more frustrated we grow. This is precisely why voice systems today leave us wailing, “Agent please!” Voice systems are not built with features that allow for user-initiated repair, which inevitably leads to conversations falling into troublesome paths. A natural dialog system that does not support conversational repair is asking for failure.
3. Conversational context: Think about a conversation you had with a service agent.
Me: “I’m calling to create a new checking account.”
Voice Bot: “Okay, what is your first and last name?”
Me: “Angela Kong.”
Voice Bot: “Okay, we found you in the system. Now, how can I help you?”
Me, internally: Umm I literally already told you?
To achieve the first and the second (conversational flow and repair), the system needs the third: it must be able to remember the dialog history beyond the current turn and the system’s own agenda. Otherwise, the conversation devolves into a desultory mess. Above I mentioned adjacency pairs: a question calls for an answer. But an added complexity is that the user can potentially introduce a different topic at any time, requiring the system to keep track of context and state.
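To make the three ideas above (flow, repair, and context) concrete, here is a minimal sketch in Python of a toy dialog manager. It is an illustration under loud assumptions, not how any production voice system works: the names `DialogState`, `respond`, `expected_response`, the keyword lists, and the canned replies are all hypothetical, and real systems would use trained intent and entity models rather than substring matching.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical adjacency pairs: the kind of turn the user makes
# and the kind of response it calls for.
ADJACENCY_PAIRS = {
    "question": "answer",
    "greeting": "greeting",
    "complaint": "acknowledgment",
}

# Toy stand-ins for real intent detection (assumptions for illustration only).
REPAIR_MARKERS = ("not that", "i meant")
GOAL_PHRASES = {"checking account": "open a checking account"}


@dataclass
class DialogState:
    """Context that survives beyond the current turn."""
    goal: Optional[str] = None        # what the caller wants to accomplish
    awaiting: Optional[str] = None    # which detail the system last asked for
    slots: dict = field(default_factory=dict)
    history: list = field(default_factory=list)


def expected_response(turn_type: str) -> str:
    """Look up the second half of an adjacency pair."""
    return ADJACENCY_PAIRS.get(turn_type, "clarification")


def respond(state: DialogState, utterance: str) -> str:
    """Produce the next system turn while keeping track of context."""
    state.history.append(("user", utterance))
    text = utterance.lower()

    if any(marker in text for marker in REPAIR_MARKERS):
        # User-initiated repair: discard the misunderstood goal and re-ask.
        state.goal = None
        state.awaiting = None
        reply = "Sorry, I misunderstood. What would you like to do?"
    elif state.awaiting == "name":
        # The goal was stated earlier, so do not ask "how can I help?" again.
        state.slots["name"] = utterance.strip().rstrip(".")
        state.awaiting = None
        reply = f"Thanks, {state.slots['name']}. Let's {state.goal}."
    else:
        goal = next((g for phrase, g in GOAL_PHRASES.items() if phrase in text), None)
        if goal is not None:
            state.goal = goal
            state.awaiting = "name"
            reply = "Okay, what is your first and last name?"
        else:
            reply = "How can I help you?"

    state.history.append(("system", reply))
    return reply


demo = DialogState()
for line in ("I'm calling to create a new checking account.", "Angela Kong."):
    print("Me:", line)
    print("Voice Bot:", respond(demo, line))
```

Because the stated goal is stored in `DialogState` rather than forgotten after the first turn, the bot in this sketch continues with the account request once it has the caller's name instead of asking how it can help all over again, and a phrase like "not that" gives the caller a way to reset a misunderstanding.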
Of course, building a robust natural dialog service requires more than a deep understanding of the complexity of English discourse. But good voice experience has to start with conversational design.
After all, in the voice space, the conversational design is the UX.
Interestingly, COVID has shifted how consumers perceive AI.
A recent study surveying 1,000 Americans found that COVID-19 has increased human comfort & willingness to engage with bots (e.g. chatbots or voice bots), with 45% expressing a preference to speak to AI for shorter wait times and >20% more comfortable having a full conversation with an AI-powered system.
But why now?
It seems counterintuitive that in this time of sparse human-to-human contact, comfort in having a full conversation with an AI-powered system has increased.
But in a time of monolithic uncertainty, what we crave are answers. Psychology research has shown we all share “generalized feelings of apprehension from the unpredictability of ambiguous threats.”
When we don’t have answers to what our world will look like in a month or a year, we naturally seek certainty in the areas where we can. Answers, then — even answers from machines — quell this apprehension.
Well-designed conversational agents are not dehumanizing; rather, they provide clarity to our uncertainty & ballast to our anxiety.
But what about service agents?
The reader who has stayed with me up until this point is probably wondering when I will address the issue of bots “automating away” service agents.
I am here to tell you why the introduction of voice services not only necessitates service agents but allows them to do more gratifying types of work.
Let’s take a look at the data.
Companies that have adopted conversational agents have found that the biggest lift is not in cost savings, but rather in customer satisfaction.
When machines handle common inquiries, human agents are granted greater bandwidth to focus on complex cases, including those requiring human-centric skills like empathy. Talk to any service agent, and I promise you they will not miss the drudgery of answering the same question for the 97th time or spending another hour resetting passwords.
Voice has already made a splash in the consumer world — by the end of this year, over 50% of all searches will be done via voice. In the enterprise, voice has tremendous potential in improving the service industry. We can’t do this without first focusing on conversation design. We can’t do this without considering the impacts on all stakeholders in the ecosystem. But a well-designed and integrated voice AI can be the tide that lifts all boats — corporations, service agents, and end-consumers — when we continue to face uncertainty in our future.
Thank you to Lizzy Rewalt for always pushing me to grow, my friends Amy Shen and Vincent Yang for editing this article, and Greg Bennett who has opened my eyes to conversation design.