Conversational maxims and the cooperative principle in voice interface design
May 11, 2021
This is part of a series of articles accompanying my new book about voice content and how to build content-driven voice interfaces: Voice Content and Usability, A Book Apart’s first-ever voice design book. Check out what’s in the book and sign up for preorders.
No one knows precisely when human language joined the pantheon of cultural mannerisms that precipitated the behavioral modernity characterizing our species today. It’s one of the most intractable problems in science, because unlike the control of fire by early humans or the invention of cooking and toolmaking, the earliest gestures and utterances have left no trace and no contrails, neither in written literature, nor in the fossil record. It may be impossible for us ever to know whether we began to speak because we wanted to imitate animal communication, emphasize physical gestures, allay the tedium of metronomic tasks, or any number of other potential catalysts.
What we do know is that people have been yammering away, chattering and gesticulating, for tens or even hundreds of thousands of years to convey information, conduct transactions, and contribute to a collective oral history passed down from generation to generation: a message in a bottle that grows inch by inch as new myths and traditions emerge. Nonetheless, the reasons we began speaking in the first place may forever remain shrouded in oblivion. We may never know how modern members of the human species came to employ no fewer than six thousand natural languages, whether spoken, written, or signed.
In this article, I explore two of the most foundational concepts in conversational design and voice interface design: the cooperative principle and conversational maxims, which describe how spoken conversation between humans operates at a fundamental level. Not only do these principles outline how conversations work in the real world—like the ones we have at the supermarket checkout line or with a gate agent at the airport—they also are deeply instructive for how we should design voice interfaces to be more human than artificial. It’s a tall order for interfaces that are computers at their core but that need to be as human as living, breathing, you and me.
Why voice content is hard to design for
Though linguists and paleoanthropologists must look to shreds of fleeting evidence to perform comparative reconstructions or understand hominin vocal organs, historians are lucky to enjoy a much better understanding of the origins of language, not in humans, but in the machines we use on a daily basis. The first conversational interfaces surfaced during the early days of computing, but voice interfaces, also called voice user interfaces (VUIs), first entered the cultural conversation in the early 1990s.
But voice interfaces still represent a journey riddled with obstacles and challenges, because for the last two decades, we’ve by and large been operating within the confines of visual interfaces. Building a voice interface for users accustomed to physical and visual interaction with their devices introduces unique aural and verbal challenges to web-biased disciplines like content strategy, design, information architecture, and usability testing.
Add voice content to the mix, and there’s even less of an existing foundation to rely on. Whereas voice interfaces have long conducted transactional, goal-oriented conversations, they’re less commonly enlisted for content-centric experiences that focus on delivering copy to the user, much as a content-driven website would. After all, building a voice interface for your own content is worlds removed from how we have envisaged content-first experiences until now.
Last week, I announced my new book Voice Content and Usability from A Book Apart, which is their first title on voice interface design and the first-ever book about voice content, covering key ideas like voice content design and voice content strategy. Voice content is difficult to design for because of the challenges of migrating usually textual content to a format it was never authored for in the first place. Nevertheless, thinking about our content strategy from the standpoint of conversational maxims and the cooperative principle can help lay a foundation for all voice content delivery.
Conversational maxims and the cooperative principle
We can look for clues about how machines should conduct conversations in pragmatics, the branch of linguistics concerned with contextual meaning conveyed beyond the spoken word itself. Pragmatics has to do with issues like the marking of utterances, the actual intent unexpressed in spoken words, the belief systems that inform distinct interpretations, and the expression of implications and insinuations.
Paul Grice’s cooperative principle defines the optimal conversation as a series of acts of cooperation (Randy Harris calls them dialogue acts in Voice Interaction Design) aimed toward a favorable conclusion. In the process, Grice also defined four conversational maxims that characterize every successful conversation. Later, linguist Robin Lakoff added a fifth maxim, the politeness principle, which represents the agreeability and consensuality of the conversation. Grice’s and Lakoff’s conversational maxims are outlined below.
Quantity: “Just enough information.”
Quality: “Be truthful.”
Relation: “Be relevant.”
Manner: “Be brief, orderly, and unambiguous.”
Politeness: “Be polite.”
Though Grice’s conversational maxims have been influential in a wide range of fields, how well do they translate to a context where we’re concerned with a human and a machine in conversation rather than a chat between two living, breathing individuals? After all, we have conversations with computers that are very different from those we have with other people, subtly calibrating our next conversational move based on the limitations of the voice interface. In other words, we don’t see machines as human, and it shows.
Throughout the history of human conversation, we’ve used strategies rooted in the cooperative principle and conversational maxims to make ourselves understood, though it wasn’t until recently that these concepts were given names. Not only can Grice’s and Lakoff’s maxims help us craft better conversational interfaces; they can also help us understand why users react the way they do to uncanny or mechanical conversations that don’t quite get some of these principles right.
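To make this a little more concrete, here is one way the five maxims might be turned into a rough checklist for a drafted voice response. This is only a sketch under my own assumptions: the function name check_maxims, the word-count thresholds, and the list of blunt phrases are all hypothetical and chosen for illustration, not taken from Grice, Lakoff, or any voice platform. Quality ("be truthful") is deliberately left unchecked, since a script cannot verify facts on its own.

```python
import re

# All thresholds and phrase lists below are illustrative assumptions.
MAX_WORDS = 30            # quantity: rough cap for a single spoken response
MAX_SENTENCE_WORDS = 15   # manner: long sentences are hard to follow by ear
BLUNT_PHRASES = ("invalid input", "you failed")  # politeness: phrases to avoid


def words(text):
    """Lowercase a string and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def check_maxims(response, request_keywords):
    """Return a pass/fail flag per maxim for one drafted voice response.

    Quality is reported as None because it needs a human fact check.
    """
    tokens = words(response)
    sentences = [s for s in re.split(r"[.!?]+", response) if s.strip()]
    lowered = response.lower()
    return {
        # Quantity: just enough information, approximated by total length.
        "quantity": len(tokens) <= MAX_WORDS,
        # Quality: cannot be judged automatically in this sketch.
        "quality": None,
        # Relation: the response shares at least one keyword with the request.
        "relation": bool(set(tokens) & {k.lower() for k in request_keywords}),
        # Manner: every sentence stays short enough to be heard clearly.
        "manner": all(len(words(s)) <= MAX_SENTENCE_WORDS for s in sentences),
        # Politeness: none of the blunt phrases appear.
        "politeness": not any(p in lowered for p in BLUNT_PHRASES),
    }


# Example: a gate-agent style answer to a question about a flight.
draft = "Your flight to Denver boards at gate 12 in twenty minutes."
report = check_maxims(draft, {"flight", "gate"})
```

Heuristics like these are crude stand-ins for human judgment, but they can flag drafts that deserve a second listen before they reach a real user.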
Voice content may be difficult to grapple with even in the best of times, but by sticking to the human instincts of conversation, like how we deploy the cooperative principle and conversational maxims, you can ensure your content is ready for voice interfaces and other conversational experiences too. For more insights like these, be sure to sign up for preorders of my new book Voice Content and Usability, and check out what’s in the book.