How we integrated Alexa with Drupal for Ask GeorgiaGov, the first voice interface for residents of Georgia
May 26, 2021

Preorders are now available for my new book Voice Content and Usability, coming June 22nd! Want to know more about A Book Apart’s first-ever title on voice interface design? Preorder the book, learn what’s inside, and subscribe for more insights like this.
What was building and designing the state of Georgia’s first voice interface like?
Several years ago, I had the unique privilege of leading an innovation team (Acquia Labs) in building not only one of the first-ever content-driven Amazon Alexa interfaces but also perhaps one of the first truly authentic voice content applications ever. Up until that point, most voice interfaces engaged in transactions rather than information delivery, and what information they did deliver was prepopulated into the voice technology rather than integrated with a full-blown content management system (CMS).
Though I’ve written at length about conversational content strategy and our work on Ask GeorgiaGov in the past, I’ve never taken a bird’s-eye view of the project from beginning to end, starting with how we became involved with the Digital Services Georgia team in the first place and ending with a wildly successful Drupal-integrated Alexa interface that kicked off Georgia’s now enviable distinction as a state government doing great work with conversational technology. In this article, I’ll share a few of the challenges we faced and parts of the full story, which is available in its entirety in my forthcoming book Voice Content and Usability, A Book Apart’s first voice title, which you can take a peek inside here and preorder on their site.
New paradigms pose a challenge
In mid-2016, I had an opportunity to meet with my dear friend Nikhil Deshpande, Chief Digital Officer of the state of Georgia, who was exploring how state governments could deliver critical content to Georgians across the state beyond the web. At the time, Amazon had only just announced the availability of the Alexa Skills Kit, which immediately vaulted the product to the top of lists for developers eager to work with voice technology in a much more customizable way than was previously possible. Prior to this, most voice technologists needed to work with finicky hardware.
In our conversation, a compelling concept quickly took shape: a way for residents anywhere in Georgia, whether in Savannah or Athens, to get any question they had about registering children for pre-kindergarten or applying for a small business loan answered in the comfort of their own homes, thanks to the Alexa device in front of them. There was only one problem: At the time, voice interfaces were by and large focused on transactional, not informational, use cases, which meant they were more appropriate for tasks like checking credit card balances than asking about eligibility for a credit card in the first place.
We interact with voice interfaces for mostly the same reasons we enter into conversations with other people, according to Michael McTear, Zoraida Callejas, and David Griol in The Conversational Interface. Generally, we start up a conversation because we need something done (such as a transaction), because we want to know something (information of some sort), or simply because we’re social animals and want someone to talk to (conversation for conversation’s sake). Respectively, these three categories—transactional, informational, and prosocial—also characterize essentially every spoken interaction with a voice interface.
Even these days, prosocial conversations with machines are more about fun and games than emotionally captivating or uplifting exchanges. That leaves two genres of conversation we have with one another that a machine can easily have with us too: a transactional exchange realizing some outcome (what chatbot designer Amir Shevat in Designing Bots calls a task-led conversation, e.g. “buy coffee”) and an informational dialogue teaching us something new (what Shevat identifies as a topic-led conversation, e.g. “discuss a movie”). I discuss this key distinction at length in Voice Content and Usability, available for preorder or preview now.
Standing on the shoulders of open source
Shoehorning informational voice interactions into Alexa and slinging content through it was a new and untested approach, as we quickly learned from the relative rigidity of Alexa’s customization capabilities. But there was another problem too: the integration with Georgia’s CMS itself. For the rest of the implementation, namely that key connection with Georgia’s chosen CMS, Drupal, our architect Chris Hamper and I stood on the shoulders of the open-source community and contributed back to open-source innovation.
One of the most important constraints on the Ask GeorgiaGov project emerged from an early meeting in which we discussed scope and long-term maintenance responsibilities for the editorial content itself. State governments and public-sector organizations around the world are cash-strapped, with little budgetary discretion to experiment or work with newfangled technologies.
Georgia was no exception, requiring at the outset of the project that editors manage only a single version of content rather than one copy for the web and another for voice. In other words, Georgia needed a single source of truth for their content, and they weren’t interested in managing redundant, out-of-sync copies of the same corpus of content. Thus, our solution for Georgia’s editorial team not only had to embrace the spirit of omnichannel content management in a single central hub; it also needed to provide logging and analytics for the Alexa skill within the same context as the website data.
Decoupling Drupal for the Ask GeorgiaGov Alexa interface was a no-brainer for this project, and we were lucky to be able to depend on several elements of the open-source ecosystem in the Drupal community. My former colleague Jakub Suchy built the first-ever Alexa module for Drupal 8, enabling any Drupal site to deliver content through an arbitrary Alexa skill, and Georgia sponsored our work to backport that module to Drupal 7 for immediate integration with Georgia’s own Drupal site. Both the Drupal 7 and Drupal 8 modules remain available today on Drupal.org for any Drupal practitioner looking to build Alexa skills for their own site, as countless community members have done.
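To make the shape of that integration a little more concrete, here’s a minimal sketch of the pattern the Alexa module encourages: an event subscriber in a hypothetical ask_georgiagov module listens for incoming skill requests, looks up existing site content, and answers with a spoken version of that same content. The event, class, and method names from the Alexa module are assumptions for illustration (based on my recollection of its Drupal 8 branch), not an excerpt from the production Ask GeorgiaGov codebase, which ran on Drupal 7 and queried the site’s own search service.

```php
<?php

namespace Drupal\ask_georgiagov\EventSubscriber;

// Assumed names from the contributed Alexa module's Drupal 8 branch.
use Drupal\alexa\AlexaEvent;
use Drupal\node\Entity\Node;
use Symfony\Component\EventDispatcher\EventSubscriberInterface;

/**
 * Answers Alexa questions with content that already lives in Drupal.
 */
class AskGeorgiaGovSubscriber implements EventSubscriberInterface {

  public static function getSubscribedEvents() {
    // Event name assumed for illustration: the Alexa module dispatches an
    // event for every incoming skill request routed to the site.
    return ['alexaevent.request' => 'onAlexaRequest'];
  }

  /**
   * Looks up existing Drupal content for the resident's spoken question.
   */
  public function onAlexaRequest(AlexaEvent $event) {
    $request = $event->getRequest();
    $response = $event->getResponse();

    // The transcribed search phrase arrives as a slot value on the intent.
    $query = $request->getSlot('search_phrase');

    // Single source of truth: reuse the site's published nodes rather than
    // maintaining a separate corpus for voice. (A production implementation
    // would query the site's search service instead of a title match.)
    $nids = \Drupal::entityQuery('node')
      ->condition('status', 1)
      ->condition('title', $query, 'CONTAINS')
      ->range(0, 1)
      ->execute();

    if ($nids) {
      $node = Node::load(reset($nids));
      // Strip the web markup so the same body field reads well aloud.
      $response->respond(strip_tags($node->get('body')->value));
    }
    else {
      $response->respond('Sorry, I could not find anything about that.');
    }
  }

}
```

Registering a subscriber like this in the module’s services file wires it into Drupal’s event dispatcher, and from that point on editors never have to think about Alexa at all: the skill simply reads whatever they already publish.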
Though we were able to leverage the best of open-source software thanks to the wide array of contributed features available to Drupal users, there were many unprecedented issues that arose during the implementation that challenged our preconceived notions of how content experiences should work. Content strategy, information architecture, interface design, wayfinding, and usability testing were all considerations that led to head-scratching and rich discussions, all of which I explore in vivid detail in Voice Content and Usability, which you can preorder or take a look inside right now.
A single approach for editing and logging
Before deploying Ask GeorgiaGov, we discovered several issues that required us to create a custom dictionary and fine-tuned logs straddling not just Amazon Alexa but also Georgia.gov’s CMS. Certain terms common in Georgia law went unrecognized by Alexa, most notably ad valorem tax, a Latin term for a tax levied on the appraised value of a transaction or property. In the end, we added several dozen such terms to our custom dictionary.
Our logs were a different matter, due to the complexity of our architecture. Because Ask GeorgiaGov forwards users’ questions to the search service on the Georgia.gov Drupal website, we couldn’t rely solely on the reports that Amazon Alexa itself provides developers and maintainers. In addition to Alexa’s built-in logs, we provided logs built into Drupal that recorded errors and events Alexa couldn’t handle on its own, including the transcribed content of all queries, queries that returned no results, and, most importantly, what content was pulled from Drupal into Alexa.
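Because those logs needed to live in Drupal rather than in Amazon’s console alone, the recording itself could be as simple as writing to Drupal’s standard logging channels, which surface in the same administrative reports editors already use. Here’s a rough sketch using Drupal 8’s logger API; the function and channel names are hypothetical, and the production Georgia.gov site ran Drupal 7, where watchdog() plays the equivalent role.

```php
<?php

/**
 * Logs an Alexa interaction alongside the site's other Drupal logs.
 *
 * @param string $query
 *   The transcribed question the resident asked Alexa.
 * @param int[] $nids
 *   Node IDs of the Drupal content returned to the skill, if any.
 */
function ask_georgiagov_log_interaction(string $query, array $nids): void {
  $logger = \Drupal::logger('ask_georgiagov');

  if (empty($nids)) {
    // Questions that return nothing are the most useful signal for editors
    // deciding what content (or dictionary terms) to add next.
    $logger->warning('Alexa query returned no results: @query', [
      '@query' => $query,
    ]);
    return;
  }

  $logger->info('Alexa query "@query" served content: @nids', [
    '@query' => $query,
    '@nids' => implode(', ', $nids),
  ]);
}
```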
Thankfully for the Digital Services Georgia team, these logs were available right alongside the very same dashboards the editorial team used to measure the performance of the Georgia.gov website, permitting quick and easy cross-comparison between the two delivery channels. But facilitating a single system for editing content and for gauging the performance of that content, especially in a realm as new as voice, is easier said than done. I share more of the work we did for Ask GeorgiaGov to ensure a flawless launch in Voice Content and Usability, available for preorder and preview now.
Conclusion
Ask GeorgiaGov was released on the Alexa Skills Marketplace in October 2017 to great fanfare from both the Acquia Labs innovation team and our client Digital Services Georgia. We were overjoyed to receive feedback from real Georgians all over the Peach State who celebrated another medium through which to reach their state government. Though it was decommissioned during Georgia.gov’s complete overhaul in 2020, the Ask GeorgiaGov story stands as an enduring testament to the promise of integrating a CMS with novel technologies.
Are you embarking on your own voice content implementation, or are you examining the prospect of combining Alexa’s advantages with Drupal’s deep feature set? Either way, or if you’re more curious about voice interface design and voice content strategy, my book Voice Content and Usability, which launches officially on June 22nd, will give you all the expertise, ideas, and tactics you need to release a voice content implementation with aplomb. In addition to subscribing to my newsletter for more insights like these, preorder the book and take a peek at what’s inside.