2030: A conversational voice assistant better than the average human

11 min read · May 28, 2020

This is my journal from 2030. I am sharing what I saw on my time travel. In the future we will build Her 2.0, and Conversation Design will be different from what it is today. Here is how:

It’s 2030. A drone just delivered a she-zza*(1) to my boat office*(2). Time to nurture myself with some 3D-printed protein after another day as a Chatbot Psychoanalyst. Today, I fixed a narcissistic personality trait in a chatbot’s conversational architecture. The bot was doing a very good job, because that form of narcissism can come with a service-oriented perfectionism, but the brand the chatbot represents wants to treat its customers a bit better than that. Someone else designed the bot long before I joined the project. I did what I could, eliminating the narcissistic vibes between the lines of the bot’s utterances without any trade-offs in conversationality. I’m glad this therapeutic intervention is easier to do with chatbots than with real human patients…! Anyway, what I actually want to share with you today is delightful: We released Her 2.0.

You might remember that it has been 17 years since the movie starring Joaquin Phoenix hit theaters: a man falls madly in love with “Her”, a voice assistant. Think about it: except for their voices, there was nothing physical, no body to interact with. “Her” was simply a piece of software living in a cloud.

That’s all it took to make us, too, fall in love with “Her”. And yes, that’s how powerful conversation can be.
That’s what inspired our team to create Her 2.0: the world’s most lifelike conversational assistant and stand-up comedian extraordinaire, who understands people and follows context until the very last question. She works in healthcare, as we found that sometimes a warm-hearted conversation is the cure for many health issues. She is even better at having a conversation than most of us nowadays. In fact, she teaches people how to speak to each other! They’ve forgotten, of course, after spending their lives glued to their screens like zombies.

Today, we look back at 25 years of the chatbot industry — just enough time to reflect on how my team and I worked with the best tools to enable a conversational interaction that is full of wit and wonder*(3) and a blessing to all humankind.

Photo by h heyerlein on Unsplash. Thank you!

Now you might wonder — where was Her 2.0 born? In a hackathon? No.

We created her in a writer’s room! That’s what scriptwriters do.

Yes, we like to design full conversational architectures and their conversational logic, we are able to, and we no longer need backend access or programming skills for it. But these are Creative (Conversational Design) Civil Rights we had to fight for! By 2020, Dialogflow had created a voice dystopia, condemning designers to a user interface that merely allowed us to fill word holes*(4). The process of Conversation Design was spread out and fragmented across their developer user interface, not to mention the mindless idea elsewhere (namely at Alexa) of using Lambda to edit bot responses, which is THE ENEMY OF ALL CREATIVE POWER. Maybe good for gathering slot values, but if you think this is all a conversation consists of, you should schedule some Conversation Training with Her 2.0.*(5)

We thought this was notokgoogle.

So we, the voice people, signed a petition not to follow design practices dictated by those who knew least about interaction with our kind (be that theoretical, linguistic knowledge or real-life experience from dealing with many people on a daily basis, as a customer service agent does). We did not come together to have developers tell us indirectly, through their software, how to design conversations. Just as we did not come to build bots so bad that they need to tell humans how to interact with them (“Hello, I am Voicebot. You can ask me a question.” Duh.).
And if you just assumed (assumed!! because-you-live-on-the-moon-and-never-tested-your-assumptions-about-what-a-conversation-designer-actually-does) that we came to write one single sentence in your flowchart, because that’s all we were allowed to do by design, and that we would be satisfied with it, then you’re barking up the wrong chat window.

In early 2020, you needed a lot of patience to design a few follow-up turns, let alone a simple knock-knock joke bot. Nope, sir. In early 2020, we didn’t dare to dream about designing customer service dialogues with the developer tools that were the standard then.

Here’s a knock-knock joke in the meantime:
Bot: Knock knock!
User: Yo, who the f**… is there? (Asks our slightly tired and impatient user. (Note from director:) He’s suffered from interactions with too many bad chatbots, too many.)
Bot: It’s Mickey
User: Mickey who?
Bot: Mi-key is lost, can you open?

This is how to compare Conversation Design tools: you build a knock-knock joke in each of them and evaluate how easy it was! With some tools, it’s undoable. Or at least it is unless you’re as smart as Deborah Kay, who shared how she did it here. (Thank you!)
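
To make the benchmark concrete, here is a minimal sketch of the joke above as a three-turn flow in Python. The class and method names (`KnockKnockJoke`, `KnockKnockBot`, `open`, `reply`) are my own illustrative assumptions, not any real tool’s API, and the keyword check is deliberately naive.

```python
from dataclasses import dataclass


@dataclass
class KnockKnockJoke:
    name: str        # what the bot answers to "who's there?"
    punchline: str   # what the bot answers to "<name> who?"


class KnockKnockBot:
    """Hypothetical three-turn flow: opener, name, punchline."""

    def __init__(self, joke: KnockKnockJoke) -> None:
        self.joke = joke
        self.turn = 0  # 0 = idle, 1 = waiting for "who's there", 2 = waiting for "<name> who"

    def open(self) -> str:
        self.turn = 1
        return "Knock knock!"

    def reply(self, user_text: str) -> str:
        text = user_text.lower()
        if self.turn == 1:
            # Naive check (assumption): any phrasing containing "who" and "there" counts.
            if "who" in text and "there" in text:
                self.turn = 2
                return f"It's {self.joke.name}"
            return "Knock knock! (You're supposed to ask who's there.)"
        if self.turn == 2:
            if "who" in text:
                self.turn = 0
                return self.joke.punchline
            return f"It's {self.joke.name}. (Go on, ask: {self.joke.name} who?)"
        return "Want to hear a joke? Say the word and I'll knock."


bot = KnockKnockBot(KnockKnockJoke("Mickey", "Mi-key is lost, can you open?"))
print(bot.open())
print(bot.reply("Yo, who the f**... is there?"))
print(bot.reply("Mickey who?"))
```

Even this toy needs per-turn state and a repair response for each turn, which is exactly the part that was painful to express in the tools of 2020.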

Luckily, the industry listened and understood that with their software they had created not a technical debt but a social debt. In German, debt and guilt are the same word, “Schuld”, which is etymologically related to the English word “should”, clearly telling us it’s about an obligation towards humankind. Let’s not forget that by designing products and services, you are also shaping reality and therefore have a responsibility. So please think, McFly, think! Designing tools that excluded so many crucial stakeholders (linguists, copywriters, subject matter experts, people who think of ethics, impact, and how we want to shape our world with the conversational assistants we expose it to) soon turned out to be a no-go. If I had been invited to that table earlier, I would not have stayed long, as I don’t write a single line of code. Language is my domain (human language, if you hadn’t noticed). Here, I am alpha, but when it comes to lambda, I am omega. And voice versa! I still can’t solve the mystery of how on earth people thought developers should do the conversation design. Because, what, they woke up one day to find out they were suddenly chatty Betty?

Or the next Hemingway? Maybe in another universe. Maybe in the universe where Mitsuku lives. This artist-botlady may be famous and is often voted World’s Best Bot, but Mitsuku does not have a real job or corona-locked-down kids to take care of! She can just small talk all day long!

Still, she is the best, just like Her 2.0. So, how did we do it?

Each piece of copy was crafted carefully, re-edited, and proofread a million times, passing through the review flags from red to orange to yellow to green for each (!) utterance this voice assistant is capable of.

While 10 years ago it was mostly developers, today, among us at the writer’s table are at least:

The Show Runner (in our context, a mix of the traditional TV show runner, who oversees the whole production process, and the Product Owner)
A conversational UX research team that finds relevant use cases of all kinds, be it basic conversational needs or tasks or services we want to offer.
A content savvy person
A Conversation Designer
An expert who maintains the tone of voice; he formerly worked at Pixar, where he was the gatekeeper of the character bible. (What is a character or a series bible?)
A copywriter and wordsmith, who shapes words and phrases into a form that is a joy to interact with
A conversation analyst with a background in dramaturgy and social sciences
A team of conversational data analysts who review what has been said and misunderstood, to smooth out the conversation
and you… if you’re still reading this, then you also have a say at that table!

Our fantastic Content Management System allows us to apply what we’ve learned from the evolution of UX Design, the Nielsen Norman Group, Emotional Design, and Social Design. Yes, that’s a new discipline, recruited straight from humanities graduating classes with no internship detours. We value humanities knowledge in our industry. Thanks to these graduates, we did not lose partnerships, customers, fans, or end-users to repeated mistakes and bad UX practices from the ’90s. Instead, we won their hearts and minds by adding what we’ve learned to this now not-so-new domain: Conversation Design.

Today, our tool allows different people to check your copy, and it unifies the best features for Conversation Designers we know from Microsoft Word, Google Sheets, visual flow-building tools, and content management tools such as gathercontent.com. Because how could one forget to give easy access to edit copy that millions will not only read but interact with? We did not. (Right, Google? Amazon? Watson(ofa…beach)? Do you still react with that default behavioural fallback when it is about “basic copy editing functions”?!) Now, without irony or any between-the-lines messages, to all the start-ups trying to get their piece of the cake in 2020:

Copy editing tools enable suggesting, commenting, collaboration, user management, and version tracking, all of which are crucial for any software that claims to let conversation designers work without coding. If you want to focus on the conversation and allow creativity to scale massively in the industry, this is where you have to begin.

Luckily, in 2030, we have all we need. Our software also manages flows in a three-dimensional visual builder: you can collapse all the flows you’re not working on and focus on just one at a time. As it is 3D, it’s best to work with a virtual reality headset or, as I prefer, the Smockulus Contact Lens*(6). Not only does it help you navigate easily, it also allows each of us to focus on our own angle without losing our place in a crappy user interface. We don’t waste time creating flowcharts or boxes or arrows (?!?!?). We understood that some toys for boys (like trackers, e.g. these aretheyreallythatsmart?! devices that react to voice commands) had to be built first, to prove that they’re not such a pleasure to interact with verbally. Unless you’re the kind of person who’s into one-turners, that is!

On-turners? Suspicious. by Frank Kaminsky.

But one-turners in chatbots are so ’90s. Actually, even ELIZA from the ’60s did better. Today, we can design for many turns, follow-ups, and deviations, which will navigate through the universe of thoughts you want to engage with in the conversation and back to your goddamn conversation starter, in case you missed it. Your conversation can have enough turns to keep chatting during your trip from Berlin to Neo Tokyo and back.

And speaking of changing regions: our tool finally lets you adapt the sentence melody, prosody, and phonemes to the situation. Alexa’s voice was just too creepy. We only realized it when humans, our own kids, started to pick up that creep of a pitch. Now you can do such marvelous things as adjusting the softness of plosive sounds (p, t, k, b, d, g) to give your voice assistant’s English pronunciation an authentic accent from other regions of the world, such as parts of India, for example. That way, the voice assistant will gain more sympathy and acceptance among these folks. Or we can stretch vowels both in the bot’s understanding and in its speech production, so our users who are native speakers of Romance languages won’t be discriminated against by not understanding or not being understood… coz’ life can be a bitch or a beach. I tell you, it’s the vowel length that makes all the difference. Listen to this sound sample on SoundCloud to get the full experience:

Beach Yes GIF by Lowi — Found & Shared from GIPHY

More! Switchin’ context? No problem.

Most developers were afraid we wouldn’t be able to manage context, but in our software we can set contextmarkers as entities that connect to conversational functions, like going back to a previous topic or opening up to match general intents.

Contextmarkers? What is that?
Let me explain:
Whenever two people speak with each other and one says something like:
“So, how’s your garden doing?”
This person has marked a topic change with the word “so”.
These particles are function words, and the software we use accounts for them. We learned that from the chatbots in use and the conversations we analysed.
When people were about to buy a product and then changed their mind, or something came up that we had not yet covered in our conversational flows, they would write something like this:
“Do you actually ship to Germany?”

Our software recognizes contextual markers. In this case, you guessed it, the marker was the word “actually”. Our assistant then “knew” that this was a general question, so it did not restrict matching to the small group of intents in the current flow but was able to answer from the broader, more global set of topics.
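
For the technically curious, the idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the marker table, the intent names, the keyword lists, and the function names `find_markers` and `match_intent` are hypothetical, and a real system would use a trained intent classifier rather than keyword overlap.

```python
import re
from typing import Optional

# Hypothetical marker table: function word -> what it signals.
CONTEXT_MARKERS = {
    "so": "topic_change",
    "actually": "general_question",
}

# Intents of the current flow (here: a checkout step) and the global fallback set.
LOCAL_INTENTS = {
    "confirm_order": {"yes", "buy", "order"},
    "cancel_order": {"no", "cancel"},
}
GLOBAL_INTENTS = {
    "shipping_info": {"ship", "shipping", "delivery"},
    "opening_hours": {"open", "hours"},
}


def tokenize(utterance: str) -> list[str]:
    return re.findall(r"[a-z']+", utterance.lower())


def find_markers(utterance: str) -> set[str]:
    """Return the signals carried by any context markers in the utterance."""
    return {CONTEXT_MARKERS[t] for t in tokenize(utterance) if t in CONTEXT_MARKERS}


def match_intent(utterance: str) -> Optional[str]:
    """Match against the local flow only, unless a marker widens the scope."""
    tokens = set(tokenize(utterance))
    candidates = dict(LOCAL_INTENTS)
    if find_markers(utterance):
        # A marker was found: search the global intents first, then the local ones.
        candidates = {**GLOBAL_INTENTS, **LOCAL_INTENTS}
    for intent, keywords in candidates.items():
        if tokens & keywords:
            return intent
    return None  # no match: hand over to the fallback


print(find_markers("So, how's your garden doing?"))
print(match_intent("Do you actually ship to Germany?"))
print(match_intent("Do you ship to Germany?"))
```

In this sketch, the same shipping question without “actually” stays inside the checkout flow and falls through to the fallback. The sketch also treats a marker the same wherever it appears in the sentence, so “I think so” would wrongly count as a topic change; a position check would be a sensible next step.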

But this is just one simple gimmick that makes our conversational assistant Her 2.0. talk so smoothly that it feels good even if it’s your procrastinated bureaucracy and paperwork she’s assisting you with.

She won Germany’s Next Botmodel, but that’s just a side effect. Right now, she’s running for president of the United States. If a bigoted clown could do it, why not a sophisticated humanistic bot that’s fueled by curated data, has a philosopher’s ability to speak and reason, and has the watchful eye of ethics professors and A.I. trainers as its supervisors…

The software is… Utopia. This is Greek and means “nowhere”. Because there is no place on earth or in the human mind where it exists, yet! But this is just 2020. So: Conversation Designers of all fields, unite!

(And clap and share this article of course.)

—

Footnotes:

(1) A She-zza is a Pizza Restaurant which delivers pizza especially for women whose need for iron and trace minerals is above average. It will be invented in 2025.
(2) In 2021, after the Corona Crisis, home office became a standard worldwide for employees. I’ve lived on a boat for 5 years, so the boat is also my office!
(3) “full of wit and wonder” (this is an expression I found in Pamela Pavliscak’s marvelous book, Emotionally Intelligent Design, page viii.)
(4) Conversation Design is more than just filling word holes. Quoting Rebecca Evanhoe from our Voice Lunch on May 19th 2020. Get her book here: https://rosenfeldmedia.com/books/conversations-with-things/, published in 2021.
(5) Unless you have read Robert J. Moore’s and Raphael Arar’s book “Conversational UX Design”, explaining a more realistic notion of what slots in a conversation can be.

Written by Maggie Jabczynski