Imagine AI that can truly understand and interpret human conversations

File picture: Pexels


Published Jun 6, 2021


AI can do a passable job of transcribing what one person says. Add multiple voices and tangents, and things get a lot murkier.

Imagine holding a meeting about a new product release, after which AI analyses the discussion and creates a personalised list of action items for each participant. Or talking with your doctor about a diagnosis and then having an algorithm deliver a summary of your treatment plan based on the conversation. Tools like these could be a big boost, given that people typically recall less than 20% of the ideas presented in a conversation just five minutes later. In healthcare, for instance, research shows that patients forget between 40% and 80% of what their doctors tell them shortly after a visit.

You might think AI is ready to serve as secretary for your next important meeting. After all, Alexa, Siri, and other voice assistants can already schedule meetings, respond to requests, and set up reminders. Impressive as today’s voice assistants and speech recognition software might be, however, developing AI that can track discussions between multiple people and understand their content and meaning presents a whole new level of challenge.

Free-flowing conversations involving multiple people are much messier than a command from a single person spoken directly to a voice assistant. In a conversation with Alexa, there is usually only one speaker for the AI to track, and the assistant gets instant feedback when it misinterprets something. In natural human conversations, different accents, interruptions, overlapping speech, false starts, and filler words like “umm” and “okay” all make it harder for an algorithm to track the discussion correctly. These speech habits, and our tendency to bounce from topic to topic, also make it significantly more difficult for an AI to understand the conversation and summarise it appropriately, as a small example after this paragraph suggests.
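To make one small piece of that cleanup problem concrete, here is a minimal sketch in Python. It is a hypothetical illustration, not any product’s actual code: the FILLERS pattern and strip_fillers function are invented for this example, and it only removes a handful of common filler words before any deeper analysis would run.

```python
import re

# A few common English fillers; a real system would use a learned
# disfluency detector rather than a fixed list (assumption for this sketch).
FILLERS = re.compile(r"\b(umm+|uh+|okay|like|you know)\b[,.]?\s*", re.IGNORECASE)

def strip_fillers(utterance: str) -> str:
    """Remove filler words, then collapse any leftover double spaces."""
    return re.sub(r"\s{2,}", " ", FILLERS.sub("", utterance)).strip()

print(strip_fillers("Umm, okay, so we, uh, ship the, like, product Friday"))
# -> "so we, ship the, product Friday"
```

Even this toy version shows the trade-off: an aggressive filter cleans the transcript but risks deleting words like “like” or “okay” when they carry real meaning.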

Say a meeting progresses from discussing a product launch to debating project roles, with an interlude about the meeting snacks provided by a restaurant that recently opened nearby. An AI must follow the wide-ranging conversation, accurately segment it into different topics, pick out the speech that’s relevant to each of those topics, and understand what it all means. Otherwise, “Visit the restaurant next door” might be the first item in your post-meeting to-do list.
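Here is a minimal sketch of that pipeline shape: segment, filter, then extract. Everything in it is an assumption made for illustration: segment_topics uses a crude TextTiling-style word-overlap heuristic to cut the transcript where lexical cohesion drops, and action_items matches a hand-picked cue list; neither reflects how any real product works.

```python
import math
from collections import Counter

def _bag(utterances: list[str]) -> Counter:
    """Bag-of-words counts for a window of utterances."""
    return Counter(w.lower().strip(",.?!") for u in utterances for w in u.split())

def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def segment_topics(utterances: list[str], window: int = 3, threshold: float = 0.1) -> list[list[str]]:
    """Crude TextTiling-style segmentation: cut wherever lexical cohesion
    between the trailing and leading windows falls below the threshold."""
    segments, start = [], 0
    for i in range(window, len(utterances) - window):
        left = _bag(utterances[i - window:i])
        right = _bag(utterances[i:i + window])
        if _cosine(left, right) < threshold:
            segments.append(utterances[start:i])
            start = i
    segments.append(utterances[start:])
    return segments

# Hypothetical cue list; a real system would use a trained classifier.
ACTION_CUES = ("will ", "need to", "let's", "please", "follow up")

def action_items(segment: list[str]) -> list[str]:
    """Keep only utterances that read like commitments or requests."""
    return [u for u in segment if any(cue in u.lower() for cue in ACTION_CUES)]
```

On a real transcript one would swap the word-overlap heuristic for semantic embeddings and the cue list for a learned model, but the overall shape is the same: segment the conversation by topic first, so restaurant chatter never reaches the action-item extractor.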

Giving AI a seat at the table in meetings and customer interactions could dramatically improve productivity at companies around the world. Otter.ai is using AI’s language capabilities to transcribe and annotate meetings, something that will grow more valuable as remote work expands. Chorus is building algorithms that analyse how conversations with customers and clients drive companies’ performance and recommend ways to improve those interactions.

Read the full article and more content on Fast Company SA.
