Hands-on with Yoodli, the gamified AI designed to improve public speaking

Mike Pearl
5/25/2022

Don’t think of it as talking to a robot. It’s more like practicing with what the founder calls a “smart mirror.”

The language-learning app Duolingo has demonstrated that language skills you pick up from heavy use can be the equivalent of a handful of college courses, at least according to its own research. Not bad for an introduction to an unfamiliar language. But what if a similarly casual, gamified app could improve the English you’re already fluent in? Or even help native English speakers get over their hatred of their own voices?

Yoodli is a Seattle-based tech startup funded by Madrona Venture Group and the Paul Allen Institute for Artificial Intelligence (which is also one of the backers of PNW.ai), and it wants to use AI to improve your speech. For branding purposes, it aims to be thought of as a public speaking aid. The headline for one of Yoodli’s own press releases says it “applies AI to life’s biggest stress: public speaking.” However, formal public speech-making is just one aspect of Yoodli’s much broader value proposition. It’s shaping up to be more like an addictive app that will help you to generally, like, talk, y’know, just, kinda better?

Yoodli co-founder and CEO Varun Puri’s confident-but-comfortable, and completely um-less, speech pattern is impossible to miss when you talk to him, but this was a hard-fought skill, he told me. “I grew up in India, and when we would have these classroom presentations, inevitably someone would have a panic attack,” Puri said. “Speaking up isn’t part of our culture. We are focused on math and science.”

The idea for Yoodli came later, when Puri arrived in the US and saw many people here struggling to make themselves understood, too, particularly the type who put “lidars and cars and rovers on Mars.” In college, Puri said, “I would see the same smart kids design the slides, but then the loud extrovert would give the presentation and get an A. And I was like, that just seems unfair.”

So Yoodli, which is currently available as an in-browser beta that anyone can try out, was created to help the introverts and mumblers, not just those who want to give speeches in a language that’s not their mother tongue. To get better at communicating, Puri said, “You’ve gotta record yourself, and watch yourself, and cringe. But the cringing process is painful.” The idea is to ease that pain by giving you someone to talk to, someone who is going to give you feedback that might be brutal, but who can’t laugh at you or judge you — because they’re an AI.

The current platform has two basic functions: There are simple games designed to get you talking about, well, sometimes next to nothing. “Here’s what’s in my backpack” is one prompt. But the point is just to get you to say something. Then there’s a robust speech analysis tool that processes your recorded audio and provides 11 types of feedback.

The games are indeed painful, at least at first. One exercise called “No Filler” simply asks you to speak about a throwaway prompt (“What’s your favorite even number and why?” for instance) for anywhere from 30 seconds to three minutes without any filler words. That means no “uh,” “um,” or “y’know” allowed. If you’re a filler word addict, you’ll feel the burn instantaneously. It was deceptively easy to talk without filler words, but I certainly couldn’t say anything even remotely interesting, and instead found myself sounding stilted, theatrical and not at all myself. It took me a few tries to put any emotion or emphasis into my answer at all, and it was only once I became invested in the point I was making that the “ums” and “uhs” returned with a vengeance, which was a useful lesson.

But where Yoodli’s beta really shines is in its AI-powered speech analysis, which can digest longer speeches and tell users not just about their overuse of filler words, but about things like how fast they’re speaking compared to others, whether they’re garbling their words, whether they’re hedging (saying things like “just” or “sort of,” as well as using meaningless intensifiers like “totally” and “absolutely”), or whether they’re repeating themselves.

Personally, I may overuse filler words, but I’m also an extrovert who loves talking, which means even my early efforts earned mostly passing grades. But I can see how Yoodli would provide useful baby steps for someone who feels hesitant to open their mouth in public at all.

For instance, receiving feedback like “Top Keywords,” which tells you which topics you spent the most time talking about, may seem like a useless no-brainer on the face of it. But it’s a fascinating and genuinely illuminating analytical tool when you plug in a video of, say, an elementary schooler. Young minds have a tendency to wander and take strange detours, and “Top Keywords” is a good way to demonstrate that.

Intriguingly, users also receive feedback on whether they’re using non-inclusive language, like saying “guys” when they could just as easily say “folks.” Deeper analysis could point users not only toward kinder ways of expressing themselves, but away from common faulty assumptions or lazy intellectual shortcuts that show up often in speech.

The user’s “Eye Contact” also gets analyzed when they use Yoodli, though at the moment, that seems to just tell the user whether or not they were looking straight into their webcam. But your posture, your hand gestures, your facial expressions, and whether or not you dressed for the occasion are all features of communication that we “listen” to with our eyes. Yoodli’s visual analysis of speeches could prove invaluable if it takes some of these things into account, and Puri says it will in the future. “It’s still early!” he told me.

Indeed, according to Puri, Yoodli’s tech is an ambitious combination of computer vision, audio analysis and natural language processing — a team of neural networks, in other words, watching you, listening to the sound of your voice and dissecting the content of what you say. There’s clearly a huge amount of potential utility in all that analysis.

“The goal,” Puri explained, “is for this to be your personalized communication coach.”

His pitch conjures an app that you can use unobtrusively throughout the day, then check before bedtime to see how you did, a bit like looking at a fitness app to see how many steps you took. “Anytime you are speaking, you’re preparing to speak, or when you’re done speaking we should be able to tell you, You’re changing your pitch when you’re talking to a man versus a woman, or you seemed way more nervous in your team meeting than you were in your one on one with your colleague.”

Puri knows that any time you’re building a business around getting people to talk to an AI, it can sound a little cold and inhuman. But ultimately, he says, he’s trying to facilitate better communication between human beings, which is about “being spontaneous and having authenticity.”

“It’s practice on your own with AI,” he said. “Think of us as a smart mirror.”