At WWDC 2024, AI could make Siri the voice assistant Apple always wanted

When Apple first introduced Siri in 2011 alongside the iPhone 4S, the company produced a series of very compelling ads showing how you might use this new voice assistant thing. In one, Zooey Deschanel asks her phone about tomato soup delivery; in another, John Malkovich asks for some existential life advice. There’s also one with Martin Scorsese shuffling his schedule from the back of a cab in New York. The ads showed off reminders, weather, alarms, and more. The point was that Siri was a helpful and constant companion who could handle anything you needed. No apps or taps required. Just ask.

Siri was a big deal for Apple. At the 4S launch event, Apple’s Phil Schiller said that Siri was the best feature of the new device. “For decades, technologists have teased us with this dream that you’ll be able to talk to technology and it will do things for us,” he said. “But it never comes true!” All we really want to do, he said, is talk to our devices any way we want and get information and help. In a moment of classic Apple bravado, Schiller declared that Apple had solved it.

Apple had not solved it. In the 13 years since it launched, Siri has become, for most people, either a way to set timers or a useless feature to be avoided at all costs. Siri has been bad for a long time, long enough that for years it seemed like Apple had either forgotten about it or simply decided to pretend it didn’t exist.

But next week at WWDC, if the rumors and reports are true, we might get our first look at the real Siri, or at least something much closer to it. According to Bloomberg, The New York Times, and others, Apple will unveil a massive overhaul of the assistant that uses large language models to make Siri more reliable, even if it doesn’t add many new features. That alone would be a win. But Apple also appears to be working on, and may be nearly ready to launch, a version of Siri that actually integrates with apps, meaning the assistant could perform actions on your device on your behalf. At least in theory, anything you can do on your phone, Siri might soon be able to do for you.

This was apparently the vision for Siri all along. You can even see it in those iPhone 4S commercials: these celebrities ask Siri for help, and Siri almost never actually gets the job done. It gives Deschanel a list of restaurants that mention delivery, but doesn’t offer to order anything or show her a menu. It tells Scorsese there’s traffic, but doesn’t reroute him, and shouldn’t it have already known he was going to be late for the meeting? Siri tells Malkovich to be nice to people and read a good book, but offers no practical help. So far, Siri has been a virtual assistant whose only job is to Google things for you. Which is something! But it’s not much.

Siri’s shortcomings are all the more frustrating because everything it needs to be useful is right there on your phone. When I want pizza, why can’t Siri check my email for the receipt from my last order, open DoorDash, place the same order, pay with one of the cards in my Apple Wallet, and be done? If I’m having a Scorsese-level busy day, Siri is sitting right next to my contacts, my Slack, my email, and everything else it needs to quickly move things around on my behalf. If Siri could take over my phone the way one of those remote access tools lets someone else move your computer’s cursor, it would be unstoppable.

There are two reasons why Siri has never lived up to its potential in this way. The first is simple: the underlying technology was not good enough. If you’ve used Siri, you know how often it mishears names, misunderstands commands, and falls back on “here’s something I found on the web” when all you wanted to do was play a podcast. This is where large language models are clearly very exciting: we’ve seen how much better speech-to-text tools like Whisper are, and how much more broadly these models can understand language. They’re not perfect, but they’re a huge improvement over what we had before, which is also why Amazon is rebuilding Alexa around LLMs and Google Assistant is being supplanted by Gemini.

The second reason why Siri never worked is simply that neither Apple nor third-party developers ever figured out how it should work. How are you supposed to know what Siri can do or how to ask? How should developers integrate Siri? Even now, if you want to add a task to your to-do list app, Siri can’t just figure out which app you’re using. You have to say “Hey Siri, remind me to water the grass in Todoist,” which is a weird sentence that doesn’t make sense and, in my experience, fails half the time anyway. If you want to do a multi-step action, your only option is to dabble in Shortcuts, which is a very powerful tool that stops just short of making you write code. It’s too much for most people.

AI could give Apple a chance to solve that whole problem, too. Its researchers published a paper earlier this year describing a system called Ferret-UI that uses an AI model to understand small details of the image on the screen. The researchers even detailed how a full, app-using Siri might work: OpenAI’s GPT-4 does a good job of broadly understanding what an image is, and then Ferret is able to understand small areas and details. In practice, this may mean that one system says, “This is the Ticketmaster app!” and the other says, “There’s a buy button.”

We should be skeptical of what Apple claims about Siri. More than a decade ago, Schiller stood on stage and declared that Apple had created a better voice assistant, and that didn’t happen. The same could be true now, as the hype around AI continues to outpace the actual technology. Humane, Rabbit, Google, and others are working on similar ideas (“agent” has been a buzzword for years in the AI world), and no one has yet proven they’re ready.

But if Apple has cracked something here, this could be the first time we see the real Siri, the Siri we were promised all those years ago. Maybe in the next ad, Deschanel’s tomato soup magically appears at her house and the Headspace app starts up to bring Malkovich some inner peace. Maybe we’ll finally get the Siri that Apple always wanted to create.
