What comes after the smartphone?
I don't know, but here are some thoughts on ambient computing form factors
Over the past few weeks, three stories and releases have gotten me thinking: Meta revealed their AR glasses prototype, Orion; OpenAI unveiled o1, a new AI model whose iterative reasoning lets it tackle more complex tasks and solve harder problems in science, coding, and math than previous models; and the New York Times published a fascinating profile of former Apple designer Jony Ive, confirming he's working on an AI hardware device with Sam Altman. These glimpses into our collective technological future have me wondering more and more: What's the next form factor after our phones?
The prevailing answer seems to be some form of augmented reality glasses. Think heads-up displays like Tony Stark’s in Iron Man or Tom Cruise’s in Minority Report. But I’m here to ask: Do smart glasses really make sense? Have we truly explored the best of our creativity, or are we just giving in to the sci-fi fantasies of our youth? I think smart glasses and always-on visual interfaces have three core problems.
First, the form factor. I wear prescription glasses because I have to—I’m blind as a bat without them. But why would you wear glasses if you don’t need them? As Scott Galloway bluntly puts it:
“We’re highly discerning about what we put on our faces, as it must enhance, not impair, our ability to assert dominance, attract mates, and make connections.”
Sure, technological advancement might eventually give us smart glasses that are indistinguishable from regular ones, but we're nowhere near that right now. The Apple Vision Pro, while one of the coolest tech demos I've ever experienced, is a heavy behemoth. Even the Meta Orion prototype looks like the Michelin Man squeezed into a pair of glasses, and it requires you to carry around a 'puck' for computing power and wear a wristband to control the device with hand gestures. Oh, and the Orion glasses? They can't be mass-produced and cost more than $10K.
Next up: our collective state of notification anxiety and screen-separation anxiety. We're already addicted to our phones and notifications. Go into your screen time settings and check how many times you picked up your phone yesterday (don't feel bad, my count was 185). Now imagine a world where you can never put your phone down: you're constantly seeing those red dots, how far away your Uber Eats order is, or the play-by-play of a baseball game. Sure, phones already have this isolating potential, but I think there's a level of social decorum that glasses would completely break. As product designers and engineers think about the 'next' device, I hope we can use this as an opportunity to break the cycle of computing anxiety and notification overload. Do we really need to see this information all the time?
Finally, let's talk about something I'm plainly calling 'distorted reality.' We already live in an era of misinformation and polarized thinking, driven by algorithmically optimized newsfeeds that curate a different reality for each of us. What happens when our physical spaces are augmented with visual overlays that warp our worldviews even more? What's stopping us from literally living in our own version of reality?
Take these two (exaggerated!) examples:
You’re at a Bills vs. Jets game at MetLife Stadium. The game’s on the line, and the Jets are about to kick a game-winning field goal. If you’re a Jets fan, maybe you see the field goal go in and the AR scoreboard updates to give your team three points. But if you’re a Bills fan, maybe you see a gust of wind push the kick wide.
Or, for an even more polarized (again, exaggerated!) example: You’re walking down the street and see a Kamala Harris campaign poster. If you’re a Democrat, your glasses might overlay a video of her talking about her policies. But if you’re a Trump supporter, maybe your glasses overlay a video of Kamala saying she supports mass illegal immigration and—wait for it—the eating of your pets (it’s a joke, relax).
I’m not saying this will happen, but I don’t think enough people in tech or tech journalism are considering these possibilities as we charge headfirst into this spatial and ambient computing age. It’s going to be much harder to dismiss polarizing conspiracies when your relative at Thanksgiving says, ‘I saw it happen in real life,’ instead of, ‘I saw it online.’
So, what do I think will come after the smartphone? Obviously, I don’t know for sure, but I’m leaning towards headphones and voice as the next modality. Stick a pair of cameras in your AirPods, and you’ve got yourself a discreet but powerful multimodal computer to work with an LLM.
Most people don't wear glasses, but almost everyone wears headphones. During the workday, my AirPods are in 99% of the time; even if I'm not actively listening to something, they're so light I forget they're there. Over the past few years, Apple has moved towards this always-in approach, adding Adaptive Audio, which switches between noise cancellation and transparency mode based on your environment. They've even added Conversation Awareness, so your headphones automatically lower the volume when you're speaking to someone and bring it right back when you're done. And now, the AirPods Pro 2 can even be used as an over-the-counter hearing aid.
Will we need some form of visual interface? Probably. But I'm not smart enough to think about that right now. So, for now, you heard it here first: I think the first breakthrough mainstream AI device will be AirPods with cameras in them… (or whatever Jony Ive comes up with, #hedging)