Mobile Finds Its Voice

Over the last decade, mobile technology has evolved more rapidly than any other area of computing. My first mobile – purchased just 12 years ago – felt like a brick, and although it could send and receive text messages, that functionality lay so many menus deep, I never used it. As a voice-only device, my first mobile merely replicated the functionality of a landline telephone. That’s to be expected: fifty years ago, Marshall McLuhan noted that the first content of a new medium is the medium it’s making redundant.

The longer we spend with our mobiles, the more uses we find for them. SMS was an unregarded, almost accidental feature of GSM networks, which grew exponentially into the principal mobile communication mechanism. We find something, we use it, and because we use it to communicate with others, they learn about it. I didn’t send my first SMS until a friend had sent me a message which needed a reply. That same behavior, repeated a few billion times over the last decade, led to the trillions of texts sent each year by six billion mobile subscribers.

And we’ve only just begun. Every time we take another app for a test drive, every time we send a link or a map or a movie to someone else, we learn something new about the possibilities of the mobile. There are two halves to this process: we learn and share what we’ve learned with one another; and the devices themselves come to reflect and embody what we’ve learned. Front-and-center on modern mobiles, texting has become the default communication interface, vying with Facebook messaging and Twitter for our attention. We still use mobiles to make voice calls, but almost as an afterthought.

Over the same years we have seen the rise and the fall of the keyboard. At first, the mobile keypad looked exactly like that of a landline telephone. Then, driven by text messaging, it exploded into full QWERTY contraptions with impossibly small keys and microscopic trackballs. RIM’s BlackBerry represents the keyboard’s finest hour, a full if miniature desktop business computer. With the advent of the iPhone, the mobile keyboard suffered a fatal blow. The touchscreen, finicky and inaccurate and under the dominion of atrocious auto-correcting spelling software, became the accepted interface for text.

That’s a bit weird, actually, because just at the moment texting was becoming the principal mobile communication medium, mobiles themselves became harder to text with. Nothing in the last few years has changed that. In fact, we’re less inclined to peck at a keyboard, encouraged instead to swipe our way into communication, a new form of whole-body comprehension, something closer to sign language than composition. But even as we contemplate the triumph of touch, we are about to see it undermined.

Apple is said to be working on a sophisticated voice-recognition technology for iOS, ‘Assistant’, which will make its first appearance in a dual-core iPhone 5. Whether or not the specifics of this rumour prove to be true, embedded speech recognition is the obvious next step in the mobile’s evolution. A telephone is meant to be spoken to; the interface to the telephone should reflect this essential fact.

Great strides have been made in speech recognition, particularly within the limited domains you’re likely to encounter interacting with a mobile. The opportunities for ambiguity (always a problem with speech recognition) decrease as you reduce the problem being solved. Free-floating speech-into-text will prove somewhat problematic until a computer reliably passes the Turing Test, but as a control mechanism, speech recognition is already up to the task.

This means our anger over auto-corrected typing will soon be replaced by fury at natural language recognition that won’t recognize what we say. But it also means we won’t need to pay as much attention to our mobiles. No longer will we be caught out in the middle of the road, poking around our mobiles, as a car heads straight for us. Instead, we’ll be muttering to devices, repeating ourselves, then looking for a quiet environment to issue a few commands.

The touchscreen won’t be going away. The keyboard didn’t go away either, even if iOS absurdly reduced it to a single ‘Home’ button. In its evolution, the mobile rarely loses all contact with its hardware past. Those nearly-obsolete parts remain deep inside, like vestigial organs. The touchscreen will become less of a utility and more of a feature, something you use to play “Angry Birds”, but not to answer email; something to sketch with, not something to write with.

Talking to the mobile will change our relationship to it. Just as touchscreens have opened the door to a new class of applications, qualitatively different from applications bound to mouse or keyboard, voice recognition opens into a new, more intimate relationship with our devices. Conversation is personal; the voice is the very embodiment of our personalities. So our mobiles will develop personalities – if only because we will immediately project a personality onto them. We’ll do that because a disembodied voice is too unheimlich – uncanny – to carry around in our pockets. Rejecting that as too alien, we’ll anthropomorphise, and handset manufacturers will respond in kind, giving us a broad range of voice options. As with ringtones, within a few years, our relationships with our handsets will become so specific and unique that no one will be able to mistake the sound of their own mobile for someone else’s. We might even converse with them in a private code, securing our relationship through an in-plain-sight cryptography.

The mobile, that most personal of all our devices, is about to move even closer to our hearts. McLuhan called the telephone the most intimate of technologies, because it put another’s lips right up to your ear. As the mobile finds its voice, we will regard it less as a tool and more as a peer, something poised midway between a pet and a friend. That opens up a space for the play of words, and the joy of conversation. Strange as it sounds, we’ll be less alone. Somewhere nearby, our devices will be whispering to us.

Mark Pesce

Known internationally as the man who fused virtual reality with the World Wide Web to invent VRML, Mark Pesce has been exploring the frontiers of media and technology for thirty years. Pesce holds an appointment as an Honorary Associate in Digital Culture at the University of Sydney, and chaired the Emerging Media and Interactive Design programs at both USC's School of Cinema and the Australian Film Television and Radio School. For seven years, Pesce was a panelist and judge on the hit ABC series The New Inventors. He regularly contributes to ABC websites, has a monthly column in NETT magazine, and is currently working on his sixth book, THE NEXT BILLION SECONDS.
