Jeff Bezos might be the world’s biggest Star Trek fan. At one point, the Amazon founder and CEO wanted to call his e-commerce platform makeitso.com, in reference to Captain Jean-Luc Picard’s catchphrase. In 2016, after years of begging Paramount Pictures, Bezos made a cameo as an alien Starfleet official in Star Trek Beyond. So, when Amazon set out to build the AI assistant Alexa, Bezos envisioned finally realising the Star Trek computer – a benign, omniscient assistant, available everywhere.
“We really did think of it as the Star Trek computer, where it was ambient and you could simply say: ‘Computer, beam me up,'” says Mike George, Amazon’s vice president of Echo, Alexa and Appstore. Clad in a black v-neck jumper and jeans, the 20-year Amazon veteran has a booming laugh and a vague resemblance to both Bezos and Picard. George bounds into the office, all unwavering eye contact, full-body laughs and wrist-crunching high fives, his preferred form of greeting.
I meet him, and most of the Alexa executive team, on an upper floor of Amazon’s skyscraper, Day 1, in the Denny Triangle in downtown Seattle. From here, on a blue-sky morning, the Space Needle is dwarfed by the snow-capped mountains beyond. Both seem like inconsequential theatrical set pieces to the Amazon empire below. The 30 buildings that make up the company’s base are visible out towards Lake Union. More than 150 metres below is a hole in the ground where the firm is erecting more buildings. Two 30-metre-tall biospheres under construction between the skyscrapers will house 300 plant species and provide another workspace for Amazonians. The company has permits to create 9.2 million square metres of office space, enough to double its workforce. The campus is a microcosm of Amazon’s world: always looking forward, growing so fast it’s hard to keep up.
In April 2017, Amazon’s stock-market capitalisation reached $439.8 billion (£342.2bn). It is the world’s fourth-largest company, behind Apple, Alphabet and Microsoft. Bezos’ online store has come a long way from just selling books. Amazon Prime – the annual membership that includes, among other things, speedy delivery – has millions of customers in the UK and tens of millions in the US (Amazon declined to give us precise numbers for this feature). Amazon Web Services (AWS), its cloud-computing platform, underpins much of the internet, including Netflix and Spotify, and is itself a business with a $12 billion annual turnover. In February, Amazon Studios won its first three Oscars. Amazon is opening bricks-and-mortar shops in the US, is leasing a fleet of 40 cargo planes and has launched a publishing arm. Mechanical Turk, its crowdsourcing marketplace, has hundreds of thousands of regular contributors. The company is testing 30-minute Prime Air drone deliveries in Cambridge, UK, where it plans to hire 400 additional staff for its machine-learning research and development centre. In June, it bought supermarket chain Whole Foods, with its 400-plus retail sites, for $13.7 billion.
AI has long been at the core of Amazon’s business. “A lot of the value we’re getting from machine learning is actually happening beneath the surface,” Bezos said at the Internet Association’s annual gala in May. “It is things such as improved search results, improved product recommendations, improved forecasting for inventory management and hundreds of other things.”
With the introduction of Alexa in November 2014, Amazon has entered what Bezos calls AI’s “golden age”. Alexa is the public face of Amazon’s AI efforts; the facilitator that will help customers navigate – and consume – Amazon’s empire. It is an empire that began selling books, and now offers its own music, films and hardware, along with your daily essentials and groceries. Amazon has grown to a behemoth – but before it was held back by the rigid interface of the web. By opening the platform to third-party developers and brands, Amazon wants to introduce Alexa into every area of your life: your home, car, hospital, workplace. The everything store is about to be everywhere.
Rohit Prasad is on the cusp of something historic – and it’s written all over his face. Amazon’s head scientist has spent his career working in natural language and speech recognition at BBN Technologies for clients such as the Defense Advanced Research Projects Agency. When Amazon approached him in 2013 about the creation of a voice-powered AI, it was the chance he’d been waiting for.
Now, sitting in Amazon’s Boston office, high above the academic quarter and overlooking the gleaming Charles River, Prasad, dressed in a striped shirt and suit trousers, gesticulates energetically across the conference table. “My eyes lit up. For a long time in speech and language, we said the ultimate application is when you are liberated from eyes and hands,” he says. “I was up to the challenge.”
The vision for Alexa had already been decided upon before Prasad arrived. Like every new idea at Amazon, a press release was crafted, describing an ambient device that would wake when you called its name from across the room. Its personality would mirror Amazon’s own brand guidelines: smart, humble, helpful. It also had to be human-like, not robotic. The team at Lab126 – Amazon’s secretive California-based research division responsible for developing the Kindle, Fire TV Stick and ill-fated Fire Phone – worked backwards from that original document. (Even earlier, former Lab126 employees say the project began life as part of a shelved augmented-reality project, something Amazon has never confirmed.) Around this time, Amazon acquired two AI startups, the US speech-recognition firm Yap and Cambridge-based Evi, the foundations of which make up Alexa’s voice technology. But developing the product, Prasad says, required cracking machine-learning challenges.
In part, that was down to Bezos’ high standards. As one researcher told Bloomberg, “There was almost an irrational expectation around the functionality of the device. Jeff had a vision of full integration into every part of the shopping experience.”
When the Echo launched in 2014, it was almost instantly embraced. To date, Amazon has sold tens of millions of Alexa-enabled devices. “It was fundamentally new and different. No one had done that before,” says George.
“It was a game-changer and made it much easier to interact,” says Erik Brynjolfsson, co-author of Machine, Platform, Crowd: Harnessing the Digital Revolution and director of the MIT Initiative on the Digital Economy. “It’s taking something that you could have done before, in theory, but making it frictionless.”
At launch, there was no competition to speak of: Google Home didn’t appear until October 2016. Assistants trapped in smartphones, such as Apple’s Siri and Microsoft’s Cortana, face a key hurdle: few of us want to talk to an AI in public.
The stakes are enormous. The race for voice assistants is the race to be the next ubiquitous interface. In theory, it will replace not only the touchscreen, but the search bar. In December, Apple will begin shipping its Siri-enabled HomePod. Microsoft has revealed Cortana-powered speakers, made by third parties. Google, which in April declared itself an “AI-first company”, opened its APIs to accelerate the growth of Google Home, and is making major technical advances in image recognition and translation. In May, its DeepMind program AlphaGo beat the world’s best Go player. Facebook, too, is investing in AI, as is China’s Baidu.
It was Amazon, though, that first worked out how to get the public to invite AI into their homes and lives. The third-party voice apps, called Skills, instantly made Alexa more useful than the likes of Siri and Google Assistant. While its competitors misunderstand questions and serve up awkward search results, Alexa provides useful services such as turning on your lights, calling an Uber and teaching you to speak Chinese. There are more than 12,000 Skills available. Google’s AI advantage – and perhaps Amazon’s biggest threat – is its access to search, a seemingly endless font of knowledge. But Alexa has another trump card: connections to Amazon’s vast logistics and retail empire. Home can tell you things; Alexa can bring you things – within hours. If Google Home is the smart friend at a party, Alexa is the benign butler.
From the start, Amazon built Alexa by appointing laser-focused “satellite teams”, which concentrated on the best way for it to be integrated into each of Amazon’s retail sectors. “We have thousands of people working on Alexa across different domains and at the foundational science level,” George says. “Name the domain, name the type of interaction, and we form these single-threaded teams to go after them.”
There are teams working to constantly tweak Alexa’s personality, intonation and knowledge base in response to customer feedback, all to give the impression of an omnipotent, human-like assistant. Analysts at the Boston campus study internal alerts relating to popular unanswered Alexa questions. This helps Amazon decide what capabilities to introduce to Alexa’s personality, and what gaps in its knowledge must be filled to keep up the pretence that it’s a fully formed AI.
A lot of Alexa’s “human-like” intricacies come down to rigorous analytics and customer feedback. Amazon regards itself as the most customer-centric company on the planet, a “Jeffism” which was mentioned in conversations with all 11 executives I spoke to. Using this approach, the Alexa team built frameworks to detect trending news that it might be asked questions on, so it’s never caught short. The AI learns annual and seasonal event calendars, so it knows what a person is referring to when they ask, “What’s the score?” New capabilities, such as Alexa being able to identify a song from its lyrics, also point towards that carefully constructed, human-like ideal; it’s based on how you might ask a friend about a song in conversation. “Since Alexa is built in the cloud, we can add new features every week,” George explains.
When Prince died in April 2016, the Alexa team chose to make the AI’s responses to related questions more sensitive, since he was so well-loved. This does not automatically apply to all famous figures and consequently remains something the team manually introduces to avoid awkward outcomes. New Skills released in April allow Alexa to whisper, pause, take a breath and adjust its pitch, while “speechcons” released in the UK and Germany at the same time allow for keywords, such as “yay” and “ahem”, to be emphasised in more engaging ways.
Amazon Echo’s Competitors
- Google: Google Home, a hands-free smart speaker, is powered by Google Assistant, which can tap into the search engine’s Knowledge Graph to answer simple questions. The company has recently made major artificial intelligence advances in translation and image recognition.
- Apple: The HomePod, a 17cm-tall smart-home device, launches in December. The company promises high audio quality, and gave it a $349 price tag to match. Although it lags behind on AI, it’s notable that Apple is investing in emotion-detecting technology: in 2016 it bought Emotient, which interprets facial expressions.
- Facebook: The social network’s Messenger app is powered by a growing army of bots – there are now 100,000 of them – that can do everything from ordering a takeaway to booking a taxi. Its AI lab, Facebook Artificial Intelligence Research, launched the open-source, deep-learning framework Caffe2 in April to create industrial-strength applications with a heavy emphasis on mobile, but Facebook seems focused on computer vision.
- Microsoft: The company has no Echo competitor, but audio specialist Harman Kardon is creating a Cortana-powered speaker. It aims to dominate the business space, emphasising that Office and LinkedIn data will provide a unique selling point. This is why it has focused on laptop- and phone-based AI until now.
For a notoriously secretive company, Amazon’s success with Alexa has been in being open: a lesson learned from AWS’ rapid expansion. “If you think of our lineage, close to 50 per cent of Amazon’s global unit volume comes through the fact that we opened our platform to third-party merchants,” George says. “With AWS, we built primitive computing services in the beginning, where software developers were our primary customers. It benefited our ability to move faster, so we have this history of openness. That carried forward into the way we thought about Alexa.”
Through Voice Services, Alexa can exist in virtually any product. Through the Alexa Fund, a $100 million venture-capital effort, Amazon is also funding startups that will contribute to the platform. As a result, Alexa is being built into everything from washing machines and air purifiers to baby monitors and toothbrushes. “We’re very accepting of the fact that when we open something up, we enable people to compete with our products,” George explains. “We’re actually happy about that because it will make us better and will put Alexa in front of more people.”
“Amazon has been smart in creating platforms,” Brynjolfsson says. “It creates an ecosystem with more value. When other entities put their Skills online, Amazon benefits and consumers benefit.” Amazon is intently focused on how to speed up adoption of Skills and Alexa-enabled products. This could be through the creation of a Wi-Fi Locker, which solves the issue of entering credentials every time you buy an Echo, or new language capabilities, which let Alexa understand that you want an Uber when you say “get me a ride”. (Initially, customers had to say: “Alexa, enable Uber Skill.”) The open approach has paved the way for more profound use cases. Bob Paradiso, a New York computer engineer, creates Echo hacks for those with mobility issues. He used Alexa to design a voice-controlled hospital bed, wheelchair and entertainment system.
“One guy stuck Echo Dots on the ceiling for his disabled brother,” says Steve Rabuchin, the softly spoken vice-president of Alexa, in charge of developer relations. “It changed his life. Kids connect to their parents through Alexa to remind them to take medicines.” Rabuchin has donated Echo Dots to Evergreen Health’s neonatal intensive-care unit in Seattle, where his twin daughters were born. This lets parents ask Alexa about their child’s care. Alexa is also being used by Parkinson’s patients to practise their speech. “The world is going to solve problems that we hadn’t even thought of,” George adds.
At the foundation of its business plan for hardware is Amazon’s desire to help you spend more. Dash was introduced so that you could instantly reorder everyday items at the touch of a button. In November 2016, the company introduced Alexa-only discounts to make voice shopping more appealing.
In April – on the day WIRED visited the Seattle office – Amazon unveiled the Echo Look. Fitted with a camera, it’s the first all-seeing, all-hearing Echo, a fashion assistant that can take photographs on command. Pair it with Amazon’s Style Check Skill, and its machine-learning capabilities will rate your outfit choices. Crucially, it can also make purchase suggestions.
A few days later, the company unveiled Echo Show: an Echo with a screen, which can make video calls. A promotional video shows it helping parents keep tabs on their newborns (and ordering nappies, listening to Amazon Music and watching a Prime film). Echo Show fills a gaping hole in Echo’s capabilities: for a system that is designed to sell you things, listening to Alexa read out a list of choices is at odds with its convenience-focused design. It was also a lesson in Amazon’s ruthless streak. In September 2016, Alexa-enabled home-intercom startup Nucleus said it had raised $5.6 million in funding, largely from the Alexa Fund. The Alexa team, full of praise for Nucleus in our meetings, effectively swallowed up one of its own. Its Show video even bears an uncanny resemblance to Nucleus’ own original advert – with some Amazon retail experiences thrown in for good measure.
David Limp, Amazon’s senior vice-president of devices and services, insists that Nucleus was given advance notice about Show. “The product was not a surprise,” Limp tells me, post-launch. “I’m still a fan of the Nucleus. It’s complementary to Show. It hangs on the wall, it’s thinner. It’s a different use case in my mind. Nucleus and others can also have access to the APIs and they can be as good or potentially better than an Echo Show.”
In a scathing interview with the Recode website shortly after Show was unveiled, Nucleus founder Jonathan Frankel said, “The difference is they want to sell more detergent; we actually want to help families communicate easier. They must realise that by trying to trample over us – a premier partner in the Alexa Fund ecosystem – that they are going to really cripple that ecosystem and put a warning out for others. If they’re really willing to threaten that, it must be a huge opportunity.”
The retail opportunity for the Show and Look is enormous. But the Nucleus episode is in keeping with the Amazon that’s loved for its convenience, and sometimes loathed for the path to that convenience. A 2015 New York Times article depicted a brutal work environment at the company, one which pushed people beyond their limits. Amazon strongly disputed the account, demanding a retraction it never received. But the article corroborated a similar picture painted earlier in Brad Stone’s 2013 corporate biography The Everything Store. Most of the original team that worked on Echo at Lab126 no longer work for the firm.
Nevertheless, it’s clear that the conveniences Amazon’s customers enjoy, from same-day delivery to low costs, are the result of such dedicated and unwavering single-mindedness. The so-called Love Memo, also revealed in The Everything Store, gives an extraordinary insight into how that single-mindedness manifests throughout the company. The memo was drawn up by Bezos in the aftermath of criticism Amazon suffered for launching an app designed to get members of the public comparing real-world products with Amazon’s generally cheaper alternative online. The move was considered anti-competitive and led Bezos to ponder what makes a company of Amazon’s size loved, not feared.
He drew up a list of traits that make a company loved, as in the case of Apple or Disney, or “unloved”, such as Microsoft or Walmart, and distributed it to his top executives. The Love Memo bears a striking resemblance to what Amazon has striven to achieve with AI: “polite is cool” (Alexa is always described as polite); “young is cool” (enter the controversial Look, aimed at young people); “straightforwardness is cool” (Alexa gives no-nonsense answers); and “the unexpected is cool” (Alexa was a tightly held secret, as its follow-up products continue to be).
Amazon’s path to dominance
- 1995: Amazon.com founded as a bookstore
- 1998: Diversifies from books to include CDs and DVDs
- 2000: Marketplace opens
- 2006: AWS cloud-computing platform launches; S3 web-storage service established
- 2007: Kindle launches in US. Prime becomes available in UK, two years after US launch
- 2010: Amazon Studios opens
- 2013: Plans are unveiled for Prime Air, a drone delivery service
- 2014: Amazon Echo launches in the US
- 2015: On its 20th anniversary, Amazon has a customer base of 304 million and stock prices sit at $748.38
- 2016: Amazon Echo launches in UK. Amazon Fresh grocery service arrives in UK
- 2017: Amazon buys Whole Foods grocery chain for $13.7 billion
The depths of the secrecy around working practices at Amazon, combined with the “Jeffisms” and fingerprints of the Love Memo littered throughout our conversations, make for a uniformity of language and personality among its executives that is endearing at times, unnerving at others. Not one executive will disparage Amazon’s competition: “There will be multiple winners here,” says Limp, humbly adding, “I’m very optimistic that Alexa will be one.” (“Obsessing over competitors is not cool; capturing all the value only for the company is not cool.”) Everyone advocates the virtues of risk-taking when developing Alexa products, in the manner of a fast-growth startup. “We have a responsibility to continue to invent over the horizon for customers,” Limp explains. (“Risk-taking is cool.”)
It is uncanny how thoroughly the Love Memo, and Amazon’s need to be perceived as cool, proliferate through its culture. The Seattle headquarters is kitted out with a rooftop dog park and a craft centre, and there are whiteboards in every lift in case of spontaneous moments of creativity (every one I saw was blank, apart from one with a single Mandarin character). A conference hall that accommodates 1,600 people had recently welcomed Alec Baldwin and Al Gore to speak; the entire space is flipped into a sports court every Friday afternoon for employees to play in. At 5pm, the hallways are filled with Amazonians, canines in tow, pouring out of lifts in a scene that could not more starkly contradict the images conjured up by the New York Times article. But this is Amazon central, not a fulfilment centre outpost, or any of 30 other buildings beyond the main Doppler and Day 1 headquarters.
On the floor where I meet members of the Alexa and Echo teams, the walls and floors of the building’s empty hallways are constructed from polished black-grey concrete and marked with vast, ominous cracks. Heavy chain-mail curtains conceal the sofa cubbies where employees are expected to relax. It’s a very harsh aesthetic, and seems more akin to the Death Star than Bezos’ beloved Starship Enterprise.
Toni Reid, vice president of Alexa experience and Echo devices, spends her days overseeing a vast team of behavioural scientists and engineers, all working on refining Alexa’s personality. The goal is to understand, in aggregate, how analytics can be used to improve Alexa from the smallest conversational level to major personality features. Reid, dressed in a suit and open shirt, gingerly fingers a purple lanyard around her neck as we talk. When Reid joined the Alexa team, in what she refers to as its alpha phase, she realised that for Alexa to be human-like, it had to be more likeable.
“When my family is in the car, we automatically try to use Alexa, and she’s not there. It’s noticeable that she’s missing,” Reid says. This is the feeling they want to elicit with all customers. “Alexa should be there when you need it and disappear when you’re not using it.”
The more empathetic Alexa is, the easier it is to spend time with. “Emotion is a super hard problem,” says Prasad. “You need to know the person quite well.” If any company can do that, it’s Amazon: it knows what you wear, read, watch, listen to. Alexa may not know you yet, but Amazon knows more about you than most of your closest friends.
In September 2016, Amazon launched the Alexa Prize, which challenges university students to create a social bot that can hold a conversation for 20 minutes. Fourteen teams are competing for the £338,000 prize money. The challenge is aimed at making the AI seem more human, by ensuring it can hold an engaging conversation. “Imagine you meet someone for the first time and have to have a conversation for 20 minutes – it’s hard,” Prasad says. “The person you’re meeting has to be interesting, knowledgeable and empathetic, emotional in terms of responding to all your emotional cues. That is a pretty daunting one, not just from a spoken-language understanding perspective but world knowledge. How do you respond to the non-verbal cues? That to me is an ultimate AI. That’s the next step.”
Prasad is so optimistic about the likelihood of that reality, he is already thinking about the need for checks and balances to prevent Alexa seeming a little too human. In April, Amazon released a facility for it to bleep out profanity, a subtle reminder that it is a machine.
There is a reason Alexa has to be human-like: trust. If its ultimate goal is to be everywhere, customers must trust it enough to let it – cameras and all – into their lives.
For now, Alexa can try to empathise based on words alone. It has the potential to pick up on visual cues from cameras – if the public trusts it with that content. By launching the Echo Look, Amazon has taken the first step towards that goal, by getting cameras into homes. “We want to do it right – what you don’t want to do is misread the emotional cue and do something silly,” Prasad says.
“That’s something the industry will move towards,” says Evi founder William Tunstall-Pedoe. “The more information a voice assistant has, the better it can do. At some point it will be taking visual cues and other cues. There will be privacy concerns that the camera is on, as opposed to the Look camera, which takes a photo when you tell it to. But there have been big advances in AI, with deep neural networks making sense of what is in photos.” Alexa records every utterance, though users can delete these recordings.
Wake words used for security purposes (Amazon says Alexa does not send any data to the cloud until it hears its name) could disappear if an AI learns when somebody is looking and talking in its direction, breaking down that last element of friction in the connected home.
Amazon has always emphasised how important privacy is to the company. But when an AI is everywhere, that complicates things. In an Arkansas murder case in 2017, the company refused to hand over a suspect’s Echo voice data, relenting only after the defendant consented to its release. That’s why there is a wake word, lights and noise emitted when Alexa activates, and a mute button on the Echo. “The mute button disconnects the mic and camera,” Limp says. “If you put it in the closet, and pressed mute, no hacker could turn on the camera. It’s impossible. Unless they had a soldering iron, and you’d recognise that was happening from the smell.”
Trust has always been central to Amazon’s growth. In The Everything Store, Amazon’s then editor-in-chief Susan Benson describes how editorial quickly became important, “in creating a good shopping experience, but also in getting people comfortable about the idea that there were people on the other side of the screen that they could trust. We asked people to put a credit card into the computer, which at the time was a radical concept.” Today, there is an important reversal: customers must trust Alexa, and believe there isn’t a human watching on the other side.
The vision for Alexa as the Star Trek computer – repeated by every executive I spoke to – is steeped in nostalgia. But it also has logic: it’s not the threatening AI of Ex Machina or Her, but an optimistic vision. The association could help AI make that leap in social acceptance. Therein lies Amazon’s big bet – and Google’s, Apple’s and Microsoft’s – in the battle over AI assistants. Soon you won’t get the chance, like Reid, to miss Alexa in the car, at your office or in a hotel (it’s installed in all 4,748 rooms in the Wynn Las Vegas Hotel), because you’ll never be away from it. The more its presence spreads, the more machine learning will improve it – make it more capable and more person-like. Maybe when its voice is everywhere, privacy will become inconvenient. Life will be effortless. And underneath it all, Amazon will be there to provide your shopping and entertainment. Whatever you want, Alexa will make it so.
Liat Clark is a commissioning editor at WIRED