
Why 2017 Became the Year of Voice Interfaces

In recent years, breakthroughs in automatic speech recognition (ASR) have turned voice from a novelty into a primary interface for countless devices. IEEE Spectrum dubbed 2017 the “year of voice recognition,” and ZDNet reported at CES that voice is the next major computer interface. Here’s an overview of today’s voice ecosystem and the technologies that power it.

How many of your devices converse with you?
Voice activation is now ubiquitous. Every flagship smartphone—iPhone 7, Galaxy S7—comes with always‑on voice assistants, as do smartwatches, wearables, and hearables such as Apple AirPods and Samsung Gear IconX. Many devices lack a traditional UI, making voice the only practical way to interact. New cameras (e.g., GoPro Hero 5) accept voice commands for hands‑free selfies, and car infotainment systems have made voice‑controlled audio the norm for safer driving.

The Amazon Echo sparked the conversational assistant boom. Alexa, Amazon’s voice service, ships with dozens of built‑in skills—jokes, sports scores, movie trivia—and a host of Easter eggs. Developers can extend Alexa’s capabilities with the Alexa Skills Kit (ASK). For instance, a hobbyist turned his iRobot Roomba into a voice‑controlled vacuum by creating a custom skill. Alexa also powers practical services such as ordering food, hailing rides, and controlling smart‑home devices from Whirlpool, GE, and others.

Amazon still leads the market, but competitors are closing the gap. Facebook CEO Mark Zuckerberg spent a year building a personal home assistant he calls “Jarvis,” and hired Morgan Freeman to voice it. The system can identify users by voice, recognize faces, and grant secure access to a home. In Japan, Gatebox offers a holographic assistant named Azuma Hikari, combining a speaker, projector, camera, and motion sensors to create a more immersive experience.

How does far‑field voice pickup work?
Understanding spoken commands while background music plays—or across a room—requires several advanced technologies:

Deep learning powers modern ASR. Neural networks trained on massive datasets now approach, and on some benchmarks match, human performance in speech and image recognition. Once trained, the models run in real time on device hardware.

Adaptive beamforming enhances reliability by focusing on the speaker’s direction. Devices like the Echo use seven microphones (six in a hexagonal ring around a central one) to measure the differences in a signal’s arrival time across the array, track a moving speaker, and separate multiple voices.

Beamforming using a hexagonal microphone array (Source: CEVA)
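
The arrival-time idea can be illustrated with a delay-and-sum beamformer, the simplest fixed (non-adaptive) relative of the adaptive beamforming described above. What follows is a minimal sketch, not the method used in the Echo or any CEVA product. The 4 cm array radius, the 48 kHz sample rate, the far-field plane-wave model, and every function name are assumptions chosen for illustration.

```python
import math
import random

SPEED_OF_SOUND = 343.0  # m/s, approximate value in room-temperature air


def mic_positions(radius=0.04):
    """Seven mics: one at the centre, six on a hexagon (radius is an assumed value)."""
    ring = [(radius * math.cos(k * math.pi / 3), radius * math.sin(k * math.pi / 3))
            for k in range(6)]
    return [(0.0, 0.0)] + ring


def steering_delays(positions, angle_deg, sample_rate):
    """Per-mic lag, in whole samples, for a plane wave arriving from angle_deg."""
    ux, uy = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    # A mic lying farther toward the source hears the wavefront earlier.
    arrival = [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in positions]
    earliest = min(arrival)
    return [round((t - earliest) * sample_rate) for t in arrival]


def simulate(source, delays):
    """Toy capture: each mic records the source shifted by its own lag."""
    pad = max(delays)
    return [[0.0] * d + source + [0.0] * (pad - d) for d in delays]


def delay_and_sum(channels, delays):
    """Undo each channel's assumed lag, then average the aligned samples."""
    n = len(channels[0]) - max(delays)
    return [sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(n)]


def power(signal):
    return sum(v * v for v in signal) / len(signal)


rng = random.Random(0)
source = [rng.uniform(-1.0, 1.0) for _ in range(2000)]  # wideband stand-in for speech
positions = mic_positions()
true_delays = steering_delays(positions, 0.0, 48000)
channels = simulate(source, true_delays)

on_target = power(delay_and_sum(channels, true_delays))
off_target = power(delay_and_sum(channels, steering_delays(positions, 90.0, 48000)))
print("power when steered at the talker:", on_target)
print("power when steered 90 degrees away:", off_target)
```

Steering toward the talker lines the seven copies up so that they add coherently, while steering elsewhere leaves them misaligned and partly cancelling. An adaptive beamformer goes further by updating its weights as the talker moves or as interfering sources appear, which this sketch does not attempt.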

Acoustic echo cancellation removes the device’s own audio (music, spoken responses) so it can still listen for user commands. The system models the sound it generates, either by analyzing output data or by using a dedicated mic, and subtracts it from the incoming signal. This allows users to interrupt (“barge‑in”) even while music plays.

Acoustic echo cancellation (Source: CEVA)
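
The "model the output, then subtract it" step is commonly implemented with an adaptive filter. The sketch below uses normalized least mean squares (NLMS), a textbook choice and not necessarily what any shipping device runs. The four-tap echo path, the filter length, the step size, and the function names are all made up for illustration, and there is no near-end speech or double-talk handling.

```python
import random


def apply_echo_path(signal, path):
    """Toy room: the mic hears the loudspeaker signal convolved with `path`."""
    return [sum(path[k] * signal[n - k] for k in range(len(path)) if n - k >= 0)
            for n in range(len(signal))]


def nlms_echo_canceller(reference, mic, taps=8, step=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `reference` from `mic`."""
    weights = [0.0] * taps   # running estimate of the echo path
    history = [0.0] * taps   # most recent reference samples, newest first
    cleaned = []
    for ref_sample, mic_sample in zip(reference, mic):
        history = [ref_sample] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = mic_sample - echo_estimate   # what is left after subtraction
        norm = sum(x * x for x in history) + eps
        # NLMS update: nudge the weights in the direction that shrinks the error.
        weights = [w + step * error * x / norm for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned, weights


def power(signal):
    return sum(v * v for v in signal) / len(signal)


rng = random.Random(1)
playback = [rng.uniform(-1.0, 1.0) for _ in range(4000)]  # stand-in for music
echo_path = [0.0, 0.6, 0.3, -0.1]                         # hypothetical room response
mic = apply_echo_path(playback, echo_path)                # echo only, nobody speaking

cleaned, weights = nlms_echo_canceller(playback, mic)
echo_power = power(mic[-1000:])
residual_power = power(cleaned[-1000:])
print("echo power before cancellation:", echo_power)
print("residual power after cancellation:", residual_power)
```

Once the filter has converged, anything left in the cleaned signal is sound the device did not produce itself, which is what makes barge-in possible. Practical cancellers typically freeze or slow adaptation while the user is speaking so that the voice does not corrupt the echo-path estimate.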

Additional echo‑reduction techniques, such as dereverberation, filter out unpredictable reflections from walls and other surfaces, ensuring the ASR engine receives a clean voice sample.

Today’s voice interfaces are still evolving
While 2017 marks a significant milestone in voice adoption, many challenges remain—latency, context awareness, multi‑speaker support, and privacy. I’ll explore these shortcomings in a future post, so stay tuned.

Eran Belaish is Product Marketing Manager for CEVA’s audio and voice portfolio, developing solutions from voice triggering to wireless audio. When not immersed in sound technology, he enjoys freediving in silent, underwater worlds.
