Build a Speech‑Controlled Robot with Windows 10 IoT Core on Raspberry Pi 2
Story
Early computing relied on punch cards, trackballs, and keyboards—each requiring direct physical contact. As technology evolved, wireless and touch interfaces streamlined user interaction. Today, visual and voice input offer even more intuitive ways to control devices.
This guide demonstrates how to leverage Windows 10 IoT Core’s built‑in Speech Recognition to command a robot built on a Raspberry Pi 2. By the end, you’ll have a robot that moves, turns, and manages obstacle detection through spoken commands.
New to Windows 10 IoT Core? Start with this overview.
Updated March 30, 2016
What is Speech Recognition?
Speech Recognition translates spoken words into text. It typically involves two core components: signal processing and a speech decoder. Microsoft’s Speech SDK abstracts these complexities, allowing developers to focus on application logic.
Step 1
Getting Started with Speech Recognition
The basic workflow is:
- Create a Speech Recognition Grammar (SRGS)
- Instantiate SpeechRecognizer and load the grammar
- Subscribe to recognition events and implement handlers
Create Speech Recognition Grammar
Defining a custom grammar lets the app understand the specific commands you want the robot to accept. For this project, the vocabulary includes:
- Move Forward
- Move Reverse
- Turn Right
- Turn Left
- Stop
- Engage Obstacle Detection
- Disengage Obstacle Detection
The grammar is expressed in SRGS XML. Key structural rules:
- The root element must be <grammar>.
- It must include version, xml:lang, and the SRGS namespace.
- At least one <rule> element is required.
- Each rule must have a unique id attribute.
For detailed SRGS guidance, consult the MSDN and W3C documentation.
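A grammar for the seven commands could look roughly like the XML embedded in the sketch below. The rule id `robotCommands`, the `en-US` language tag, and the helper names `check_grammar` and `list_phrases` are assumptions made for illustration; the sample project's own grammar file may differ. The check is written in Python only so you can run it on a development machine before deploying; the robot app itself remains a Windows 10 IoT Core project.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical grammar covering the seven commands in this guide.
SRGS_XML = """\
<grammar version="1.0" xml:lang="en-US" root="robotCommands"
         xmlns="http://www.w3.org/2001/06/grammar">
  <rule id="robotCommands">
    <one-of>
      <item>move forward</item>
      <item>move reverse</item>
      <item>turn right</item>
      <item>turn left</item>
      <item>stop</item>
      <item>engage obstacle detection</item>
      <item>disengage obstacle detection</item>
    </one-of>
  </rule>
</grammar>
"""


def check_grammar(xml_text):
    """Return a list of structural problems (empty if none were found)."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{SRGS_NS}}}grammar":
        problems.append("root element is not <grammar> in the SRGS namespace")
    if "version" not in root.attrib:
        problems.append("missing version attribute")
    if f"{{{XML_NS}}}lang" not in root.attrib:
        problems.append("missing xml:lang attribute")
    rule_ids = [rule.get("id") for rule in root.iter(f"{{{SRGS_NS}}}rule")]
    if not rule_ids:
        problems.append("no <rule> element")
    if None in rule_ids or len(set(rule_ids)) != len(rule_ids):
        problems.append("rule ids are missing or not unique")
    return problems


def list_phrases(xml_text):
    """Collect the phrase offered by every <item> in the grammar."""
    root = ET.fromstring(xml_text)
    return [item.text.strip() for item in root.iter(f"{{{SRGS_NS}}}item")]


if __name__ == "__main__":
    print(check_grammar(SRGS_XML))
    print(list_phrases(SRGS_XML))
```

An empty list from `check_grammar` only means the four structural rules hold; it does not guarantee that the recognizer will compile the file.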
Initialize Speech Recognizer and Load Grammar
SpeechRecognizer resides in the Windows.Media.SpeechRecognition namespace. Import the namespace, create a SpeechRecognizer, then add your SRGS XML file as a grammar constraint and compile it. If grammar compilation fails, verify that a microphone is connected and recognized by IoT Core.
Register for Speech Recognizer Events and Create Handler
Once continuous recognition is running, the recognizer's ContinuousRecognitionSession raises ResultGenerated each time it successfully parses speech. In the handler, read args.Result.Text and map it to a robot action. The recognizer's StateChanged event reports state transitions, such as when it starts or stops listening.
In Visual Studio, typing += after an event name and pressing Tab generates a handler stub. Register both handlers right after creating the SpeechRecognizer instance.
Step 2
How to Drive on Parsed Speech
In the ResultGenerated handler, inspect args.Result.Text and perform conditional logic to control the robot’s motors. The MotorDriver class (included in the sample) abstracts GPIO manipulation. Full source is provided at the end of the article.
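The conditional logic can be sketched as a lookup table from recognized phrase to action. The version below is a language-neutral illustration in Python, not the sample's code: the `MotorDriver` stub and its method names are hypothetical stand-ins for the real class, and `Robot.on_result` plays the role of the ResultGenerated handler body.

```python
class MotorDriver:
    """Stand-in for the sample's MotorDriver class.

    The method names are hypothetical. The real class drives GPIO pins,
    while this stub only records which action was requested.
    """

    def __init__(self):
        self.actions = []

    def move_forward(self):
        self.actions.append("forward")

    def move_reverse(self):
        self.actions.append("reverse")

    def turn_right(self):
        self.actions.append("right")

    def turn_left(self):
        self.actions.append("left")

    def stop(self):
        self.actions.append("stop")


class Robot:
    def __init__(self, driver):
        self.driver = driver
        self.obstacle_detection = False
        # Recognized phrase -> action, one entry per grammar command.
        self.commands = {
            "move forward": driver.move_forward,
            "move reverse": driver.move_reverse,
            "turn right": driver.turn_right,
            "turn left": driver.turn_left,
            "stop": driver.stop,
            "engage obstacle detection": lambda: self._set_detection(True),
            "disengage obstacle detection": lambda: self._set_detection(False),
        }

    def _set_detection(self, enabled):
        self.obstacle_detection = enabled

    def on_result(self, text):
        """Handle recognized text; return True if it matched a command."""
        action = self.commands.get(text.strip().lower())
        if action is None:
            return False  # Unknown phrase: leave the motors as they are.
        action()
        return True


if __name__ == "__main__":
    robot = Robot(MotorDriver())
    for phrase in ["Move Forward", "Turn Left", "Engage Obstacle Detection", "Stop"]:
        robot.on_result(phrase)
    print(robot.driver.actions, robot.obstacle_detection)
```

A table keeps the phrase list in one place, which makes it easier to keep in step with the grammar than a long if/else chain.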
Step 3
Update Device Capability
Before deploying to the Raspberry Pi 2, add the microphone capability to your app’s package manifest. This grants the app permission to access audio input.
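In the manifest XML, the capability is a DeviceCapability element named microphone inside Capabilities. The sketch below embeds a trimmed-down manifest (a real Package.appxmanifest has many more elements) and checks for that entry; the helper name `has_microphone_capability` is an assumption for illustration, and the check is in Python only so it can run on a development machine.

```python
import xml.etree.ElementTree as ET

MANIFEST_NS = "http://schemas.microsoft.com/appx/manifest/foundation/windows10"

# Trimmed-down manifest: only the part relevant to audio input is shown.
MANIFEST_XML = """\
<Package xmlns="http://schemas.microsoft.com/appx/manifest/foundation/windows10">
  <Capabilities>
    <DeviceCapability Name="microphone" />
  </Capabilities>
</Package>
"""


def has_microphone_capability(manifest_text):
    """Return True if the manifest declares the microphone device capability."""
    root = ET.fromstring(manifest_text)
    for capability in root.iter(f"{{{MANIFEST_NS}}}DeviceCapability"):
        if capability.get("Name") == "microphone":
            return True
    return False


if __name__ == "__main__":
    print(has_microphone_capability(MANIFEST_XML))
```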
Once the software is ready, wire the hardware as described below.
Step 4
Deploy & Register App as Startup Application
To ensure the robot listens for commands immediately after boot, register your app as a startup application. Deploy it first, then use either PowerShell or the IoT Core Web‑Management Portal.
It’s a good idea to change the package family name before deployment to avoid conflicts.
This guide registers the app as a startup application through the Web‑Management Portal.
If you encounter registration issues, refer to this troubleshooting guide.
After a successful registration, reboot the Raspberry Pi 2 and confirm that the app starts automatically.
Schematic
The robot’s hardware consists of a chassis with DC motors, a Raspberry Pi 2 running Windows 10 IoT Core, a 9‑12 V motor battery, a distance sensor, and power supplies. The motor battery feeds the H‑Bridge driver, while the Raspberry Pi 2 requires a dedicated 5 V source—either a USB PowerBank or a 7805 regulator.
Why Resistors with Ultrasonic Distance Sensor?
The ultrasonic sensor outputs 5 V on its Echo pin, which exceeds the Raspberry Pi 2’s 3.3 V logic level. A voltage divider (R1 = 1 kΩ, R2 = 2 kΩ) reduces the voltage to about 3.3 V:
Vout = 5 V × (2 kΩ / (1 kΩ + 2 kΩ)) ≈ 3.33 V
WARNING: Directly connecting the 5 V Echo pin to a Pi GPIO can damage the board. Always use a level shifter or voltage divider.
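If you substitute other resistor values, the same arithmetic is easy to re-check. The helper below is a small sketch; the function name `divider_vout` is an assumption for illustration, and R1 is taken to be the resistor between the Echo pin and the GPIO, with R2 between the GPIO and ground.

```python
def divider_vout(vin, r1, r2):
    """Voltage across r2 in a two-resistor divider (r1 on the input side)."""
    return vin * r2 / (r1 + r2)


if __name__ == "__main__":
    vout = divider_vout(5.0, 1_000, 2_000)
    print(f"Echo pin after divider: {vout:.2f} V")  # prints 3.33 V
```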
Final Assembly
Known Issues
Speech Recognition Won’t Work (Build 10586)
Speech recognition fails on IoT devices running Windows IoT Core build 10586.
Solution: Revert to build 10240 until Microsoft releases an update that resolves the issue.
Microphone Problem
Recognition accuracy drops with low‑quality microphones, especially at distances over 1–2 m.
Solution: Use a high‑quality or wireless microphone. If necessary, amplify the signal or consider a noise‑cancelling headset.
Recognizer Processing Delay
Speech recognition introduces a latency of 600–2000 ms. For fast‑moving robots, this delay means the robot keeps executing the previous command for some distance after a new one is spoken.
Solution: None at present. Current SDK versions offer no way to reduce this delay, though future releases may.
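To get a feel for what the delay means on the floor, multiply it by the robot's speed. The sketch below assumes a speed of 0.5 m/s, which is an illustrative figure rather than a measurement of this chassis; the function name is likewise hypothetical.

```python
def distance_during_delay(speed_m_per_s, delay_ms):
    """Distance in metres covered while a command is still being recognized."""
    return speed_m_per_s * delay_ms / 1000.0


if __name__ == "__main__":
    speed = 0.5  # Assumed robot speed in m/s; measure your own chassis.
    for delay_ms in (600, 2000):
        print(f"{delay_ms} ms -> {distance_during_delay(speed, delay_ms):.2f} m")
```

Under that assumption, a spoken "Stop" takes effect between 0.3 m and 1 m later, so a lower motor speed is one way to keep the overshoot small.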
Pronunciation Difference
Accents and regional pronunciations can affect recognition. Specify the language and region in the SRGS file (e.g., xml:lang="en-GB" for UK English).
Environmental Noise
Background noise can reduce accuracy. While it’s hard to eliminate, using a noise‑cancelling microphone can mitigate the issue.
USB Microphone / USB Sound Card Not Recognized
Starting with build 10531, Windows IoT Core supports generic USB audio devices. If your device uses a proprietary driver, it may not work.
Solution: Try a different USB microphone or sound card that uses a generic driver.
Future Enhancements
Extend the robot with visual feedback—e.g., a green LED lights up for successful commands and a red LED indicates errors. You can also add “listening” and “sleep” states to avoid accidental activations.
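One possible shape for the "listening" and "sleep" states is a small gate in front of the command handler. The sketch below is an assumption-laden illustration in Python: the phrases "wake up" and "go to sleep", the class name `CommandGate`, and the `feedback_led` helper are all hypothetical, and the phrases would also have to be added to the grammar.

```python
class CommandGate:
    """Pass commands through only while the robot is in the listening state."""

    WAKE_PHRASE = "wake up"       # Hypothetical phrase, not in the grammar above
    SLEEP_PHRASE = "go to sleep"  # Hypothetical phrase, not in the grammar above

    def __init__(self):
        self.listening = False  # Start asleep to avoid accidental activation.

    def filter(self, text):
        """Return the command to execute, or None if it should be ignored."""
        phrase = text.strip().lower()
        if phrase == self.WAKE_PHRASE:
            self.listening = True
            return None
        if phrase == self.SLEEP_PHRASE:
            self.listening = False
            return None
        return phrase if self.listening else None


def feedback_led(command):
    """Pick the LED colour: green for an accepted command, red otherwise."""
    return "green" if command is not None else "red"


if __name__ == "__main__":
    gate = CommandGate()
    for spoken in ["move forward", "wake up", "move forward", "go to sleep", "stop"]:
        command = gate.filter(spoken)
        print(spoken, "->", command, feedback_led(command))
```

In this sketch the wake and sleep phrases themselves light the red LED because they return no command; a real implementation might give state changes their own indication.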
Did You Notice?
The animated title showcases a feature not covered in this text. Explore the animation carefully and try to implement the hidden capability.
Good luck!
Source: Speech Controlled Robot