January 14, 2026
"Voice" in accessibility isn't one thing—it's at least two very different capabilities that serve different needs. Mixing them up leads to products that technically have "voice support" but don't actually help the people who need it most.
This article breaks down voice control vs. dictation in the context of accessibility, with practical guidance for product teams. For a broader overview of designing voice features that work for everyone, see our accessibility and speech guide.
Dictation converts speech to text: you speak, and the words appear on screen.
Primary accessibility use case: people who can't type efficiently due to motor impairments, repetitive strain injuries, or temporary conditions.
What it enables: composing emails, documents, messages, and any other text entirely by voice, without touching a keyboard.
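For teams starting in the browser, a minimal dictation sketch using the Web Speech API looks something like this. The `#editor` selector is a placeholder, and a real product would merge transcripts more carefully than this simple overwrite:

```typescript
// Minimal browser dictation sketch using the Web Speech API.
// The constructor is vendor-prefixed in some browsers, so we
// cast through `any` rather than rely on typed globals.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionImpl();
recognition.continuous = true;     // keep listening across natural pauses
recognition.interimResults = true; // stream partial text so users see progress

recognition.onresult = (event: any) => {
  let transcript = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    transcript += event.results[i][0].transcript;
  }
  const target = document.querySelector<HTMLTextAreaElement>("#editor");
  if (target) target.value = transcript; // naive overwrite; real apps merge smarter
};

recognition.start();
```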
Voice control converts speech to commands: you speak, and the system does something.
Primary accessibility use case: people who can't use a mouse or keyboard to navigate and control applications.
What it enables: clicking buttons, opening menus, scrolling, switching applications, and otherwise operating the interface entirely by voice.
A user with limited hand mobility might need both: dictation to compose a message, and voice commands to navigate to the right field and send it.
A product that only offers dictation leaves that user unable to navigate. A product that only offers voice commands doesn't help them create content. Both are needed for full accessibility.
When building dictation for accessibility, three things matter most: recognition tolerance, frictionless correction, and voice-only completeness.
Users may have speech differences, use communication devices, or speak in non-standard patterns. Build tolerance for varied pacing, long pauses, atypical pronunciation, and synthesized speech from AAC devices.
Dictation errors are inevitable. Make correction frictionless: a user should be able to fix a misrecognized word by voice alone, without reaching for the keyboard, as in the sketch below.
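As one way this could work, here is a hypothetical handler for a "correct X to Y" command. The phrasing and the `applyCorrection` helper are illustrative, not a standard API:

```typescript
// Hypothetical correction handler: lets a user fix a misrecognized
// word by voice alone, e.g. "correct john to joan".
function applyCorrection(transcript: string, command: string): string | null {
  const match = /^correct (.+) to (.+)$/i.exec(command.trim());
  if (!match) return null; // not a correction command

  const [, wrong, right] = match;
  // Replace the most recent occurrence, since users usually
  // correct what they just dictated.
  const index = transcript.toLowerCase().lastIndexOf(wrong.toLowerCase());
  if (index === -1) return null;

  return transcript.slice(0, index) + right + transcript.slice(index + wrong.length);
}

// applyCorrection("Dear John, I wanted to follow up.", "correct john to Joan")
//   -> "Dear Joan, I wanted to follow up."
```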
If someone is using dictation because they can't type, they may also not be able to easily switch between voice and keyboard: every step of the workflow, including corrections, formatting, and submission, must be completable by voice alone.
When building voice control for accessibility, it's worth studying how Apple's Voice Control and Microsoft's Voice Access handle common patterns: numbered overlays on interactive elements, grid-based targeting for arbitrary screen positions, and commands that reference elements by their visible names. A minimal sketch of the numbered-overlay pattern follows.
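As a rough illustration of that overlay pattern, this DOM sketch badges interactive elements with numbers so a user can say "click 3" instead of describing a target. The selector list and styling are simplified placeholders:

```typescript
// Sketch of the "numbered overlay" pattern. A real implementation
// needs high-contrast styling, repositioning on scroll, and cleanup.
function showNumberedOverlays(): Map<number, HTMLElement> {
  const targets = document.querySelectorAll<HTMLElement>(
    "a, button, input, select, textarea, [role='button']"
  );
  const byNumber = new Map<number, HTMLElement>();

  targets.forEach((el, i) => {
    const label = i + 1;
    byNumber.set(label, el);

    const badge = document.createElement("span");
    badge.textContent = String(label);
    const rect = el.getBoundingClientRect();
    badge.style.cssText =
      `position:fixed; left:${rect.left}px; top:${rect.top}px; ` +
      `background:#000; color:#fff; padding:1px 4px; z-index:9999;`;
    document.body.appendChild(badge);
  });

  return byNumber;
}

const overlays = showNumberedOverlays();
// On hearing "click 3": overlays.get(3)?.click();
```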
Users shouldn't need to memorize a command vocabulary: make commands discoverable with an on-demand command list (both systems offer one, via prompts like "Show commands" or "What can I say").
Don't require exact syntax. Accept variations: "click send," "press send," and "tap send" should all trigger the same action, as sketched below.
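A small sketch of this kind of tolerance, using a synonym table. `VERB_SYNONYMS` and `parseCommand` are illustrative names, not part of any platform API:

```typescript
// Normalize synonyms, then match against known actions, so natural
// variations all resolve to the same command.
const VERB_SYNONYMS: Record<string, string> = {
  click: "click", press: "click", tap: "click", hit: "click",
  open: "open", launch: "open", show: "open",
};

function parseCommand(utterance: string): { verb: string; target: string } | null {
  const words = utterance.toLowerCase().trim().split(/\s+/);
  const verb = VERB_SYNONYMS[words[0]];
  if (!verb || words.length < 2) return null;
  return { verb, target: words.slice(1).join(" ") };
}

// All of these resolve to { verb: "click", target: "send" }:
// parseCommand("click send"), parseCommand("press send"), parseCommand("tap send")
```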
Users need to know: whether the system is listening, which mode it's in, and whether their last command was recognized. A persistent, glanceable status indicator covers all three.
Misrecognition is common. When the system doesn't understand: say so explicitly, show what was heard, and offer a quick retry, rather than failing silently or executing a best guess. The sketch below wires this up.
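A sketch of explicit feedback using the Web Speech API's event handlers; the `#voice-status` element is a placeholder for whatever status UI the product uses:

```typescript
// Never fail silently: reflect listening state, recognized speech,
// and errors in a visible status element.
const status = document.querySelector<HTMLElement>("#voice-status")!;
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;
const recognition = new SpeechRecognitionImpl();

recognition.onstart = () => { status.textContent = "Listening…"; };
recognition.onresult = (event: any) => {
  const heard = event.results[event.results.length - 1][0].transcript;
  status.textContent = `Heard: "${heard}"`; // always show what was recognized
};
recognition.onnomatch = () => {
  status.textContent = 'Didn\'t catch that. Try again, or say "show commands".';
};
recognition.onerror = (event: any) => {
  status.textContent = `Recognition error: ${event.error}`;
};
recognition.start();
```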
The most powerful accessibility voice features combine dictation and control:
Allow commands within dictation flow: "Dear John comma paragraph I wanted to follow up period send message" should produce the text, the punctuation, and the send action in a single utterance.
Provide clear ways to switch between "dictation mode" (everything becomes text) and "command mode" (everything triggers actions).
Interpret speech based on context: in a text field, for example, "delete" removes the last word, while the same word with a file selected triggers a delete action. A sketch combining these patterns follows.
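Putting these together, a hypothetical mixed-mode interpreter might peel a trailing command off an utterance and convert spoken punctuation in what remains. The command list and punctuation map below are illustrative, not a standard vocabulary:

```typescript
// Mixed-mode interpretation sketch: spoken punctuation becomes text,
// while a recognized trailing command triggers an action.
const SPOKEN_PUNCTUATION: Record<string, string> = {
  comma: ",", period: ".", "question mark": "?", paragraph: "\n\n",
};
const COMMANDS = ["send message", "delete that", "undo"];

function interpret(utterance: string): { text: string; command?: string } {
  let remaining = utterance.toLowerCase().trim();
  let command: string | undefined;

  // Peel a known command off the end of the utterance, if present.
  for (const c of COMMANDS) {
    if (remaining.endsWith(c)) {
      command = c;
      remaining = remaining.slice(0, -c.length).trim();
      break;
    }
  }

  // Turn spoken punctuation tokens into symbols, then tidy spacing.
  let text = remaining;
  for (const [spoken, symbol] of Object.entries(SPOKEN_PUNCTUATION)) {
    text = text.replace(new RegExp(`\\b${spoken}\\b`, "g"), symbol);
  }
  return {
    text: text.replace(/\s+([,.?])/g, "$1").replace(/[ \t]*\n\n[ \t]*/g, "\n\n"),
    command,
  };
}

// interpret("Dear John comma paragraph I wanted to follow up period send message")
//   -> { text: "dear john,\n\ni wanted to follow up.", command: "send message" }
//      (capitalization handling is left out for brevity)
```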
No amount of design thinking substitutes for testing with actual users who rely on voice for accessibility: recruit participants with motor impairments and speech differences, have them complete real tasks end to end by voice, and pay attention to where they stall or give up.
The W3C Web Content Accessibility Guidelines (WCAG) provide the foundation for accessibility compliance. Our WCAG compliance guide for captions and transcripts covers the specific requirements for audio and speech features.
When evaluating speech-to-text providers for accessibility applications, accuracy matters more than in general use—errors can completely block users who have no fallback input method. See our STT API comparison for guidance on evaluating providers with accessibility in mind.