How to Use HTML5 Speech Input Fields

By Craig Buckler | HTML5 | Programming

I’m lost for words. Which is a shame because I could have dictated this article directly into my browser!

The recently released Chrome 11 has speech recognition enabled by default. If you’re using Chrome, head over to the speech input demonstration page and click the microphone icon…

Impressed? The results will depend on your accent and what you’re saying. My attempt at “HTML5 speech input” resulted in “html fonts p chip foose”! In general, though, regularly-used English words and numbers are parsed surprisingly well given that the system isn’t trained to recognize your particular dulcet tones.

Let’s take a look at the HTML code required for speech input:

<input type="text" x-webkit-speech />

Or, if you prefer XHTML-like syntax:

<input type="text" x-webkit-speech="x-webkit-speech" />

The x-webkit-speech attribute can be used on any HTML5 input element with a type of text, number, tel, or search. Unfortunately, it’s not permitted on textarea fields. I suspect that’s to stop people using it for long dictations, which could produce inaccurate text or high memory usage.
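As a rough sketch of that rule, the snippet below adds the attribute from script, but only to input elements whose type is in the list above. The `enableSpeech` helper and the `SPEECH_TYPES` constant are illustrative names, not part of any browser API, and it assumes a browser that honours the attribute when it is set dynamically.

```javascript
// Input types the article lists as accepting x-webkit-speech.
const SPEECH_TYPES = ["text", "number", "tel", "search"];

// Hypothetical helper: adds the attribute if the element is an eligible input.
// Returns true when the attribute was added, false otherwise.
function enableSpeech(element) {
  const tag = (element.tagName || "").toLowerCase();
  // An input without a type attribute behaves as type="text".
  const type = (element.getAttribute("type") || "text").toLowerCase();
  if (tag !== "input" || !SPEECH_TYPES.includes(type)) {
    return false; // e.g. textarea, checkbox, password
  }
  element.setAttribute("x-webkit-speech", "");
  return true;
}

// In a page, try every input element (skipped when there is no document).
if (typeof document !== "undefined") {
  document.querySelectorAll("input").forEach(enableSpeech);
}
```

Writing the attribute directly in the markup, as shown earlier, remains the simplest option; a helper like this is only useful when fields are generated by script.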

The following JavaScript code can be used to test whether speech input is enabled:


// webkitSpeech is undefined on input elements in browsers without speech support
if (document.createElement("input").webkitSpeech === undefined) {
	alert("Speech input is not supported in your browser.");
}

It’s unlikely you’ll require this since browsers which don’t support speech will show a standard input field. However, you could use it before assigning an ‘onwebkitspeechchange’ event handler to run a function after speech has been processed.
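To illustrate that last point, here is a minimal sketch that runs the support test and only then assigns an `onwebkitspeechchange` handler. The helper names `supportsSpeech` and `onSpeech` are made up for this example, and the handler simply reads the field's `value` after recognition rather than relying on any properties of the event object.

```javascript
// Same test as above, applied to a given input element.
function supportsSpeech(input) {
  return input.webkitSpeech !== undefined;
}

// Hypothetical helper: assigns the speech-change handler only when speech
// input is detected. Returns true if the handler was attached.
function onSpeech(input, callback) {
  if (!supportsSpeech(input)) {
    return false; // the field stays a standard input
  }
  input.onwebkitspeechchange = function () {
    // By this point the recognised text has been placed in the field.
    callback(input.value);
  };
  return true;
}

// In a page, wire up the first speech-enabled field, if there is one.
if (typeof document !== "undefined") {
  const field = document.querySelector("input[x-webkit-speech]");
  if (field) {
    onSpeech(field, function (text) {
      console.log("Heard: " + text);
    });
  }
}
```

A search form, for instance, could pass a callback that submits the form as soon as the spoken query has been filled in.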

Speech input is one of the most innovative browser technologies to appear in recent months. It’s easy to implement and there are several obvious uses:

  • assistive dictation for those with impaired mobility
  • an alternative input option for mobile phones and tablets, and
  • any environment where a keyboard or mouse is impractical.

I suspect we’ll see weird and wonderful uses in games and educational tools.

Will you add speech input support to your application? Does Chrome understand you better than your partner? All comments welcome…

Craig Buckler

Craig is a Director of OptimalWorks, a UK consultancy dedicated to building award-winning websites implementing standards, accessibility, SEO, and best-practice techniques.

{ 17 comments }

mysa vijay November 12, 2011 at 9:26 am

Thanks… I learnt a lot. It helps people who are too lazy to type, and it helps people whose pronunciation matches.

Aneek November 8, 2011 at 9:24 am

Simply awesome.
I found a speech input field on a blog, then started to search for more about it and found this article. I’m soon going to implement it in my website’s search.

Thank you.

ldg May 21, 2011 at 9:47 am

Yes, it uses a server-side, highly sophisticated speech recognition back-end. Google has spent a lot of resources to get speech recognition to work with the Android platform ( http://developer.android.com/resources/articles/speech-input.html ), and Chrome uses a version of that service.

HTML5 provides the interface (see: http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html ), the developer needs to provide the actual speech recognition API. I assume that Chrome uses the Google API as a default (using Charles or similar, you can see the I/O), but eventually you will be able to specify alternative services.

seun May 19, 2011 at 4:06 am

brilliant. Does anyone know how to use this feature offline?

Luke May 18, 2011 at 3:32 am

Not wanting to come off as a code-nazi, but you mention HTML markup, and include an XHTML-style self-close in the code.
<input type="text" x-webkit-speech />
Should it not be:
<input type="text" x-webkit-speech>

Craig Buckler May 18, 2011 at 6:02 am

HTML5 browsers won’t care if you include a closing slash or not. I prefer well-formed code, but it’s up to you.

Ralph May 17, 2011 at 5:42 pm

Hmmm, it’s quite buggy, but fun all the same. I voiced “dog” and it inserted “it is dangerous to dogs”. I tried again, and it read “turn off your dog”. Weird. (It’s fun to see what happens when you say naughty words, too!)

Craig Buckler May 18, 2011 at 12:26 am

Bizarre. The quality of your microphone, background noise and your accent will affect results. “Dog” worked for me.

Stormrider May 17, 2011 at 3:25 pm

I’m guessing, given the time it took to process, that the speech is uploaded and analysed at Google, where they can employ far more sophisticated technology than is possible in a browser on a desktop PC. I’m fairly convinced that Android phones do the same for speech input as well, though I could be wrong.

Craig Buckler May 18, 2011 at 12:19 am

I’m not convinced that’s the case … especially given the speed of my connection!

Ahmad Alfy May 17, 2011 at 2:15 pm

IMPRESSIVE!!! WOW THIS IS AMAZING!

Victor Matson May 17, 2011 at 1:21 pm

OK, I’ll bite. How do you enable it?

Helen Natasha Moore May 17, 2011 at 5:05 pm

Chrome and a microphone?

Craig Buckler May 18, 2011 at 12:23 am

Or, more specifically, Chrome 11 and a microphone. Click Chrome’s tool icon then “About Google Chrome” to check your version.

uddhavarote@ymail.com May 18, 2011 at 12:52 am

Yep…chrome and a microphone !

Jess May 17, 2011 at 1:06 pm

I can envision applications for this in foreign language learning.

Vaishakh Ravi May 24, 2011 at 7:27 am

One such application I would like to point out, on foreign language learning but with Google Maps: http://bestfromgoogle.blogspot.com/2011/05/google-chrome-speech-recognition-mashup.html

Comments on this entry are closed.