Monday, November 10, 2014

Guest Post: My Experiences With Voice Recognition Software

I'm pleased to bring you today a guest post from UK freelance writer John Craggs.

John is a long-standing member of my forum, where he is better known by his forum ID of Gyppo.

John's article was originally intended as a comment on my recent blog post about using Dragon Naturally Speaking. Unfortunately it proved too long for my blogging platform (Blogger) to handle.

I therefore asked John if he would mind me publishing it as a guest post instead, to which he graciously agreed...

* * *

Nick asked if anyone else has been using - or at least experimenting with - voice recognition software.  I am in the process of 'training' the voice recognition software which comes bundled with most versions of Windows 7.  I say most because although I know it comes with 7 Pro I've heard it doesn't necessarily come with the cheaper versions of this OS. But sometimes it does.

I became interested in voice recognition software a few years ago when arthritis locked up four of my fingers - two on each hand - into claws which made typing rather difficult.  I have nearly fifty years of typing in my fingers now, dating back to manual typewriters, so a certain amount of wear and tear is perhaps inevitable.  Fortunately my rheumatologist came up with some magic pills and exercises which have unlocked the treacherous digits again, but there are still times when one hand or the other gets a bit contrary.  So when I found voice recognition hidden away in W7 Pro I had to try it.

Even viewed as just a backup for the occasional bad day it's worth trying.

Overall my comments are pretty much the same as Nick's about using Dragon.  This is to point out some of the pitfalls and benefits I've discovered.

Once you have it trained - and I'll come back to this - it's quite good for entering reasonable volumes of text.  But you do have to proofread carefully.  This is a situation where the spell checker is almost useless, because every word the software enters is a proper word, no matter how misheard it may be.

All the usual suspects will be there: to, two, too, plus through, threw, four, for, and fore, and so on.  Rather surprisingly the software does begin to recognise the difference after you've used it for a while and manually corrected the wrong choices.  (That's right, it does learn your own vocal idiosyncrasies, and will detect subtle differences which you probably can't hear yourself, such as the troublesome tos.  This is because it looks at the word in context with the words either side.  Which is almost like magic.)
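For the technically curious, here is a minimal Python sketch of how choosing a word from its neighbours might work. It is an illustration only, not how the Windows software is actually built: the word-pair counts, the `PAIR_COUNTS` table and the `pick_homophone` function are all made up for the example.

```python
from collections import Counter

# Hypothetical counts of how often two words have been seen side by side.
# A real recogniser would learn figures like these from a great deal of
# text and from the user's own corrections; these numbers are invented.
PAIR_COUNTS = Counter({
    ("went", "to"): 12,
    ("to", "town"): 10,
    ("the", "two"): 8,
    ("two", "dogs"): 9,
    ("me", "too"): 7,
    ("too", "much"): 9,
})

HOMOPHONES = {"to", "two", "too"}


def pick_homophone(previous_word, next_word, candidates=HOMOPHONES):
    """Return the candidate that fits best between its two neighbours."""
    def score(word):
        # Pairs that were never seen simply count as zero.
        return PAIR_COUNTS[(previous_word, word)] + PAIR_COUNTS[(word, next_word)]

    # Sorting first means a tie is settled alphabetically, not at random.
    return max(sorted(candidates), key=score)


print(pick_homophone("went", "town"))  # fits "went ... town"
print(pick_homophone("the", "dogs"))   # fits "the ... dogs"
print(pick_homophone("me", "much"))    # fits "me ... much"
```

Each manual correction can be thought of as nudging counts like these, which is one plausible reason the wrong choices become rarer with use.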

Even straight out of the box it can be surprisingly accurate, but if you have something of a polyglot accent like mine, where some words have long vowel sounds and some short and harsh, it will take a bit longer to learn.  If you are consistently Northern or Southern English with your pronunciation the program will learn faster.  If you speak BBC English then you're more than halfway there, but very few of us do.

There are no shortcuts here.  You have to be willing to put the time in making the corrections.  I've been giving it ten to fifteen minutes a day and it already recognises some Cornish and Northern dialect words, and I've taught it some Romanes (Romany language) phrases as well.

A short daily practice session definitely earns its keep.


Now let's look at some of the limitations.  As Nick pointed out, it doesn't play nicely with every program, although perhaps a more computer-savvy user could work out why and make it behave.  But I suspect most writers, like me, still see their word processor more as a typewriter with a memory and a few other useful functions.  So digging into the programming corners is forbidden territory.

Just as Dragon works well with Microsoft's Word program, so does this one.  I use it in conjunction with the alternative Jarte processor, which uses the Word 'engine' anyway, so there are no major conflicts there.

I've used it sporadically for posts on the myWritersCircle forum, and it works after a fashion, but it chucks up odd spacing and spurious capitals which make my posts look like extracts from a Victorian Novel.

If my hands were really bad again I'd definitely use it and accept the need to tidy up manually with a 'hunt and peck' approach.  Anything is better than not being able to write or communicate.

During my experiments I found it fairly quick to use a hybrid approach, dictating via the microphone, but making on-the-fly corrections with the keyboard.  If you are a truly slow typist with serious dexterity problems this approach could be faster than your normal typing, but even a modest yet accurate thirty wpm will fill the pages faster.

I haven't attempted to use it for form-filling for online ordering.  I don't buy online that much, and accuracy is more important than speed when you're dishing out money.  Earlier this year I nearly bought three computers rather than one due to a form-filling error ;-)


Biggest problem when using programs other than Word or Jarte...  The dictation part works well once trained, but some of the spoken commands either just don't co-operate, or work sporadically.  The latter is far more annoying.

'New paragraph' always works.  But until it's trained the phrase 'nude photograph' will sometimes trigger a new paragraph ;-)

"Spell it" should open a window where you spell out the problem word one letter at a time, then say "okay" and watch it pop into place.  Sometimes the words 'spell it' will appear on the screen instead.

Very annoying.

"Delete last sentence" or "delete last paragraph" work as expected.

"Delete" followed by the word in question usually works well.  If there are several instances it will number them with a little superscript number in a box, and 'okay' by the last example.  You can then give the number or just say "okay" and the word will vanish.

But there's a booby trap if the word you want to delete is 'all'.  If, for example, your Hampshire accent has turned Town Hall into Town All.  You say "delete all" and the screen obligingly empties.  And then makes a surprisingly good attempt at the swear words you will probably say aloud ;-)  Fortunately "undo" will let you step back through the actions and get your text back.
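Here is a toy Python sketch of why this trap can arise and why "undo" rescues you. It assumes, purely for illustration, that fixed commands such as "delete all" are checked before the general "delete" plus word form. The `Dictation` class and its methods are hypothetical and say nothing about how the real software works inside.

```python
class Dictation:
    """Toy dictation buffer with a few spoken commands and an undo history."""

    def __init__(self, text=""):
        self.text = text
        self.history = []  # earlier versions of the text, newest last

    def _change(self, new_text):
        # Remember the current text so that "undo" can bring it back.
        self.history.append(self.text)
        self.text = new_text

    def hear(self, phrase):
        spoken = phrase.strip()
        command = spoken.lower()
        if command == "undo":
            if self.history:
                self.text = self.history.pop()
        elif command == "delete all":
            # The fixed command is checked first, so the word "all"
            # can never be picked out as the word to delete.
            self._change("")
        elif command.startswith("delete "):
            target = command[len("delete "):]
            words = self.text.split()
            # Remove the last matching word, ignoring capital letters.
            for i in range(len(words) - 1, -1, -1):
                if words[i].lower() == target:
                    del words[i]
                    self._change(" ".join(words))
                    break
        else:
            # Anything else is treated as ordinary dictation.
            self._change((self.text + " " + spoken).strip())


doc = Dictation("Meet me at the Town All")
doc.hear("delete all")
print(repr(doc.text))  # the screen has emptied
doc.hear("undo")
print(repr(doc.text))  # the text is back
```

In this sketch the only way to remove a stray "all" is by hand or by some other command, which matches the experience described above.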

Which brings us to another potential hiccup.  If you are in the habit of talking to yourself when you write - and many writers are - the software will type in the spare comments you put between sentences.  Or comments called through to another room, such as "with you in a minute".  Unless you first tell it to "stop listening".

It seems quite good at ignoring other voices in the room once it's been trained to recognise yours.  It's also good at ignoring coughs, but will attempt to translate sneezes into something which looks like a Polish or Central European name.

It's definitely worth printing out the list of commands.  The common ones you'll soon learn, but the more unusual ones will sometimes elude you in mid-dictation.

Summary:  Voice recognition software will make you laugh and it will make you cry.  But if you see a real need for it then it's worth the effort of training it for ordinary typing at least.  It may not be faster, but the hybrid approach I mentioned could be a life-saver if you suffer from sporadic joint pain.

Hopefully for me it will always be a fall-back option rather than my only method of text entry, but I'm happy to spend those few minutes a day to keep it familiar.  Better to learn now than after those spiky little crystals in my knuckles leave me no other option.

If you decide to try it for yourself I suggest you treat it light-heartedly at first - almost as a game - and it will be a lot less frustrating.

* * *

Thank you to John for sharing his experiences. They actually parallel mine quite closely, even down to what happens when I sneeze!

If you have any comments or questions about this subject, as ever, please do post them below.

Ruth Barringham said...

An interesting and funny post. I've been practicing with my Mac's "Dictate" function so I can relate.

I like the "delete all" emptying the screen after saying "Town Hall" with an accent and trying to correct it. Hilarious.

9:40 PM  
Nick said...

Thanks, Ruth. Glad you enjoyed John's post.

12:34 PM  

