From: Bruce J. Weimer, MD
Subject: Windows SAPI5 Speech Recognition
Date: 
Message-ID: <x-ydnaiYKO_-HyuiRVn-hg@mminternet.net>
I'm using Xanalys LispWorks for Windows Personal Edition in a PC-based robot
I'm building as a hobby.  I've interfaced Microsoft's SAPI5 speech SDK for
text-to-speech, but I'm having trouble getting the speech recognition part to
work.  I'm using their VBasic examples as an approximate guide to the
interface.  This is what I have so far:

(require "com")
(require "automation")

;;  defparameter requires an initial value; both are set further down.
(defparameter *sp-shared-reco-context* nil)
(defparameter *create-grammar* nil)

;;  From the SAPI5 manual:
;;  To create an ISpRecoContext for a shared ISpRecognizer,
;;  an application need only call COM's CoCreateInstance on
;;  the component CLSID_SpSharedRecoContext.

;;  VBasic code sample:
;;  Public WithEvents RC As SpSharedRecoContext
;;  Public myGrammar As ISpeechRecoGrammar

;;  Set RC = New SpSharedRecoContext
;;  Set myGrammar = RC.CreateGrammar(0)
;;  myGrammar.DictationLoad
;;  myGrammar.DictationSetState SGDSActive

;;  Then, ignoring "unwind-protect" for a moment:

(com:co-initialize)

(setf *sp-shared-reco-context*
      (com:create-instance "SAPI.SpSharedRecoContext"
                           :riid 'com:i-dispatch))

(setf *create-grammar*
      (com:simple-dispatch-method *sp-shared-reco-context* "CreateGrammar"))

(com:simple-dispatch-method *create-grammar* "DictationLoad")

(com:simple-dispatch-method *create-grammar* "DictationSetState" 1)

;;  And, with the above code, the microphone turns on and appears to be
;;  active. So far so good...



;;  However, this next VBasic line's got me - I don't know how to implement
;;  this:
;;  ISpeechRecoResult.PhraseInfo.GetText



;;  Anyway, then I release everything 'manually':

(com:release *create-grammar*)

(com:release *sp-shared-reco-context*)
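
;;  With "unwind-protect" put back in, I'd expect the whole thing to look
;;  roughly like the sketch below.  This is untested; it uses only the calls
;;  already shown above, and the names run-dictation, context and grammar
;;  are made up for illustration:

(defun run-dictation ()
  (let ((context nil)
        (grammar nil))
    (com:co-initialize)
    (unwind-protect
        (progn
          (setf context
                (com:create-instance "SAPI.SpSharedRecoContext"
                                     :riid 'com:i-dispatch))
          (setf grammar
                (com:simple-dispatch-method context "CreateGrammar"))
          (com:simple-dispatch-method grammar "DictationLoad")
          (com:simple-dispatch-method grammar "DictationSetState" 1)
          ;; The recognition handling I'm asking about would go here.
          )
      ;; Cleanup forms: these run even if something above signals an
      ;; error, and only release what was actually created.
      (when grammar (com:release grammar))
      (when context (com:release context)))))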

;;  The end


So how do I "ISpeechRecoResult.PhraseInfo.GetText" so that I can capture a
text string?  The following "Recognition Event" information comes from the
SAPI5 manual, but I don't know how to capture it:



Recognition Event


The Recognition event occurs when the speech recognition (SR) engine
produces a recognition.

This could be considered the most important event for speech recognition
because it returns the result of a successful recognition. A successful
recognition is a word or phrase that is matched in an open
grammar for that recognition context and whose quality of speech meets a
minimum confidence score. If either criterion is not met, the engine returns a
FalseRecognition event. Spoken content may not meet the confidence score for
several reasons including background interference, inarticulate speech or an
uncommon word or phrase.

The Result member contains the recognition result object, from which much of
the information about the speech may be derived.



SpeechRecoContext.Recognition(
     StreamNumber As Long,
     StreamPosition As Variant,
     RecognitionType As SpeechRecognitionType,
     Result As ISpeechRecoResult
)
Parameters
  StreamNumber
  Specifies the stream number owning the recognition.
  StreamPosition
  Specifies the position within the stream.
  RecognitionType
  A SpeechRecognitionType constant that specifies the RecognitionType or the
recognition state of the engine.
  Result
  An ISpeechRecoResult object containing the recognition results.


Anyway, any suggestions would be greatly appreciated.  In fact, I'd be
willing to pay (reasonably) for some consulting help on this...

Thanks in advance.

Bruce.