Class: SpeechToTextV1

new SpeechToTextV1(options)

Speech Recognition API Wrapper
Parameters:
Name Type Description
options
Source:

Methods

addCorpus(params, callback)

Add a corpus to a custom model

Adds a single corpus text file of new training data to the custom language model. Submit a plain text file that contains sample sentences from the domain of interest to enable the service to extract words in context. The more sentences you add that represent the context in which speakers use words from the domain, the better the service's recognition accuracy. Adding a corpus does not affect the custom model until you train the model for the new data by using the Train a custom model method.

Use the following guidelines to prepare a corpus text file:
- Provide a plain text file that is encoded in UTF-8 if it contains non-ASCII characters. The service assumes UTF-8 encoding if it encounters such characters.
- Include each sentence of the corpus on its own line, terminating each line with a carriage return. Including multiple sentences on the same line can degrade accuracy.
- Use consistent capitalization for words in the corpus. The words resource is case-sensitive; mix upper- and lowercase letters and use capitalization only when intended.
- Beware of typographical errors. The service assumes that typos are new words; unless you correct them before training the model, the service adds them to the model's vocabulary.

The service automatically does the following:
- Converts numbers to their equivalent words. For example, 500 becomes five hundred, and 0.15 becomes zero point fifteen.
- Removes punctuation and special characters.
- Ignores phrases enclosed in ( ) (parentheses), < > (angle brackets), [ ] (square brackets), and { } (curly braces).
- Converts tokens that include certain symbols to meaningful strings. For example, the service converts a $ (dollar sign) followed by a number to its string representation: $100 becomes one hundred dollars.
Parameters:
Name Type Description
params Object
Properties
Name Type Attributes Description
customization_id String The GUID of the custom language model to which a corpus is to be added. You must make the request with the service credentials of the model's owner.
corpus String | Buffer | ReadStream <optional>
the text of the corpus - may be provided as a String, a Buffer, or a ReadableStream. A ReadableStream is recommended when reading a file from disk.
callback function
Source:
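
The corpus guidelines above can be applied before the text is uploaded. The following is a minimal sketch, not part of the SDK: `prepareCorpus` is a hypothetical helper, it assumes the input has already been split into sentences, and it uses '\n' as the line terminator (an assumption about what the service accepts).

```javascript
// Hypothetical helper (not part of the SDK): builds corpus text with one
// sentence per line, following the guidelines described above.
function prepareCorpus(sentences) {
  const lines = sentences
    // Collapse runs of whitespace (including stray line breaks) to one space.
    .map(function (sentence) { return sentence.replace(/\s+/g, ' ').trim(); })
    // Drop entries that are empty after trimming.
    .filter(function (sentence) { return sentence.length > 0; });
  // Terminate every line, including the last one.
  return lines.length > 0 ? lines.join('\n') + '\n' : '';
}
```

The resulting string could then be passed as the `corpus` property, together with `customization_id`, in the params given to addCorpus(params, callback).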

addWord(params, callback)

Add a single custom word
Parameters:
Name Type Description
params Object
Properties
Name Type Attributes Description
customization_id String
word String
sounds_like Array.<String>
display_as String <optional>
callback function
Source:

addWords(params, callback)

Add multiple custom words

Adds one or more custom words to a custom language model. The service populates the words resource for a custom model with out-of-vocabulary (OOV) words found in each corpus added to the model. You can use this method to add additional words or to modify existing words in the words resource. Adding or modifying custom words does not affect the custom model until you train the model for the new data by using the Train a custom model method.

You add custom words by providing a Words object, which is an array of Word objects, one per word. You must use the object's word parameter to identify the word that is to be added. You can also provide one or both of the following optional fields for each word:

The sounds_like field provides an array of one or more pronunciations for the word. Use the parameter to specify how the word can be pronounced by users.
- Use the parameter for words that are difficult to pronounce, foreign words, acronyms, and so on.
- For example, you might specify that the word IEEE can sound like i triple e.
- You can specify a maximum of five sounds-like pronunciations for a word, and each pronunciation must adhere to the following rules:
  - Use English alphabetic characters: a-z and A-Z.
  - To pronounce a single letter, use the letter followed by a period, for example, N. C. A. A. for the word NCAA.
  - Use real or made-up words that are pronounceable in the native language, for example, shuchensnie for the word Sczcesny.
  - Substitute equivalent English letters for non-English letters, for example, s for ç or ny for ñ.
  - Substitute non-accented letters for accented letters, for example a for à or e for è.
  - Use the spelling of numbers, for example, seventy-five for 75.
  - You can include multiple words separated by spaces, but the service enforces a maximum of 40 total characters not including spaces.

The display_as field provides an optional different way of spelling the word in a transcript. Use the parameter when you want the word to appear different from its usual representation or from its spelling in corpora training data. For example, you might indicate that the word IBM(trademark) is to be displayed as IBM™.

If you add a custom word that already exists in the words resource for the custom model, the new definition overrides the existing data for the word. If the service encounters an error with the input data, it returns a failure code and does not add any of the words to the words resource.

The call returns an HTTP 201 response code if the input data is valid. It then asynchronously pre-processes the words to add them to the model's words resource. The time that it takes for the analysis to complete depends on the number of new words that you add but is generally faster than adding a corpus or training a model.

You can use the List custom words or List a custom word method to review the words that you add. Words with an invalid sounds_like field include an error field that describes the problem. You can use other words methods to correct errors, eliminate typos, and modify how words are pronounced as needed.
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
words Array.<Word> Array of objects: [{word: String, sounds_like: [String, ...], display_as: String}, ...]
callback function
Source:
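
Some of the sounds_like rules above can be checked locally before the request is sent. This is a rough sketch under stated assumptions: `checkSoundsLike` is a hypothetical helper, not an SDK function, and it covers only three of the rules (at most five pronunciations, no digits, at most 40 characters excluding spaces). The service remains the authority on what is valid.

```javascript
// Hypothetical pre-flight check (not part of the SDK) for a subset of the
// sounds_like rules. Returns a list of human-readable problems.
function checkSoundsLike(soundsLike) {
  const problems = [];
  if (soundsLike.length > 5) {
    problems.push('more than five pronunciations');
  }
  soundsLike.forEach(function (pronunciation) {
    // Numbers must be spelled out, for example "seventy-five" for 75.
    if (/[0-9]/.test(pronunciation)) {
      problems.push('"' + pronunciation + '" contains digits');
    }
    // The 40-character limit does not count spaces.
    if (pronunciation.replace(/ /g, '').length > 40) {
      problems.push('"' + pronunciation + '" exceeds 40 characters');
    }
  });
  return problems;
}

// A words array in the shape described in the parameter table.
const words = [
  { word: 'IEEE', sounds_like: ['i triple e'] },
  { word: 'NCAA', sounds_like: ['N. C. A. A.'], display_as: 'NCAA' }
];

// Keep only the entries that fail the local check.
const invalid = words.filter(function (entry) {
  return checkSoundsLike(entry.sounds_like).length > 0;
});
```

If `invalid` is empty, `words` could be passed along with `customization_id` to addWords(params, callback).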

createCustomization(params, callback)

Creates a new, empty custom language model. The response looks like: { "customization_id": "abc996ea-86ca-482e-b7ec-0f31c34e5ee9" }
Parameters:
Name Type Description
params Object
Properties
Name Type Attributes Description
name String
base_model_name String for example, en-US_BroadbandModel
description String <optional>
callback function
Source:
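
A call might look like the sketch below. It assumes `speechToText` is an already-constructed SpeechToTextV1 instance and that the callback follows the usual Node.js (err, response) convention, which this page does not spell out. The model name and description are placeholder values, and `createModel` is a hypothetical wrapper.

```javascript
// Hypothetical wrapper: creates a custom model and hands back its ID.
// `speechToText` is assumed to be a SpeechToTextV1 instance.
function createModel(speechToText, callback) {
  const params = {
    name: 'my-custom-model',                  // placeholder name
    base_model_name: 'en-US_BroadbandModel',  // example from the table above
    description: 'Example custom model'       // optional
  };
  speechToText.createCustomization(params, function (err, response) {
    if (err) {
      return callback(err);
    }
    // The response carries the ID used by the other customization methods.
    callback(null, response.customization_id);
  });
}
```

The returned customization_id is the value that the corpus, word, and training methods expect.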

createRecognizeStream(params) → {RecognizeStream}

Replaces recognizeLive & friends with a single 2-way stream over websockets
Parameters:
Name Type Description
params
Source:
Returns:
Type
RecognizeStream
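
Because the returned object is a two-way stream, audio can be piped in while text is read out. The sketch below rests on several assumptions not stated on this page: that RecognizeStream behaves like a standard Node.js duplex stream (`pipe`, `setEncoding`, 'data' and 'error' events), and that `content_type` is an accepted parameter name (borrowed from the recognize methods). `transcribe` is a hypothetical helper.

```javascript
// Hypothetical helper: pipes an audio stream into the recognizer and
// forwards transcribed text to `onText`.
// `speechToText` is assumed to be a SpeechToTextV1 instance and `audio`
// a readable stream of audio data.
function transcribe(speechToText, audio, contentType, onText, onError) {
  const recognizeStream = speechToText.createRecognizeStream({
    content_type: contentType // assumed parameter name
  });
  audio.pipe(recognizeStream);          // audio goes in
  recognizeStream.setEncoding('utf8');  // text comes out as strings
  recognizeStream.on('data', onText);
  recognizeStream.on('error', onError);
  return recognizeStream;
}
```

For a file on disk, `audio` could be the result of `fs.createReadStream()` on a hypothetical audio file, with `console.log` and `console.error` as the two handlers.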

createSession(string)

Create a session. The response includes a Set-Cookie header with a cookie that must be used for each request using this session. The session expires after 15 minutes of inactivity.
Parameters:
Name Type Description
model String The model to use during the session
Source:

deleteCorpus(params, callback)

Delete a corpus
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
callback function
Source:

deleteCustomization(params, callback)

Delete a custom model
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
callback function
Source:

deleteSession()

Deletes the specified session.
Parameters:
Name Type Attributes Description
params.session_id String <optional>
Session id.
Source:

deleteWord(params, callback)

Delete a custom word

Deletes a custom word from a custom language model. You can remove any word that you added to the custom model's words resource via any means. However, if the word also exists in the service's base vocabulary, the service removes only the custom pronunciation for the word; the word remains in the base vocabulary. Removing a custom word does not affect the custom model until you train the model with the Train a custom model method.
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
word String
callback function
Source:

getCorpora(params, callback)

List corpora

Lists information about all corpora that have been added to the specified custom language model. The information includes the total number of words and out-of-vocabulary (OOV) words, name, and status of each corpus.

Example result:

    {
      corpora: [
        { out_of_vocabulary_words: 0, total_words: 233, name: 'test_corpus_1', status: 'analyzed' },
        { out_of_vocabulary_words: 0, total_words: 0, name: 'test_corpus_2', status: 'being_processed' }
      ]
    }
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
callback function
Source:
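
The status field in this result can be used to decide whether every corpus has finished processing. A minimal sketch, assuming the result has the shape shown in the example above; `allCorporaAnalyzed` is a hypothetical helper, not part of the SDK.

```javascript
// Hypothetical helper: true only when every corpus reports 'analyzed'.
// Note that an empty corpora list also yields true.
function allCorporaAnalyzed(result) {
  return result.corpora.every(function (corpus) {
    return corpus.status === 'analyzed';
  });
}

// The example result from the description above.
const exampleResult = {
  corpora: [
    { out_of_vocabulary_words: 0, total_words: 233, name: 'test_corpus_1', status: 'analyzed' },
    { out_of_vocabulary_words: 0, total_words: 0, name: 'test_corpus_2', status: 'being_processed' }
  ]
};

// test_corpus_2 is still being processed, so this is false.
const readyToTrain = allCorporaAnalyzed(exampleResult);
```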

getCustomization(params, callback)

Get customization details

Example response:

    {
      owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
      base_model_name: 'en-US_BroadbandModel',
      customization_id: 'e695ad30-97c1-11e6-be92-bb627d4684b9',
      created: '2016-10-21T19:09:33.443Z',
      name: 'js-sdk-test-temporary',
      description: 'Temporary customization to test the JS SDK. Should be automatically deleted within a few minutes.',
      progress: 0,
      language: 'en-US',
      status: 'pending'
    }
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
callback function
Source:

getCustomizations(paramsopt, callback)

List all customizations

Example response:

    {
      customizations: [
        {
          owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
          base_model_name: 'en-US_BroadbandModel',
          customization_id: '6a7785a0-9665-11e6-a73a-0da9193a4475',
          created: '2016-10-20T01:35:00.346Z',
          name: 'IEEE-test',
          description: '',
          progress: 0,
          language: 'en-US',
          status: 'pending'
        },
        {
          owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
          base_model_name: 'en-US_BroadbandModel',
          customization_id: '9e2f6bb0-9665-11e6-a73a-0da9193a4475',
          created: '2016-10-20T01:36:27.115Z',
          name: 'IEEE-test',
          description: '',
          progress: 0,
          language: 'en-US',
          status: 'ready'
        },
        {
          owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
          base_model_name: 'en-US_BroadbandModel',
          customization_id: '6b194e70-9666-11e6-a73a-0da9193a4475',
          created: '2016-10-20T01:42:10.903Z',
          name: 'IEEE-test',
          description: '',
          progress: 100,
          language: 'en-US',
          status: 'available'
        }
      ]
    }
Parameters:
Name Type Attributes Description
params Object <optional>
Properties
Name Type Attributes Description
language String <optional>
optional filter. Currently only en-US is supported.
callback function
Source:

getModel()

Get information about a model based on the given model_id
Parameters:
Name Type Attributes Description
params.model_id String <optional>
The desired model
Source:

getModels()

List of models available.
Source:

getRecognizeStatus()

Get the state of the engine to check if recognize is available. This is the way to check if the session is ready to accept a new recognition task. The returned state has to be 'initialized' to be able to do recognize POST.
Parameters:
Name Type Attributes Description
params.session_id String <optional>
Session used in the recognition.
Deprecated:
  • use createRecognizeStream instead
Source:

getWord(params, callback)

Get a custom word

Lists information about a custom word from a custom language model.

Example output:

    {
      "sounds_like": ["N. C. A. A.","N. C. double A."],
      "display_as": "NCAA",
      "source": ["corpus3","user"]
    }
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
word String
callback function
Source:

getWords(params, callback)

List all custom words

Lists information about all custom words from a custom language model. You can list all words from the custom model's words resource, only custom words that were added or modified by the user, or only OOV words that were extracted from corpora.

Example response:

    {
      "words": [
        { "word": "hhonors", "sounds_like": ["hilton honors","h honors"], "display_as": "HHonors", "source": ["corpus1"] },
        { "word": "ieee", "sounds_like": ["i triple e"], "display_as": "IEEE", "source": ["corpus1","corpus2"] },
        { "word": "tomato", "sounds_like": ["tomatoh","tomayto"], "display_as": "", "source": ["user"] },
        { "word": "$75.00", "sounds_like": ["75 dollars"], "display_as": "", "source": ["user"], "error":" Numbers are not allowed in sounds-like" }
      ]
    }
Parameters:
Name Type Description
params Object
Properties
Name Type Attributes Default Description
customization_id String
word_type String <optional>
all all|user|corpora - user shows only custom words that were added or modified by the user; corpora shows only OOV that were extracted from corpora.
callback function
Source:
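
Words that the service rejected carry an error field, as the last entry of the example response shows. The following sketch pulls those entries out of a getWords result so they can be corrected; `wordsWithErrors` is a hypothetical helper, and the result shape is assumed to match the example response.

```javascript
// Hypothetical helper: lists the words whose entry includes an error field.
function wordsWithErrors(result) {
  return result.words
    .filter(function (entry) { return typeof entry.error === 'string'; })
    .map(function (entry) {
      return { word: entry.word, error: entry.error.trim() };
    });
}

// Two entries from the example response above.
const sample = {
  words: [
    { word: 'ieee', sounds_like: ['i triple e'], display_as: 'IEEE', source: ['corpus1', 'corpus2'] },
    { word: '$75.00', sounds_like: ['75 dollars'], display_as: '', source: ['user'], error: ' Numbers are not allowed in sounds-like' }
  ]
};

// Contains a single entry, for the word '$75.00'.
const failures = wordsWithErrors(sample);
```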

observeResult()

Result observer for upcoming or ongoing recognition task in the session. This request has to be started before POST on recognize finishes, otherwise it waits for the next recognition.
Parameters:
Name Type Attributes Description
params.session_id String <optional>
Session used in the recognition.
params.interim_results boolean <optional>
If true, interim results will be returned. Default: false.
Deprecated:
  • use createRecognizeStream instead
Source:

recognize(audioopt, content_typeopt)

Speech recognition for given audio using default model.
Parameters:
Name Type Attributes Description
audio Audio <optional>
Audio to be recognized.
content_type String <optional>
Content-type
Source:

recognizeLive(content_typeopt, session_idopt)

Creates an HTTP/HTTPS request to /recognize and keeps the connection open. Sets 'Transfer-Encoding': 'chunked' and prepares the connection to send chunked data.
Parameters:
Name Type Attributes Description
content_type String <optional>
The Content-type e.g. audio/l16; rate=48000
session_id String <optional>
The session id
Deprecated:
  • use createRecognizeStream instead
Source:

resetCustomization(params, callback)

Reset a custom model

Resets a custom language model by removing all corpora and words from the model. Resetting a custom model initializes the model to its state when it was first created. Metadata such as the name and language of the model are preserved. Only the owner of a custom model can use this method to reset the model.
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
callback function
Source:

trainCustomization(params, callback)

Train a custom model

Initiates the training of a custom language model with new corpora, words, or both. After adding training data to the custom model with the corpora or words methods, use this method to begin the actual training of the model on the new data. You can specify whether the custom model is to be trained with all words from its words resource or only with words that were added or modified by the user. Only the owner of a custom model can use this method to train the model.

Pre-processing of words and corpora must be complete before initiating training. Use the whenCustomizationReady() method to be notified once pre-processing has completed.

Training can take on the order of minutes to complete, depending on the amount of data on which the service is being trained and the current load on the service. This method triggers the callback as soon as the training process has begun. Use the whenCustomizationReady() method again to be notified once training has completed.
Parameters:
Name Type Description
params Object
Properties
Name Type Attributes Default Description
customization_id String
word_type_to_add String <optional>
all set to 'user' to train the model only on new words that were added or modified by the user; the model is not trained on new words extracted from corpora.
callback function
Source:
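
The wait, train, wait sequence described above can be sketched as follows. This assumes `speechToText` is a SpeechToTextV1 instance, that callbacks follow the Node.js (err, result) convention, and that whenCustomizationReady accepts a callback as its second argument (the signature listed on this page shows only params, so that is an assumption). `trainAndWait` is a hypothetical helper.

```javascript
// Hypothetical helper: waits for pre-processing, starts training, then
// waits for training to finish before calling back.
function trainAndWait(speechToText, customizationId, callback) {
  const params = { customization_id: customizationId };

  // 1. Wait until corpora and words have been pre-processed.
  speechToText.whenCustomizationReady(params, function (err) {
    if (err) {
      return callback(err);
    }
    // 2. Start training. This callback fires once training has begun.
    speechToText.trainCustomization(params, function (trainErr) {
      if (trainErr) {
        return callback(trainErr);
      }
      // 3. Wait again, this time for training to complete.
      speechToText.whenCustomizationReady(params, callback);
    });
  });
}
```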

updateCustomization(params, callback)

Update voice model

Updates information for the specified custom voice model. You can update the metadata such as the name and description of the voice model. You can also update the words in the model and their translations. A custom model can contain no more than 20,000 entries. Only the owner of a custom voice model can use this method to update the model.

An example of params.words could be: [ {"word":"NCAA", "translation":"N C double A"}, {"word":"iPhone", "translation":"I phone"} ]
Parameters:
Name Type Description
params Object
Properties
Name Type Attributes Description
customization_id String
name String <optional>
description String <optional>
words Array.<Word> Array of {word, translation} objects where translation is the phonetic or sounds-like translation for the word. A phonetic translation is based on the SSML format for representing the phonetic string of a word either as an IPA or IBM SPR translation. A sounds-like translation consists of one or more words that, when combined, sound like the word.
callback function
Source:

upgradeCustomization(params, callback)

Upgrade a custom model

Upgrades a custom language model to the latest release level of the Speech to Text service. The method bases the upgrade on the latest trained data stored for the custom model. If the corpora or words for the model have changed since the model was last trained, you must use the Train a custom model method to train the model on the new data. Only the owner of a custom model can use this method to upgrade the model.

Note: This method is not currently implemented. It will be added for a future release of the API.
Parameters:
Name Type Description
params Object
Properties
Name Type Description
customization_id String
callback function
Source:

whenCustomizationReady(params)

Waits while a customization's status is 'pending' or 'training', then fires the callback once the status is 'ready' or 'available'. Note: the customization remains in 'pending' status until at least one corpus is added. Calling this on a customization with no corpora results in an error. See http://www.ibm.com/watson/developercloud/speech-to-text/api/v1/#list_models for status details.
Parameters:
Name Type Description
params Object
Properties
Name Type Attributes Default Description
customization_id String
interval Number <optional>
5000 (milliseconds) - how long to wait between status checks
times Number <optional>
30 maximum number of attempts
Source:
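
The polling behavior described above can be illustrated with a small standalone sketch. This is not the SDK's implementation: `pollUntilReady`, `getStatus`, and the `schedule` option are hypothetical names introduced for illustration, while the `interval` and `times` defaults mirror the table above.

```javascript
// Illustrative polling loop (not the SDK's code). `getStatus` is a
// hypothetical function that calls back with (err, status).
function pollUntilReady(getStatus, options, callback) {
  const interval = options.interval || 5000;        // milliseconds between checks
  const times = options.times || 30;                // maximum number of attempts
  const schedule = options.schedule || setTimeout;  // injectable for testing
  let attempts = 0;

  function check() {
    attempts += 1;
    getStatus(function (err, status) {
      if (err) {
        return callback(err);
      }
      if (status === 'ready' || status === 'available') {
        return callback(null, status);
      }
      if (status !== 'pending' && status !== 'training') {
        return callback(new Error('Unexpected status: ' + status));
      }
      if (attempts >= times) {
        return callback(new Error('Gave up after ' + attempts + ' attempts'));
      }
      // Still pending or training: check again after the interval.
      schedule(check, interval);
    });
  }

  check();
}
```

In real use, `getStatus` could wrap getCustomization(params, callback) and pass along the status field of the response.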