new SpeechToTextV1(options)
Speech Recognition API Wrapper
Parameters:
| Name | Type | Description |
| options | | |
Methods
addCorpus(params, callback)
Add a corpus to a custom model
Adds a single corpus text file of new training data to the custom language model.
Submit a plain text file that contains sample sentences from the domain of interest to enable the service to extract words in context. The more sentences you add that represent the context in which speakers use words from the domain, the better the service's recognition accuracy. Adding a corpus does not affect the custom model until you train the model for the new data by using the Train a custom model method.
Use the following guidelines to prepare a corpus text file:
- Provide a plain text file that is encoded in UTF-8 if it contains non-ASCII characters. The service assumes UTF-8 encoding if it encounters such characters.
- Include each sentence of the corpus on its own line, terminating each line with a carriage return. Including multiple sentences on the same line can degrade accuracy.
- Use consistent capitalization for words in the corpus. The words resource is case-sensitive; mix upper- and lowercase letters and use capitalization only when intended.
- Beware of typographical errors. The service assumes that typos are new words; unless you correct them before training the model, the service adds them to the model's vocabulary.
The service automatically does the following:
- Converts numbers to their equivalent words. For example, 500 becomes five hundred, and 0.15 becomes zero point fifteen.
- Removes punctuation and special characters.
- Ignores phrases enclosed in ( ) (parentheses), < > (angle brackets), [ ] (square brackets), and { } (curly braces).
- Converts tokens that include certain symbols to meaningful strings. For example, the service converts a $ (dollar sign) followed by a number to its string representation, so $100 becomes one hundred dollars.
Parameters:
| Name | Type | Attributes | Description |
| params | Object | | |
| params.customization_id | String | | The GUID of the custom language model to which a corpus is to be added. You must make the request with the service credentials of the model's owner. |
| params.corpus | String, Buffer, or ReadStream | <optional> | The text of the corpus. May be provided as a String, a Buffer, or a ReadableStream. A ReadableStream is recommended when reading a file from disk. |
| callback | function | | |
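A minimal sketch of submitting a corpus file from disk. It assumes `speechToText` is an instance created with `new SpeechToTextV1(options)` and that callbacks follow the Node `(err, result)` convention, which this page does not state. The helper name `addCorpusFromFile` and the file path are hypothetical; only the `customization_id` and `corpus` parameters documented above are used.

```javascript
const fs = require('fs');

// Hypothetical helper: streams a plain-text corpus file to the service.
// `speechToText` is assumed to be a SpeechToTextV1 instance.
function addCorpusFromFile(speechToText, customizationId, filePath, callback) {
  const params = {
    customization_id: customizationId,
    // A ReadStream is the recommended form when the corpus lives on disk.
    corpus: fs.createReadStream(filePath)
  };
  speechToText.addCorpus(params, callback);
}
```

A call might look like `addCorpusFromFile(speechToText, id, './corpus.txt', done)`. The new corpus does not affect recognition until the model is trained with trainCustomization.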
addWord(params, callback)
Add a single custom word
Parameters:
| Name | Type | Attributes | Description |
| params | Object | | |
| params.customization_id | String | | |
| params.word | String | | |
| params.sounds_like | Array.<String> | | |
| params.display_as | String | <optional> | |
| callback | function | | |
addWords(params, callback)
Add multiple custom words
Adds one or more custom words to a custom language model.
The service populates the words resource for a custom model with out-of-vocabulary (OOV) words found in each corpus added to the model.
You can use this method to add additional words or to modify existing words in the words resource.
Adding or modifying custom words does not affect the custom model until you train the model for the new data by using the Train a custom model method.
You add custom words by providing a Words object, which is an array of Word objects, one per word. You must use the object's word parameter to identify the word that is to be added. You can also provide one or both of the following optional fields for each word:
- The sounds_like field provides an array of one or more pronunciations for the word. Use the parameter to specify how the word can be pronounced by users.
  - Use the parameter for words that are difficult to pronounce, foreign words, acronyms, and so on.
  - For example, you might specify that the word IEEE can sound like i triple e.
  - You can specify a maximum of five sounds-like pronunciations for a word, and each pronunciation must adhere to the following rules:
    - Use English alphabetic characters: a-z and A-Z.
    - To pronounce a single letter, use the letter followed by a period, for example, N. C. A. A. for the word NCAA.
    - Use real or made-up words that are pronounceable in the native language, for example, shuchensnie for the word Sczcesny.
    - Substitute equivalent English letters for non-English letters, for example, s for ç or ny for ñ.
    - Substitute non-accented letters for accented letters, for example a for à or e for è.
    - Use the spelling of numbers, for example, seventy-five for 75.
    - You can include multiple words separated by spaces, but the service enforces a maximum of 40 total characters not including spaces.
- The display_as field provides an optional different way of spelling the word in a transcript.
  - Use the parameter when you want the word to appear different from its usual representation or from its spelling in corpora training data.
  - For example, you might indicate that the word IBM(trademark) is to be displayed as IBM™.
If you add a custom word that already exists in the words resource for the custom model, the new definition overrides the existing data for the word.
If the service encounters an error with the input data, it returns a failure code and does not add any of the words to the words resource.
The call returns an HTTP 201 response code if the input data is valid.
It then asynchronously pre-processes the words to add them to the model's words resource.
The time that it takes for the analysis to complete depends on the number of new words that you add but is generally faster than adding a corpus or training a model.
You can use the List custom words or List a custom word method to review the words that you add. Words with an invalid sounds_like field include an error field that describes the problem.
You can use other words methods to correct errors, eliminate typos, and modify how words are pronounced as needed.
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| params.words | Array.<Word> | Array of objects: [{word: String, sounds_like: [String, ...], display_as: String}, ...] |
| callback | function | |
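Because the service rejects the whole request when any input is invalid, it can help to check words locally first. The sketch below covers only a subset of the sounds_like rules described above (at most five pronunciations, at most 40 characters not counting spaces, no digits); the service remains the authority. The helper names `checkWord` and `addCheckedWords` are hypothetical, and the `(err, result)` callback shape is an assumption.

```javascript
// Checks one Word object against some of the sounds_like limits described above.
// Returns an array of problem descriptions (empty when none are found).
function checkWord(entry) {
  const problems = [];
  if (typeof entry.word !== 'string' || entry.word.length === 0) {
    problems.push('word is required');
  }
  const soundsLike = entry.sounds_like || [];
  if (soundsLike.length > 5) {
    problems.push('more than five sounds_like pronunciations');
  }
  soundsLike.forEach((pronunciation) => {
    // The 40-character limit does not count spaces.
    if (pronunciation.replace(/\s/g, '').length > 40) {
      problems.push('"' + pronunciation + '" exceeds 40 characters');
    }
    // Numbers must be spelled out, for example seventy-five for 75.
    if (/[0-9]/.test(pronunciation)) {
      problems.push('"' + pronunciation + '" contains digits');
    }
  });
  return problems;
}

// Hypothetical helper: validates locally, then submits all words in one call.
// `speechToText` is assumed to be a SpeechToTextV1 instance.
function addCheckedWords(speechToText, customizationId, words, callback) {
  const invalid = words.filter((entry) => checkWord(entry).length > 0);
  if (invalid.length > 0) {
    callback(new Error('invalid words: ' + invalid.map((e) => e.word).join(', ')));
    return;
  }
  speechToText.addWords({ customization_id: customizationId, words: words }, callback);
}
```

For instance, `{word: 'IEEE', sounds_like: ['i triple e'], display_as: 'IEEE'}` passes the check, while a pronunciation such as `'75 dollars'` is flagged before any request is made.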
createCustomization(params, callback)
Creates a new, empty custom language model
Response looks like:
{
  "customization_id": "abc996ea-86ca-482e-b7ec-0f31c34e5ee9"
}
Parameters:
| Name | Type | Attributes | Description |
| params | Object | | |
| params.name | String | | |
| params.base_model_name | String | | for example, en-US_BroadbandModel |
| params.description | String | <optional> | |
| callback | function | | |
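A short sketch of creating a model and extracting its identifier from the response shown above. It assumes `speechToText` is a SpeechToTextV1 instance and that the callback receives `(err, response)`; the helper name `createModel` is hypothetical.

```javascript
// Hypothetical helper: creates an empty custom model and yields its customization_id.
function createModel(speechToText, name, description, callback) {
  const params = {
    name: name,
    base_model_name: 'en-US_BroadbandModel',
    description: description // optional
  };
  speechToText.createCustomization(params, (err, response) => {
    if (err) {
      callback(err);
      return;
    }
    // The response carries the GUID needed by the other customization methods.
    callback(null, response.customization_id);
  });
}
```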
createRecognizeStream(params) → {RecognizeStream}
Replaces recognizeLive & friends with a single 2-way stream over websockets
Parameters:
| Name | Type | Description |
| params | | |
Returns:
Type: RecognizeStream
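A hedged sketch of using the two-way stream: audio is piped in and transcription text is read out. The accepted `params` are not listed on this page, so the `content_type` key mentioned below is an assumption, as is the premise that RecognizeStream behaves like a standard Node stream emitting `data`, `error`, and `end` events. The helper name `transcribeStream` is hypothetical.

```javascript
// Hypothetical helper: pipes audio into the recognize stream and collects its output.
// `speechToText` is assumed to be a SpeechToTextV1 instance.
function transcribeStream(speechToText, audioStream, params, callback) {
  const recognizeStream = speechToText.createRecognizeStream(params);
  const chunks = [];
  let finished = false;

  // Guard so the callback fires at most once, whichever event arrives first.
  function finish(err, text) {
    if (!finished) {
      finished = true;
      callback(err, text);
    }
  }

  recognizeStream.on('data', (chunk) => chunks.push(chunk.toString()));
  recognizeStream.on('error', (err) => finish(err));
  recognizeStream.on('end', () => finish(null, chunks.join('')));
  audioStream.pipe(recognizeStream);
}
```

With a file on disk this might be called as `transcribeStream(speechToText, fs.createReadStream('audio.wav'), { content_type: 'audio/l16; rate=48000' }, done)`, where the file name and the parameter value are placeholders.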
createSession(string)
Create a session
The response includes a Set-Cookie header whose cookie must be sent with each request that uses this session.
The session expires after 15 minutes of inactivity.
Parameters:
| Name | Type | Description |
| string | | model The model to use during the session |
deleteCorpus(params, callback)
Delete a corpus
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| callback | function | |
deleteCustomization(params, callback)
Delete a custom model
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| callback | function | |
deleteSession()
Deletes the specified session.
Parameters:
| Name | Type | Attributes | Description |
| params.session_id | String | <optional> | Session id. |
deleteWord(params, callback)
Delete a custom word
Deletes a custom word from a custom language model.
You can remove any word that you added to the custom model's words resource via any means.
However, if the word also exists in the service's base vocabulary, the service removes only the custom pronunciation for the word; the word remains in the base vocabulary.
Removing a custom word does not affect the custom model until you train the model with the Train a custom model method.
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| params.word | String | |
| callback | function | |
getCorpora(params, callback)
List corpora
Lists information about all corpora that have been added to the specified custom language model.
The information includes the total number of words and out-of-vocabulary (OOV) words, name, and status of each corpus.
Example Result:
{ corpora:
   [ { out_of_vocabulary_words: 0,
       total_words: 233,
       name: 'test_corpus_1',
       status: 'analyzed' },
     { out_of_vocabulary_words: 0,
       total_words: 0,
       name: 'test_corpus_2',
       status: 'being_processed' } ] }
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| callback | function | |
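The result can be inspected to decide whether every corpus has finished processing. The sketch below works on the example result shown above and relies only on the two status values that appear there (`analyzed` and `being_processed`); other status values may exist. The helper names are hypothetical.

```javascript
// Returns true when every corpus in a getCorpora result has finished analysis.
function allCorporaAnalyzed(result) {
  return result.corpora.every((corpus) => corpus.status === 'analyzed');
}

// Sums the out-of-vocabulary words reported across all corpora.
function totalOovWords(result) {
  return result.corpora.reduce((sum, corpus) => sum + corpus.out_of_vocabulary_words, 0);
}

// The example result from this section.
const exampleCorpora = {
  corpora: [
    { out_of_vocabulary_words: 0, total_words: 233, name: 'test_corpus_1', status: 'analyzed' },
    { out_of_vocabulary_words: 0, total_words: 0, name: 'test_corpus_2', status: 'being_processed' }
  ]
};

console.log(allCorporaAnalyzed(exampleCorpora)); // false: test_corpus_2 is still being processed
console.log(totalOovWords(exampleCorpora)); // 0
```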
getCustomization(params, callback)
Get customization details
Example response:
{ owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
  base_model_name: 'en-US_BroadbandModel',
  customization_id: 'e695ad30-97c1-11e6-be92-bb627d4684b9',
  created: '2016-10-21T19:09:33.443Z',
  name: 'js-sdk-test-temporary',
  description: 'Temporary customization to test the JS SDK. Should be automatically deleted within a few minutes.',
  progress: 0,
  language: 'en-US',
  status: 'pending' }
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| callback | function | |
getCustomizations(paramsopt, callback)
List all customizations
Example response:
{ customizations:
   [ { owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
       base_model_name: 'en-US_BroadbandModel',
       customization_id: '6a7785a0-9665-11e6-a73a-0da9193a4475',
       created: '2016-10-20T01:35:00.346Z',
       name: 'IEEE-test',
       description: '',
       progress: 0,
       language: 'en-US',
       status: 'pending' },
     { owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
       base_model_name: 'en-US_BroadbandModel',
       customization_id: '9e2f6bb0-9665-11e6-a73a-0da9193a4475',
       created: '2016-10-20T01:36:27.115Z',
       name: 'IEEE-test',
       description: '',
       progress: 0,
       language: 'en-US',
       status: 'ready' },
     { owner: '8a6f5bb1-5b2d-4a20-85a9-eaa421d25c88',
       base_model_name: 'en-US_BroadbandModel',
       customization_id: '6b194e70-9666-11e6-a73a-0da9193a4475',
       created: '2016-10-20T01:42:10.903Z',
       name: 'IEEE-test',
       description: '',
       progress: 100,
       language: 'en-US',
       status: 'available' } ] }
Parameters:
| Name | Type | Attributes | Description |
| params | Object | <optional> | |
| params.language | String | <optional> | optional filter. Currently only en-US is supported. |
| callback | function | | |
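A sketch of narrowing the list to models that have finished training. It assumes, as the example response suggests (`progress: 100`), that the `available` status marks a trained model; it also assumes an `(err, result)` callback. The helper name `listAvailableModels` is hypothetical.

```javascript
// Hypothetical helper: returns the customization_id of every 'available' model.
// `speechToText` is assumed to be a SpeechToTextV1 instance.
function listAvailableModels(speechToText, callback) {
  speechToText.getCustomizations({ language: 'en-US' }, (err, result) => {
    if (err) {
      callback(err);
      return;
    }
    const ids = result.customizations
      .filter((model) => model.status === 'available')
      .map((model) => model.customization_id);
    callback(null, ids);
  });
}
```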
getModel()
Get information about a model based on the given model_id
Parameters:
| Name | Type | Attributes | Description |
| params.model_id | String | <optional> | The desired model |
getModels()
Lists the available models.
getRecognizeStatus()
Gets the state of the engine to check whether recognition is available.
Use it to check whether the session is ready to accept a new recognition task.
The returned state must be 'initialized' before you can POST to recognize.
Parameters:
| Name | Type | Attributes | Description |
| params.session_id | String | <optional> | Session used in the recognition. |
- Deprecated:
- use createRecognizeStream instead
getWord(params, callback)
Get a custom word
Lists information about a custom word from a custom language model.
Example output:
{
  "sounds_like": ["N. C. A. A.","N. C. double A."],
  "display_as": "NCAA",
  "source": ["corpus3","user"]
}
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| params.word | String | |
| callback | function | |
getWords(params, callback)
List all custom words
Lists information about all custom words from a custom language model.
You can list all words from the custom model's words resource, only custom words that were added or modified by the user, or only OOV words that were extracted from corpora.
Example response:
{
  "words": [
    {
      "word": "hhonors",
      "sounds_like": ["hilton honors","h honors"],
      "display_as": "HHonors",
      "source": ["corpus1"]
    },
    {
      "word": "ieee",
      "sounds_like": ["i triple e"],
      "display_as": "IEEE",
      "source": ["corpus1","corpus2"]
    },
    {
      "word": "tomato",
      "sounds_like": ["tomatoh","tomayto"],
      "display_as": "",
      "source": ["user"]
    },
    {
      "word": "$75.00",
      "sounds_like": ["75 dollars"],
      "display_as": "",
      "source": ["user"],
      "error":" Numbers are not allowed in sounds-like"
    }
  ]
}
Parameters:
| Name | Type | Attributes | Default | Description |
| params | Object | | | |
| params.customization_id | String | | | |
| params.word_type | String | <optional> | all | One of all, user, or corpora. user shows only custom words that were added or modified by the user; corpora shows only OOV words that were extracted from corpora. |
| callback | function | | | |
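Since words with an invalid sounds_like field carry an `error` field, a caller can filter the response to find entries that need correcting. The sketch below assumes an `(err, response)` callback and a response shaped like the example above; the helper names `wordsWithErrors` and `findInvalidUserWords` are hypothetical.

```javascript
// Picks out the entries of a getWords response that carry an error field.
function wordsWithErrors(response) {
  return response.words
    .filter((entry) => typeof entry.error === 'string')
    .map((entry) => ({ word: entry.word, error: entry.error.trim() }));
}

// Hypothetical helper: lists only user-added words and reports the invalid ones.
// `speechToText` is assumed to be a SpeechToTextV1 instance.
function findInvalidUserWords(speechToText, customizationId, callback) {
  const params = { customization_id: customizationId, word_type: 'user' };
  speechToText.getWords(params, (err, response) => {
    if (err) {
      callback(err);
      return;
    }
    callback(null, wordsWithErrors(response));
  });
}
```

Applied to the example response, only the `$75.00` entry is returned, along with its error message.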
observeResult()
Result observer for an upcoming or ongoing recognition task in the session.
This request must be started before the POST on recognize finishes; otherwise it waits for the next recognition.
Parameters:
| Name | Type | Attributes | Description |
| params.session_id | String | <optional> | Session used in the recognition. |
| params.interim_results | boolean | <optional> | If true, interim results will be returned. Default: false. |
- Deprecated:
- use createRecognizeStream instead
recognize(audioopt, content_typeopt)
Speech recognition for given audio using default model.
Parameters:
| Name | Type | Attributes | Description |
| audio | Audio | <optional> | Audio to be recognized. |
| content_type | String | <optional> | Content-type |
recognizeLive(content_typeopt, session_idopt)
Creates an HTTP/HTTPS request to /recognize and keeps the connection open.
Sets 'Transfer-Encoding': 'chunked' and prepares the connection to send chunked data.
Parameters:
| Name | Type | Attributes | Description |
| content_type | String | <optional> | The Content-type e.g. audio/l16; rate=48000 |
| session_id | String | <optional> | The session id |
- Deprecated:
- use createRecognizeStream instead
resetCustomization(params, callback)
Reset a custom model
Resets a custom language model by removing all corpora and words from the model.
Resetting a custom model initializes the model to its state when it was first created.
Metadata such as the name and language of the model are preserved.
Only the owner of a custom model can use this method to reset the model.
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| callback | function | |
trainCustomization(params, callback)
Train a custom model
Initiates the training of a custom language model with new corpora, words, or both.
After adding training data to the custom model with the corpora or words methods, use this method to begin the actual training of the model on the new data.
You can specify whether the custom model is to be trained with all words from its words resources or only with words that were added or modified by the user.
Only the owner of a custom model can use this method to train the model.
Pre-processing of words and corpora must be complete before initiating training.
Use the whenCustomizationReady() method to be notified once pre-processing has completed.
Training can take on the order of minutes to complete depending on the amount of data on which the service is being trained and the current load on the service.
This method triggers the callback as soon as the training process has begun.
Use the whenCustomizationReady() method again to be notified once training has completed.
Parameters:
| Name | Type | Attributes | Default | Description |
| params | Object | | | |
| params.customization_id | String | | | |
| params.word_type_to_add | String | <optional> | all | set to 'user' to train the model only on new words that were added or modified by the user; the model is not trained on new words extracted from corpora. |
| callback | function | | | |
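The wait, train, wait sequence described above can be sketched as follows. This assumes that whenCustomizationReady accepts a Node-style callback as a second argument (its listed signature shows only `params`, although its description mentions a callback) and that `speechToText` is a SpeechToTextV1 instance. The helper name `trainWhenReady` is hypothetical.

```javascript
// Hypothetical helper: waits for pre-processing, starts training, then waits for completion.
function trainWhenReady(speechToText, customizationId, callback) {
  const params = { customization_id: customizationId };
  // Step 1: wait until pre-processing of corpora and words is complete.
  speechToText.whenCustomizationReady(params, (waitErr) => {
    if (waitErr) {
      callback(waitErr);
      return;
    }
    // Step 2: start training; this callback fires as soon as training has begun.
    speechToText.trainCustomization(params, (trainErr) => {
      if (trainErr) {
        callback(trainErr);
        return;
      }
      // Step 3: wait again until training has completed.
      speechToText.whenCustomizationReady(params, callback);
    });
  });
}
```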
updateCustomization(params, callback)
Update voice model
Updates information for the specified custom voice model.
You can update the metadata such as the name and description of the voice model.
You can also update the words in the model and their translations.
A custom model can contain no more than 20,000 entries.
Only the owner of a custom voice model can use this method to update the model.
An example of params.words could be:
[
  {"word":"NCAA", "translation":"N C double A"},
  {"word":"iPhone", "translation":"I phone"}
]
Parameters:
| Name | Type | Attributes | Description |
| params | Object | | |
| params.customization_id | String | | |
| params.name | String | <optional> | |
| params.description | String | <optional> | |
| params.words | Array.<Word> | | Array of {word, translation} objects where translation is the phonetic or sounds-like translation for the word. A phonetic translation is based on the SSML format for representing the phonetic string of a word either as an IPA or IBM SPR translation. A sounds-like translation consists of one or more words that, when combined, sound like the word. |
| callback | function | | |
upgradeCustomization(params, callback)
Upgrade a custom model
Upgrades a custom language model to the latest release level of the Speech to Text service.
The method bases the upgrade on the latest trained data stored for the custom model.
If the corpora or words for the model have changed since the model was last trained, you must use the Train a custom model method to train the model on the new data.
Only the owner of a custom model can use this method to upgrade the model.
Note: This method is not currently implemented. It will be added for a future release of the API.
Parameters:
| Name | Type | Description |
| params | Object | |
| params.customization_id | String | |
| callback | function | |
whenCustomizationReady(params)
Waits while the customization status is 'pending' or 'training', then fires the callback once the status is 'ready' or 'available'.
Note: the customization remains in 'pending' status until at least one corpus is added. Calling this method on a customization with no corpora results in an error.
See http://www.ibm.com/watson/developercloud/speech-to-text/api/v1/#list_models for status details
Parameters:
| Name | Type | Attributes | Default | Description |
| params | Object | | | |
| params.customization_id | String | | | |
| params.interval | Number | <optional> | 5000 | (milliseconds) - how long to wait between status checks |
| params.times | Number | <optional> | 30 | maximum number of attempts |
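The polling behaviour described above can be illustrated with a standalone sketch. This is not the SDK's implementation: `pollUntilReady`, `getStatus`, and the `schedule` option are hypothetical names introduced here, and treating any status other than the four named above as an error is an assumption. The defaults mirror the documented `interval` (5000 ms) and `times` (30).

```javascript
// Illustrative polling loop; not the SDK's own code.
// getStatus(cb) is a hypothetical function that reports the current status via cb(err, status).
function pollUntilReady(getStatus, options, callback) {
  const interval = options.interval === undefined ? 5000 : options.interval;
  const times = options.times === undefined ? 30 : options.times;
  // `schedule` is injectable so the loop can be exercised without real timers.
  const schedule = options.schedule || setTimeout;
  let attempts = 0;

  function check() {
    attempts += 1;
    getStatus((err, status) => {
      if (err) {
        callback(err);
      } else if (status === 'ready' || status === 'available') {
        callback(null, status);
      } else if (status !== 'pending' && status !== 'training') {
        // Assumption: any other status is treated as a failure.
        callback(new Error('unexpected status: ' + status));
      } else if (attempts >= times) {
        callback(new Error('gave up after ' + attempts + ' attempts'));
      } else {
        schedule(check, interval);
      }
    });
  }

  check();
}
```

In practice `getStatus` might wrap getCustomization and pass along the `status` field of its response.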