omdTranscriptionService
The omdTranscriptionService class provides an interface to an AI-powered transcription service for handwritten content. It sends image data to a server-side endpoint for processing, abstracting away the complexities of AI model interaction and API key management.
Class Definition
export class omdTranscriptionService
Constructor
new omdTranscriptionService([options])
Creates a new omdTranscriptionService instance.
- options (object, optional): Configuration options for the service:
  - endpoint (string): The server endpoint for the transcription service. Defaults to '/.netlify/functions/transcribe'.
  - defaultProvider (string): The default transcription provider to use. Defaults to 'gemini'.
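The defaulting behavior described for the constructor can be sketched as follows. This is a minimal illustration, not the library's actual constructor: `resolveServiceOptions` and `DEFAULT_OPTIONS` are hypothetical names, `'/api/transcribe'` is a placeholder path, and only the two default values come from the documentation.

```javascript
// Hypothetical helper: illustrates how the documented defaults could be applied.
// Only the two default values are taken from the documentation.
const DEFAULT_OPTIONS = {
    endpoint: '/.netlify/functions/transcribe',
    defaultProvider: 'gemini'
};

function resolveServiceOptions(options = {}) {
    // Caller-supplied values override the defaults; omitted keys fall back.
    return { ...DEFAULT_OPTIONS, ...options };
}

const defaults = resolveServiceOptions();
const custom = resolveServiceOptions({ endpoint: '/api/transcribe' }); // placeholder path

console.log(defaults.endpoint);      // '/.netlify/functions/transcribe'
console.log(custom.endpoint);        // '/api/transcribe'
console.log(custom.defaultProvider); // 'gemini'
```

With the real class, the corresponding call would presumably be `new omdTranscriptionService({ endpoint: '/api/transcribe' })`, leaving `defaultProvider` at its default.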
Public Properties
- options (object): The configuration options for the service, including endpoint and defaultProvider.
Public Methods
async transcribe(imageBlob, [options])
Transcribes an image containing handwritten content by sending it to the configured server endpoint. The image is converted to base64 before transmission.
- imageBlob (Blob): The image blob to transcribe.
- options (object, optional): Transcription options:
  - prompt (string): A custom prompt for the transcription service. If not provided, a default mathematical transcription prompt is used.
- Returns: Promise<object> - A promise that resolves with the transcription result, containing the text, provider, and confidence.
- Throws: Error if the API call fails.
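The request flow described above (base64 image in, JSON result out, Error on failure) might look roughly like the sketch below. The request body field names (`image`, `provider`, `prompt`), the `postTranscription` function, and the injectable `fetchImpl` parameter are all assumptions made for illustration; the actual wire format used by the endpoint is not documented here.

```javascript
// Sketch only: the request body fields and response handling below are
// assumptions, not the documented wire format of the real endpoint.
async function postTranscription(base64Image, options = {}) {
    const {
        endpoint = '/.netlify/functions/transcribe', // documented default endpoint
        provider = 'gemini',                         // documented default provider
        prompt,                                      // optional custom prompt
        fetchImpl = fetch                            // hypothetical hook, injectable for testing
    } = options;

    const response = await fetchImpl(endpoint, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ image: base64Image, provider, prompt })
    });

    if (!response.ok) {
        // Mirrors the documented behavior: a failed API call surfaces as an Error.
        throw new Error(`Transcription request failed with status ${response.status}`);
    }

    // Documented result shape: an object containing text, provider, and confidence.
    return response.json();
}
```

Callers would typically wrap the call in `try`/`catch`, since a failed request rejects the returned promise.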
async transcribeWithFallback(imageBlob, [options])
Transcribes an image with a fallback mechanism. Currently, this method simply calls transcribe(), but it is designed to allow for future implementations of fallback transcription providers or strategies.
- imageBlob (Blob): The image blob to transcribe.
- options (object, optional): Transcription options.
- Returns: Promise<object> - A promise that resolves with the transcription result.
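Since the fallback behavior is not implemented yet, the following is only one possible shape it could take: try a list of transcriber functions in order and return the first successful result. The function name `transcribeWithFallbackSketch` and the `transcribers` parameter are hypothetical and not part of the library.

```javascript
// Hypothetical fallback strategy: try each transcriber in order and return
// the first successful result. Not part of the current library behavior.
async function transcribeWithFallbackSketch(imageBlob, transcribers, options = {}) {
    const errors = [];
    for (const transcribe of transcribers) {
        try {
            return await transcribe(imageBlob, options);
        } catch (error) {
            errors.push(error); // remember the failure and try the next one
        }
    }
    // Every attempt failed (or the list was empty): report all collected messages.
    throw new Error(
        `All ${transcribers.length} transcription attempts failed: ` +
        errors.map(e => e.message).join('; ')
    );
}
```

Each entry in `transcribers` is assumed to have the same signature as `transcribe(imageBlob, options)`, so a bound method such as `service.transcribe.bind(service)` could be passed in directly.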
isAvailable()
Checks if the transcription service is available. In the current implementation, this always returns true as it relies on a serverless function endpoint.
- Returns: boolean - true if the service is available, false otherwise.
getAvailableProviders()
Gets the list of available transcription providers. In the current implementation, this always returns ['gemini'] as the server handles the actual provider selection.
- Returns: Array<string> - An array of available provider names.
isProviderAvailable(provider)
Checks if a specific transcription provider is available. In the current implementation, this only returns true for the 'gemini' provider.
- provider (string): The name of the provider to check.
- Returns: boolean - true if the provider is available, false otherwise.
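The three availability checks can be combined to pick a usable provider before transcribing. The sketch below uses a stand-in class, `AvailabilitySketch`, that reproduces only the documented return values; it is not the library's source, and `'openai'` is a hypothetical provider name chosen to show the unavailable case.

```javascript
// Stand-in class reproducing the documented return values of the
// availability methods; not the library's actual implementation.
class AvailabilitySketch {
    isAvailable() {
        return true; // documented: always true
    }

    getAvailableProviders() {
        return ['gemini']; // documented: always ['gemini']
    }

    isProviderAvailable(provider) {
        return this.getAvailableProviders().includes(provider);
    }
}

const service = new AvailabilitySketch();
const requested = 'openai'; // hypothetical provider name, not offered by the service

// Fall back to the first available provider when the requested one is missing.
const provider = service.isProviderAvailable(requested)
    ? requested
    : service.getAvailableProviders()[0];

console.log(provider); // 'gemini'
```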
Internal Methods
- _getDefaultEndpoint(): Returns the default server endpoint URL for the transcription service ('/.netlify/functions/transcribe').
- _blobToBase64(blob): Converts an imageBlob into a base64-encoded string, suitable for sending in a JSON payload.
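As an illustration of the Blob-to-base64 step, here is a minimal sketch that avoids `FileReader` so it runs both in browsers and in Node.js 18+. The function name `blobToBase64Sketch` is hypothetical, and the library's own `_blobToBase64` may work differently (for example, by using `FileReader` and handling a data-URL prefix).

```javascript
// Sketch: convert a Blob to a base64 string using Blob.arrayBuffer() and btoa().
// This is an assumed approach, not necessarily how _blobToBase64 is implemented.
async function blobToBase64Sketch(blob) {
    const bytes = new Uint8Array(await blob.arrayBuffer());
    const chunkSize = 0x8000; // convert in chunks to stay under argument-count limits
    let binary = '';
    for (let i = 0; i < bytes.length; i += chunkSize) {
        binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
    }
    return btoa(binary); // raw base64, without a "data:...;base64," prefix
}
```

The result is plain base64 text, so it can be placed directly into a JSON request body.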
Example Usage
import { omdTranscriptionService } from '@teachinglab/omd';

// Create a transcription service instance
const transcriptionService = new omdTranscriptionService();

// Assume getMyImageBlob() is a function that returns an image Blob
async function getMyImageBlob() {
    // Example: Create a dummy canvas and get its blob
    const canvas = document.createElement('canvas');
    canvas.width = 100;
    canvas.height = 50;
    const ctx = canvas.getContext('2d');
    ctx.fillText('2x + 3', 10, 30);
    return new Promise(resolve => canvas.toBlob(resolve, 'image/png'));
}

// Get an image blob from a canvas or file input
const imageBlob = await getMyImageBlob();

// Transcribe the image
const result = await transcriptionService.transcribe(imageBlob, {
    prompt: 'Transcribe the handwritten math equation. Return only the mathematical expression.'
});

console.log(result.text); // The transcribed text (e.g., "2x + 3")