puter.ai.txt2speech()
Converts text into speech using AI. Supports multiple languages and voices.
puter.ai.txt2speech(text)
puter.ai.txt2speech(text, options)
puter.ai.txt2speech(text, language)
puter.ai.txt2speech(text, language, voice)
puter.ai.txt2speech(text, language, voice, engine)
text (String) (required)
            A string containing the text you want to convert to speech. The text must be less than 3000 characters long.
options (Object) (optional)
            An object containing the following optional properties:
            language (String): Language code for speech synthesis (AWS Polly only). Defaults to en-US.
            voice (String): Voice ID used for synthesis. Defaults to Joanna (AWS Polly) or alloy (OpenAI).
            engine (String): AWS Polly engine. Can be standard, neural, long-form, or generative. Defaults to standard.
            provider (String): TTS provider to use. Supports 'aws-polly' (default) and 'openai'.
            model (String): OpenAI text-to-speech model (gpt-4o-mini-tts, tts-1, tts-1-hd, ...). Defaults to gpt-4o-mini-tts.
            response_format (String): Desired OpenAI output format (mp3, wav, opus, aac, flac, pcm). Defaults to mp3.
            instructions (String): Additional guidance for OpenAI voices (tone, pacing, style, etc.).
language (String) (optional)
            AWS Polly only.
The language to use for speech synthesis. Defaults to en-US. The following languages are supported:
ar-AE, ca-ES, yue-CN, cmn-CN, da-DK, nl-BE, nl-NL, en-AU, en-GB, en-IN, en-NZ, en-ZA, en-US, en-GB-WLS, fi-FI, fr-FR, fr-BE, fr-CA, de-DE, de-AT, hi-IN, is-IS, it-IT, ja-JP, ko-KR, nb-NO, pl-PL, pt-BR, pt-PT, ro-RO, ru-RU, es-ES, es-MX, es-US, sv-SE, tr-TR, cy-GB
voice (String) (optional)
            The voice to use for speech synthesis. Defaults to Joanna when provider is aws-polly, or alloy when using the OpenAI provider.
For the OpenAI provider, available voices include alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, and shimmer.
engine (String) (optional)
            AWS Polly only.
The speech synthesis engine to use. Can be standard, neural, long-form, or generative. Defaults to standard. Higher-end engines provide better quality but may incur higher usage costs.
provider (String) (optional)
            Selects which backend performs the synthesis. Use 'aws-polly' (default) for the AWS Polly voices, or 'openai' for the OpenAI text-to-speech models (gpt-4o-mini-tts by default).
model (String) (optional)
            OpenAI provider only.
Specifies which OpenAI TTS model to use. Defaults to gpt-4o-mini-tts. Other available models include tts-1 and tts-1-hd.
response_format (String) (optional)
            OpenAI provider only.
Controls the output format when using OpenAI. Defaults to mp3, but you can request wav, opus, aac, flac, or pcm for different latency/quality characteristics.
instructions (String) (optional)
            OpenAI provider only.
Supply extra guidance for voice style (tone, speed, mood, etc.). This text is passed directly to the model.
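The provider-specific notes above (language and engine apply to AWS Polly only; model, response_format, and instructions apply to the OpenAI provider only) can be sketched as a small client-side helper. This is only an illustration: buildTtsOptions is a hypothetical name, not part of Puter.js, and the reference does not say whether the API ignores or rejects options meant for the other provider, so the sketch simply drops them before the call.

```javascript
// Hypothetical helper (not part of Puter.js): keep only the options that the
// parameter descriptions say apply to the selected provider.
const COMMON_KEYS = ["voice"];
const PROVIDER_KEYS = {
    "aws-polly": ["language", "engine"],
    "openai": ["model", "response_format", "instructions"],
};

function buildTtsOptions(options = {}) {
    const provider = options.provider ?? "aws-polly"; // documented default
    if (!Object.prototype.hasOwnProperty.call(PROVIDER_KEYS, provider)) {
        throw new Error(`Unknown provider: ${provider}`);
    }
    const result = { provider };
    for (const key of [...COMMON_KEYS, ...PROVIDER_KEYS[provider]]) {
        if (options[key] !== undefined) {
            result[key] = options[key];
        }
    }
    return result;
}

// Usage on a page that has loaded https://js.puter.com/v2/:
// const audio = await puter.ai.txt2speech(
//     "Hello!",
//     buildTtsOptions({ provider: "openai", voice: "alloy", engine: "neural" })
// ); // `engine` is dropped because it is documented as AWS Polly only
```

Leaving a property undefined lets the documented defaults (Joanna or alloy, standard, gpt-4o-mini-tts, mp3) apply.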
Returns a Promise that resolves to an HTMLAudioElement. The element’s src points at a blob or remote URL containing the synthesized audio.
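Because the resolved value is a regular HTMLAudioElement, the standard media events and methods are available on it. As a minimal sketch, the hypothetical helper below (playAndWait is not part of Puter.js) starts playback and resolves once the "ended" event fires, rejecting if playback fails or if the browser blocks play(), for example when it is not triggered by a user gesture.

```javascript
// Hypothetical helper: play an audio element and resolve when it finishes.
function playAndWait(audio) {
    return new Promise((resolve, reject) => {
        audio.addEventListener("ended", () => resolve(), { once: true });
        audio.addEventListener(
            "error",
            () => reject(new Error("Audio playback failed")),
            { once: true }
        );
        // play() returns a Promise in modern browsers; forward its rejection.
        const started = audio.play();
        if (started && typeof started.catch === "function") {
            started.catch(reject);
        }
    });
}

// Usage inside a click handler on a page that has loaded Puter.js:
// const audio = await puter.ai.txt2speech("Hello world!");
// await playAndWait(audio);
// console.log("Finished speaking");
```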
Convert text to speech (Shorthand)
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <button id="play">Speak!</button>
    <script>
        document.getElementById('play').addEventListener('click', ()=>{
            puter.ai.txt2speech(`Hello world! Puter is pretty amazing, don't you agree?`).then((audio)=>{
                audio.play();
            });
        });
    </script>
</body>
</html>
Convert text to speech using options
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <button id="play">Speak with options!</button>
    <script>
        document.getElementById('play').addEventListener('click', ()=>{
            puter.ai.txt2speech(`Hello world! This is using a neural voice.`, {
                voice: "Joanna",
                engine: "neural",
                language: "en-US"
            }).then((audio)=>{
                audio.play();
            });
        });
    </script>
</body>
</html>
Use OpenAI voices
<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <button id="play">Use OpenAI voice</button>
    <script>
        document.getElementById('play').addEventListener('click', async ()=>{
            const audio = await puter.ai.txt2speech(
                "Hello! This sample uses the OpenAI alloy voice.",
                {
                    provider: "openai",
                    model: "gpt-4o-mini-tts",
                    voice: "alloy",
                    response_format: "mp3",
                    instructions: "Sound cheerful but not overly fast."
                }
            );
            audio.play();
        });
    </script>
</body>
</html>
Compare different engines
<html>
<head>
    <style>
        body { font-family: Arial, sans-serif; max-width: 600px; margin: 0 auto; padding: 20px; }
        textarea { width: 100%; height: 80px; margin: 10px 0; }
        button { margin: 5px; padding: 10px 15px; cursor: pointer; }
        .status { margin: 10px 0; padding: 5px; font-size: 14px; }
    </style>
</head>
<body>
    <script src="https://js.puter.com/v2/"></script>
    
    <h1>Text-to-Speech Engine Comparison</h1>
    
    <textarea id="text-input" placeholder="Enter text to convert to speech...">Hello world! This is a test of the text-to-speech engines.</textarea>
    
    <div>
        <button onclick="playAudio('standard')">Standard Engine</button>
        <button onclick="playAudio('neural')">Neural Engine</button>
        <button onclick="playAudio('generative')">Generative Engine</button>
    </div>
    
    <div id="status" class="status"></div>
    <script>
        const textInput = document.getElementById('text-input');
        const statusDiv = document.getElementById('status');
        
        async function playAudio(engine) {
            const text = textInput.value.trim();
            
            if (!text) {
                statusDiv.textContent = 'Please enter some text first!';
                return;
            }
            
            if (text.length >= 3000) {
                statusDiv.textContent = 'Text must be less than 3000 characters!';
                return;
            }
            
            statusDiv.textContent = `Converting with ${engine} engine...`;
            
            try {
                const audio = await puter.ai.txt2speech(text, {
                    voice: "Joanna",
                    engine: engine,
                    language: "en-US"
                });
                
                statusDiv.textContent = `Playing ${engine} audio`;
                audio.play();
            } catch (error) {
                statusDiv.textContent = `Error: ${error.message}`;
            }
        }
    </script>
</body>
</html>
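Speak text longer than the limit
Since the text parameter must be less than 3000 characters long, longer input has to be split before it is sent. The following is a rough sketch under stated assumptions rather than an official recipe: splitForSpeech and speakLongText are hypothetical names, the sentence splitting is a simple punctuation-based heuristic, and speakLongText assumes the page has already loaded https://js.puter.com/v2/.

```javascript
// Hypothetical helpers, not part of Puter.js.
const MAX_CHARS = 3000; // limit stated in the `text` parameter description

// Split text into chunks that are each shorter than `limit` characters,
// breaking after sentence-ending punctuation where possible.
function splitForSpeech(text, limit = MAX_CHARS) {
    if (limit < 2) throw new RangeError("limit must be at least 2");
    // Each match is a run of text ending in ., ! or ? (plus trailing
    // whitespace), or the unterminated text left at the very end.
    const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) ?? [];
    const chunks = [];
    const pushChunk = (s) => {
        const trimmed = s.trim();
        if (trimmed) chunks.push(trimmed);
    };

    let current = "";
    for (const sentence of sentences) {
        // Start a new chunk if adding this sentence would reach the limit.
        if (current && current.length + sentence.length >= limit) {
            pushChunk(current);
            current = "";
        }
        // A single sentence at or over the limit is cut into fixed-size pieces.
        let rest = sentence;
        while (rest.length >= limit) {
            pushChunk(rest.slice(0, limit - 1));
            rest = rest.slice(limit - 1);
        }
        current += rest;
    }
    pushChunk(current);
    return chunks;
}

// Speak the chunks in order, waiting for each one to finish.
// Assumes the global `puter` object is available in the page.
async function speakLongText(text) {
    for (const chunk of splitForSpeech(text)) {
        const audio = await puter.ai.txt2speech(chunk);
        await new Promise((resolve, reject) => {
            audio.addEventListener("ended", resolve, { once: true });
            audio.addEventListener("error", reject, { once: true });
            audio.play().catch(reject);
        });
    }
}
```

Each chunk is a separate txt2speech call, so there may be a short pause between chunks while the next request completes.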