`puter.ai.txt2speech()`


Converts text into speech using AI. Supports multiple languages and voices.

Syntax

puter.ai.txt2speech(text, testMode = false)
puter.ai.txt2speech(text, options)
puter.ai.txt2speech(text, language, testMode = false)
puter.ai.txt2speech(text, language, voice, testMode = false)
puter.ai.txt2speech(text, language, voice, engine, testMode = false)
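
The positional forms appear to be shorthand for fields of the `options` object. The sketch below makes that relationship explicit. `toOptions` is a hypothetical helper written for illustration and is not part of Puter; it assumes that `language`, `voice`, and `engine` map onto the AWS Polly options of the same names and that `testMode` corresponds to `test_mode`.

```javascript
// Hypothetical helper (not part of Puter): converts the positional
// arguments that follow `text` into a single options object.
function toOptions(...args) {
    const options = {};

    // A trailing boolean is treated as the testMode flag.
    if (typeof args[args.length - 1] === 'boolean') {
        options.test_mode = args.pop();
    }

    // An object in first position is already an options object.
    if (args.length > 0 && typeof args[0] === 'object' && args[0] !== null) {
        return { ...args[0], ...options };
    }

    // Remaining positions are assumed to be language, voice, engine (in that order).
    const names = ['language', 'voice', 'engine'];
    args.forEach((value, i) => {
        if (i < names.length && value !== undefined) {
            options[names[i]] = value;
        }
    });
    return options;
}
```

Under these assumptions, `puter.ai.txt2speech(text, toOptions('en-US', 'Joanna', 'neural'))` would request the same audio as `puter.ai.txt2speech(text, 'en-US', 'Joanna', 'neural')`.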

Parameters

`text` (String) (required)

The text to convert to speech. It must be fewer than 3000 characters long. When no options are provided, the AWS Polly provider is used.
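
Longer passages have to be split before they are sent. The following is a minimal sketch of one way to do that, preferring sentence boundaries; `splitForSpeech` and `MAX_CHARS` are hypothetical names, and the sentence-splitting rule is a simple heuristic rather than anything Puter prescribes.

```javascript
// Documented limit: each request must contain fewer characters than this.
const MAX_CHARS = 3000;

// Hypothetical helper: splits text into chunks shorter than `limit`,
// breaking at sentence ends where possible.
function splitForSpeech(text, limit = MAX_CHARS) {
    const maxLen = limit - 1; // longest chunk that still satisfies the limit
    // Sentence-like pieces: text up to terminal punctuation plus trailing spaces.
    const pieces = text.match(/[^.!?]+[.!?]*\s*|[.!?]+\s*/g) || [];
    const chunks = [];
    let current = '';

    for (let piece of pieces) {
        // Flush the running chunk if this piece would not fit into it.
        if (current && current.length + piece.length > maxLen) {
            chunks.push(current.trim());
            current = '';
        }
        // Hard-split a single piece that is too long on its own.
        while (piece.length > maxLen) {
            chunks.push(piece.slice(0, maxLen));
            piece = piece.slice(maxLen);
        }
        current += piece;
    }
    if (current.trim()) {
        chunks.push(current.trim());
    }
    // Drop any chunk that ended up containing only whitespace.
    return chunks.filter((chunk) => chunk.trim().length > 0);
}
```

Each returned chunk can then be passed to `puter.ai.txt2speech()` as a separate request.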

`testMode` (Boolean) (optional)

When `true`, the call returns sample audio, so you can test without consuming credits. Defaults to `false`.
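
One possible pattern is to enable test mode automatically during local development. The sketch below assumes that local hosts should never incur usage; `shouldUseTestMode` and `speak` are hypothetical helpers, and the `ai` parameter stands in for `puter.ai` so the logic can be exercised outside the browser.

```javascript
// Hypothetical helper: treats local development hosts as test environments.
function shouldUseTestMode(hostname) {
    return hostname === 'localhost' || hostname === '127.0.0.1';
}

// `ai` is any object exposing txt2speech(text, testMode), such as puter.ai.
function speak(ai, text, hostname) {
    return ai.txt2speech(text, shouldUseTestMode(hostname));
}
```

In a page this could be called as `speak(puter.ai, 'Hello world!', location.hostname).then((audio) => audio.play());`.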

`options` (Object) (optional)

Additional settings for the generation request. Available options depend on the provider.

Option	Type	Description
`provider`	`String`	TTS provider to use. `'aws-polly'` (default), `'openai'`, `'elevenlabs'`
`model`	`String`	Model identifier (provider-specific)
`voice`	`String`	Voice ID used for synthesis (provider-specific)
`test_mode`	`Boolean`	When `true`, returns a sample audio without using credits

AWS Polly Options

Available when `provider` is `'aws-polly'` (the default):

Option	Type	Description
`voice`	`String`	Voice ID. Defaults to `'Joanna'`. See the AWS Polly list of available voices
`engine`	`String`	Synthesis engine. Available: `'standard'` (default), `'neural'`, `'long-form'`, `'generative'`
`language`	`String`	Language code. Defaults to `'en-US'`. See the AWS Polly list of supported languages
`ssml`	`Boolean`	When `true`, text is treated as SSML markup
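
When `ssml` is `true`, the text must be well-formed SSML, so reserved XML characters in plain text need escaping. The following is a minimal sketch of building such markup; `escapeXml` and `toSsml` are hypothetical helpers, and which SSML tags are accepted may depend on the selected engine.

```javascript
// Escapes the five characters that are reserved in XML.
function escapeXml(text) {
    const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&apos;' };
    return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical helper: wraps sentences in <speak>, with a pause between them.
function toSsml(sentences, pauseMs = 500) {
    const pause = `<break time="${pauseMs}ms"/>`;
    return `<speak>${sentences.map(escapeXml).join(pause)}</speak>`;
}
```

The result would be passed with the flag set, for example `puter.ai.txt2speech(toSsml(['Tom & Jerry', 'The end']), { ssml: true })`.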

OpenAI Options

Available when `provider` is `'openai'`:

Option	Type	Description
`model`	`String`	TTS model. Available: `'gpt-4o-mini-tts'` (default), `'tts-1'`, `'tts-1-hd'`
`voice`	`String`	Voice ID. Available: `'alloy'` (default), `'ash'`, `'ballad'`, `'coral'`, `'echo'`, `'fable'`, `'nova'`, `'onyx'`, `'sage'`, `'shimmer'`
`response_format`	`String`	Output format. Available: `'mp3'` (default), `'wav'`, `'opus'`, `'aac'`, `'flac'`, `'pcm'`
`instructions`	`String`	Additional guidance for voice style (tone, speed, mood, etc.)

For more details about each option, see the OpenAI TTS API reference.
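
Because the accepted values are enumerated above, they can be checked before a request is sent. The sketch below is one way to do so; `openAiOptions` is a hypothetical helper, and its value lists are copied from the table above, so they would need updating if the provider adds models or voices.

```javascript
// Values copied from the OpenAI options table in this page.
const OPENAI_MODELS = ['gpt-4o-mini-tts', 'tts-1', 'tts-1-hd'];
const OPENAI_VOICES = ['alloy', 'ash', 'ballad', 'coral', 'echo', 'fable', 'nova', 'onyx', 'sage', 'shimmer'];
const OPENAI_FORMATS = ['mp3', 'wav', 'opus', 'aac', 'flac', 'pcm'];

// Hypothetical helper: builds a validated options object for provider 'openai'.
function openAiOptions({ model = 'gpt-4o-mini-tts', voice = 'alloy', response_format = 'mp3', instructions } = {}) {
    if (!OPENAI_MODELS.includes(model)) {
        throw new RangeError(`Unknown model: ${model}`);
    }
    if (!OPENAI_VOICES.includes(voice)) {
        throw new RangeError(`Unknown voice: ${voice}`);
    }
    if (!OPENAI_FORMATS.includes(response_format)) {
        throw new RangeError(`Unknown response_format: ${response_format}`);
    }
    const options = { provider: 'openai', model, voice, response_format };
    // Only include instructions when some were given.
    if (instructions) {
        options.instructions = instructions;
    }
    return options;
}
```

A call such as `puter.ai.txt2speech(text, openAiOptions({ voice: 'nova' }))` then fails early on a mistyped voice name instead of at request time.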

ElevenLabs Options

Available when `provider` is `'elevenlabs'`:

Option	Type	Description
`model`	`String`	TTS model. Available: `'eleven_multilingual_v2'` (default), `'eleven_flash_v2_5'`, `'eleven_turbo_v2_5'`, `'eleven_v3'`
`voice`	`String`	Voice ID. Defaults to `'21m00Tcm4TlvDq8ikWAM'` (Rachel sample voice)
`output_format`	`String`	Output format. Defaults to `'mp3_44100_128'`
`voice_settings`	`Object`	Voice tuning options (stability, similarity boost, speed)

For more details about each option, see the ElevenLabs API reference.
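
As a rough sketch, a `voice_settings` object could be assembled as shown below. The field names `stability`, `similarity_boost`, and `speed` and the 0 to 1 range for the first two mirror ElevenLabs' own API and are assumptions here, since this page does not list them; `elevenLabsOptions` and `clamp` are hypothetical helpers.

```javascript
// Clamps a number into the inclusive range [min, max].
function clamp(value, min, max) {
    return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: builds options for provider 'elevenlabs'.
// Field names inside voice_settings are assumed, not confirmed by this page.
function elevenLabsOptions({ stability, similarityBoost, speed, ...rest } = {}) {
    const voice_settings = {};
    if (stability !== undefined) {
        voice_settings.stability = clamp(stability, 0, 1);
    }
    if (similarityBoost !== undefined) {
        voice_settings.similarity_boost = clamp(similarityBoost, 0, 1);
    }
    if (speed !== undefined) {
        voice_settings.speed = speed; // passed through unchanged
    }
    // Other fields (model, voice, output_format) are forwarded as given.
    return { provider: 'elevenlabs', ...rest, voice_settings };
}
```

For example, `puter.ai.txt2speech(text, elevenLabsOptions({ stability: 0.4, model: 'eleven_flash_v2_5' }))` would request the flash model with a lower stability value, if the assumed field names are accepted.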

Return value

A `Promise` that resolves to an `HTMLAudioElement`. The element’s `src` points to a blob or remote URL containing the synthesized audio.
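
Because the result is a standard audio element, the usual media events apply. The sketch below waits for playback to finish, which is useful when several clips must play back to back; `playToEnd` is a hypothetical helper that relies only on the standard `play()` method and the `ended` and `error` events.

```javascript
// Hypothetical helper: starts playback and resolves with the element once
// it has finished. Rejects if playback fails or play() is refused
// (for example, when the browser blocks autoplay).
function playToEnd(audio) {
    return new Promise((resolve, reject) => {
        audio.addEventListener('ended', () => resolve(audio), { once: true });
        audio.addEventListener('error', () => reject(new Error('Audio playback failed')), { once: true });
        // play() returns a promise in modern browsers; wrap it in case it does not.
        Promise.resolve(audio.play()).catch(reject);
    });
}
```

Inside an async function, clips could then be played sequentially: `for (const line of lines) { await playToEnd(await puter.ai.txt2speech(line)); }`, where `lines` is an array of strings.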

Examples

Convert text to speech (Shorthand)

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <button id="play">Speak!</button>
    <script>
        document.getElementById('play').addEventListener('click', ()=>{
            puter.ai.txt2speech(`Hello world! Puter is pretty amazing, don't you agree?`).then((audio)=>{
                audio.play();
            });
        });
    </script>
</body>
</html>

Convert text to speech using options

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <button id="play">Speak with options!</button>
    <script>
        document.getElementById('play').addEventListener('click', ()=>{
            puter.ai.txt2speech(`Hello world! This is using a neural voice.`, {
                voice: "Joanna",
                engine: "neural",
                language: "en-US"
            }).then((audio)=>{
                audio.play();
            });
        });
    </script>
</body>
</html>

Use OpenAI voices

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <button id="play">Use OpenAI voice</button>
    <script>
        document.getElementById('play').addEventListener('click', async ()=>{
            const audio = await puter.ai.txt2speech(
                "Hello! This sample uses the OpenAI alloy voice.",
                {
                    provider: "openai",
                    model: "gpt-4o-mini-tts",
                    voice: "alloy",
                    response_format: "mp3",
                    instructions: "Sound cheerful but not overly fast."
                }
            );
            audio.play();
        });
    </script>
</body>
</html>

Use ElevenLabs voices

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <button id="play">Use ElevenLabs voice</button>
    <script>
        document.getElementById('play').addEventListener('click', async ()=>{
            const audio = await puter.ai.txt2speech(
                "Hello! This sample uses an ElevenLabs voice.",
                {
                    provider: "elevenlabs",
                    model: "eleven_multilingual_v2",
                    voice: "21m00Tcm4TlvDq8ikWAM",
                    output_format: "mp3_44100_128"
                }
            );
            audio.play();
        });
    </script>
</body>
</html>

Compare different engines

<html>
<head>
    <style>
        body { font-family: Arial, sans-serif; max-width: 600px; margin: 0 auto; padding: 20px; }
        textarea { width: 100%; height: 80px; margin: 10px 0; }
        button { margin: 5px; padding: 10px 15px; cursor: pointer; }
        .status { margin: 10px 0; padding: 5px; font-size: 14px; }
    </style>
</head>
<body>
    <script src="https://js.puter.com/v2/"></script>
    
    <h1>Text-to-Speech Engine Comparison</h1>
    
    <textarea id="text-input" placeholder="Enter text to convert to speech...">Hello world! This is a test of the text-to-speech engines.</textarea>
    
    <div>
        <button onclick="playAudio('standard')">Standard Engine</button>
        <button onclick="playAudio('neural')">Neural Engine</button>
        <button onclick="playAudio('generative')">Generative Engine</button>
    </div>
    
    <div id="status" class="status"></div>

    <script>
        const textInput = document.getElementById('text-input');
        const statusDiv = document.getElementById('status');
        
        async function playAudio(engine) {
            const text = textInput.value.trim();
            
            if (!text) {
                statusDiv.textContent = 'Please enter some text first!';
                return;
            }
            
            if (text.length >= 3000) {
                statusDiv.textContent = 'Text must be less than 3000 characters!';
                return;
            }
            
            statusDiv.textContent = `Converting with ${engine} engine...`;
            
            try {
                const audio = await puter.ai.txt2speech(text, {
                    voice: "Joanna",
                    engine: engine,
                    language: "en-US"
                });
                
                statusDiv.textContent = `Playing ${engine} audio`;
                audio.play();
            } catch (error) {
                statusDiv.textContent = `Error: ${error.message}`;
            }
        }
    </script>
</body>
</html>
