puter.ai.txt2speech()Converts text into speech using AI. Supports multiple languages and voices.
puter.ai.txt2speech(text, testMode = false)
puter.ai.txt2speech(text, options)
puter.ai.txt2speech(text, language, testMode = false)
puter.ai.txt2speech(text, language, voice, testMode = false)
puter.ai.txt2speech(text, language, voice, engine, testMode = false)
text (String) (required)
A string containing the text you want to convert to speech. The text must be less than 3000 characters long. Defaults to AWS Polly provider when no options are provided.
testMode (Boolean) (optional)
When true, the call returns a sample audio so you can perform tests without incurring usage. Defaults to false.
options (Object) (optional)
Additional settings for the generation request. Available options depend on the provider.
| Option | Type | Description |
|---|---|---|
provider |
String |
TTS provider to use. 'aws-polly' (default), 'openai', 'elevenlabs' |
model |
String |
Model identifier (provider-specific) |
voice |
String |
Voice ID used for synthesis (provider-specific) |
test_mode |
Boolean |
When true, returns a sample audio without using credits |
Available when provider: 'aws-polly' (default):
| Option | Type | Description |
|---|---|---|
voice |
String |
Voice ID. Defaults to 'Joanna'. See available voices |
engine |
String |
Synthesis engine. Available: 'standard' (default), 'neural', 'long-form', 'generative' |
language |
String |
Language code. Defaults to 'en-US'. See supported languages |
ssml |
Boolean |
When true, text is treated as SSML markup |
Available when provider: 'openai':
| Option | Type | Description |
|---|---|---|
model |
String |
TTS model. Available: 'gpt-4o-mini-tts' (default), 'tts-1', 'tts-1-hd' |
voice |
String |
Voice ID. Available: 'alloy' (default), 'ash', 'ballad', 'coral', 'echo', 'fable', 'nova', 'onyx', 'sage', 'shimmer' |
response_format |
String |
Output format. Available: 'mp3' (default), 'wav', 'opus', 'aac', 'flac', 'pcm' |
instructions |
String |
Additional guidance for voice style (tone, speed, mood, etc.) |
For more details about each option, see the OpenAI TTS API reference.
Available when provider: 'elevenlabs':
| Option | Type | Description |
|---|---|---|
model |
String |
TTS model. Available: 'eleven_multilingual_v2' (default), 'eleven_flash_v2_5', 'eleven_turbo_v2_5', 'eleven_v3' |
voice |
String |
Voice ID. Defaults to '21m00Tcm4TlvDq8ikWAM' (Rachel sample voice) |
output_format |
String |
Output format. Defaults to 'mp3_44100_128' |
voice_settings |
Object |
Voice tuning options (stability, similarity boost, speed) |
For more details about each option, see the ElevenLabs API reference.
A Promise that resolves to an HTMLAudioElement. The element’s src points at a blob or remote URL containing the synthesized audio.
Convert text to speech (Shorthand)
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<button id="play">Speak!</button>
<script>
document.getElementById('play').addEventListener('click', ()=>{
puter.ai.txt2speech(`Hello world! Puter is pretty amazing, don't you agree?`).then((audio)=>{
audio.play();
});
});
</script>
</body>
</html>
Convert text to speech using options
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<button id="play">Speak with options!</button>
<script>
document.getElementById('play').addEventListener('click', ()=>{
puter.ai.txt2speech(`Hello world! This is using a neural voice.`, {
voice: "Joanna",
engine: "neural",
language: "en-US"
}).then((audio)=>{
audio.play();
});
});
</script>
</body>
</html>
Use OpenAI voices
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<button id="play">Use OpenAI voice</button>
<script>
document.getElementById('play').addEventListener('click', async ()=>{
const audio = await puter.ai.txt2speech(
"Hello! This sample uses the OpenAI alloy voice.",
{
provider: "openai",
model: "gpt-4o-mini-tts",
voice: "alloy",
response_format: "mp3",
instructions: "Sound cheerful but not overly fast."
}
);
audio.play();
});
</script>
</body>
</html>
Use ElevenLabs voices
<html>
<body>
<script src="https://js.puter.com/v2/"></script>
<button id="play">Use ElevenLabs voice</button>
<script>
document.getElementById('play').addEventListener('click', async ()=>{
const audio = await puter.ai.txt2speech(
"Hello! This sample uses an ElevenLabs voice.",
{
provider: "elevenlabs",
model: "eleven_multilingual_v2",
voice: "21m00Tcm4TlvDq8ikWAM",
output_format: "mp3_44100_128"
}
);
audio.play();
});
</script>
</body>
</html>
Compare different engines
<html>
<head>
<style>
body { font-family: Arial, sans-serif; max-width: 600px; margin: 0 auto; padding: 20px; }
textarea { width: 100%; height: 80px; margin: 10px 0; }
button { margin: 5px; padding: 10px 15px; cursor: pointer; }
.status { margin: 10px 0; padding: 5px; font-size: 14px; }
</style>
</head>
<body>
<script src="https://js.puter.com/v2/"></script>
<h1>Text-to-Speech Engine Comparison</h1>
<textarea id="text-input" placeholder="Enter text to convert to speech...">Hello world! This is a test of the text-to-speech engines.</textarea>
<div>
<button onclick="playAudio('standard')">Standard Engine</button>
<button onclick="playAudio('neural')">Neural Engine</button>
<button onclick="playAudio('generative')">Generative Engine</button>
</div>
<div id="status" class="status"></div>
<script>
const textInput = document.getElementById('text-input');
const statusDiv = document.getElementById('status');
async function playAudio(engine) {
const text = textInput.value.trim();
if (!text) {
statusDiv.textContent = 'Please enter some text first!';
return;
}
if (text.length > 3000) {
statusDiv.textContent = 'Text must be less than 3000 characters!';
return;
}
statusDiv.textContent = `Converting with ${engine} engine...`;
try {
const audio = await puter.ai.txt2speech(text, {
voice: "Joanna",
engine: engine,
language: "en-US"
});
statusDiv.textContent = `Playing ${engine} audio`;
audio.play();
} catch (error) {
statusDiv.textContent = `Error: ${error.message}`;
}
}
</script>
</body>
</html>