puter.ai.speech2speech()

Convert an existing recording into another voice while preserving its timing, pacing, and delivery. This helper wraps the ElevenLabs voice changer endpoint and accepts audio from Puter file paths, remote URLs, data URLs, or in-memory File and Blob objects.

Syntax

puter.ai.speech2speech(source)
puter.ai.speech2speech(source, options)
puter.ai.speech2speech(options)
puter.ai.speech2speech(source, testMode)

Parameters

source (String | File | Blob) (required unless provided in options)

Audio to convert. Accepts:

  • A Puter path such as ~/recordings/line-read.wav
  • A File or Blob (converted to data URL automatically)
  • A data URL (data:audio/wav;base64,...)
  • A remote HTTPS URL
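As a rough illustration of the data URL form, the sketch below encodes raw audio bytes that are already held in memory. The helper name toAudioDataUrl and its default MIME type are assumptions made for this example and are not part of Puter.js; btoa is the standard base64 encoder available in browsers and recent Node.js versions.

```javascript
// Hypothetical helper (not part of Puter.js): encode raw bytes as a data URL.
function toAudioDataUrl(bytes, mimeType = 'audio/wav') {
    let binary = '';
    // Build a binary string one byte at a time; adequate for short clips.
    for (const byte of bytes) {
        binary += String.fromCharCode(byte);
    }
    return `data:${mimeType};base64,${btoa(binary)}`;
}

// Usage in a page that has loaded Puter.js (assumes `wavBytes` is a Uint8Array):
// const audio = await puter.ai.speech2speech(toAudioDataUrl(wavBytes));
```

If you already hold a File or Blob, passing it directly is simpler, since the conversion to a data URL happens automatically.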

options (Object) (optional)

Fine-tune the conversion:

  • audio / file (String | File | Blob): Alternate way to provide the source input.
  • voice / voiceId / voice_id (String): Target ElevenLabs voice ID. Defaults to the configured ElevenLabs voice (Rachel sample if unset).
  • model / modelId / model_id (String): Voice-changer model. Defaults to eleven_multilingual_sts_v2. You can also use eleven_english_sts_v2 for English-only inputs.
  • output_format / outputFormat (String): Desired output codec and bitrate, e.g. mp3_44100_128, opus_48000_64, or pcm_48000. Defaults to mp3_44100_128.
  • voice_settings / voiceSettings (Object|String): ElevenLabs voice settings payload (e.g. {"stability":0.5,"similarity_boost":0.75}).
  • seed (Number): Randomization seed for deterministic outputs.
  • remove_background_noise / removeBackgroundNoise (Boolean): Apply background noise removal.
  • file_format / fileFormat (String): Input file format hint (e.g. pcm_s16le_16) for raw PCM streams.
  • optimize_streaming_latency / optimizeStreamingLatency (Number): Latency optimization level (0–4) forwarded to ElevenLabs.
  • enable_logging / enableLogging (Boolean): Forwarded to ElevenLabs to toggle zero-retention logging behavior.
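To show how these options combine, here is a minimal sketch of a helper that assembles an options object suitable for the options-only call form. The function name buildVoiceChangeOptions, its parameter names, and its validation rules (including treating the latency level as a whole number) are assumptions made for illustration; only the option keys and defaults come from the list above.

```javascript
// Hypothetical helper: assemble an options object from the keys listed above.
function buildVoiceChangeOptions({ audio, voice, stability, similarityBoost, seed, latency } = {}) {
    const options = {
        model: 'eleven_multilingual_sts_v2', // documented default, set explicitly
        output_format: 'mp3_44100_128',      // documented default, set explicitly
    };
    if (audio !== undefined) options.audio = audio;
    if (voice !== undefined) options.voice = voice;
    if (seed !== undefined) options.seed = seed;
    if (stability !== undefined || similarityBoost !== undefined) {
        options.voice_settings = {};
        if (stability !== undefined) options.voice_settings.stability = stability;
        if (similarityBoost !== undefined) options.voice_settings.similarity_boost = similarityBoost;
    }
    if (latency !== undefined) {
        // Documented range is 0 to 4; this sketch assumes whole numbers only.
        if (!Number.isInteger(latency) || latency < 0 || latency > 4) {
            throw new RangeError('latency must be an integer from 0 to 4');
        }
        options.optimize_streaming_latency = latency;
    }
    return options;
}

// Usage in a page that has loaded Puter.js (options-only form, source given as `audio`):
// const audio = await puter.ai.speech2speech(buildVoiceChangeOptions({
//     audio: '~/recordings/line-read.wav',
//     voice: '21m00Tcm4TlvDq8ikWAM',
//     stability: 0.5,
//     similarityBoost: 0.75,
// }));
```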

testMode (Boolean) (optional)

When true, skips the live API call and returns a sample audio clip so you can build UI without spending credits.
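One way to keep test mode out of production code paths is a small wrapper that switches between the two documented call forms. This is only a sketch: the function name convertOrPreview and the useTestMode flag are hypothetical, and the ai argument is assumed to be puter.ai.

```javascript
// Hypothetical wrapper: `ai` is expected to be `puter.ai`; `useTestMode` is your own flag.
async function convertOrPreview(ai, source, options, useTestMode) {
    if (useTestMode) {
        // Documented form speech2speech(source, testMode): returns a sample clip.
        return ai.speech2speech(source, true);
    }
    // Documented form speech2speech(source, options): performs the live conversion.
    return ai.speech2speech(source, options);
}

// Usage in a page that has loaded Puter.js:
// const audio = await convertOrPreview(puter.ai, '~/any-file.wav', { voice: '21m00Tcm4TlvDq8ikWAM' }, true);
```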

Return value

A Promise that resolves to an HTMLAudioElement. Call audio.play() or use the element’s src URL to work with the generated voice clip.
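Because both the conversion and playback are asynchronous and can fail, it may help to handle them together. The following is a minimal sketch under stated assumptions: the helper name convertAndPlay is hypothetical, ai is assumed to be puter.ai, and returning null on failure is a choice made for this example rather than library behavior.

```javascript
// Hypothetical helper: `ai` is expected to be `puter.ai`.
async function convertAndPlay(ai, source, options) {
    try {
        const audio = await ai.speech2speech(source, options);
        // play() returns a Promise in modern browsers and can reject,
        // for example when autoplay is blocked before a user gesture.
        await audio.play();
        return audio.src;
    } catch (err) {
        console.error('Voice conversion or playback failed:', err);
        return null;
    }
}

// Usage in a page that has loaded Puter.js:
// const src = await convertAndPlay(puter.ai, '~/recordings/line-read.wav', { voice: '21m00Tcm4TlvDq8ikWAM' });
```

Triggering the call from a click handler, as in the file example below, generally avoids autoplay restrictions.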

Examples

Change the voice of a sample clip

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        (async () => {
            const sampleUrl = 'https://puter-sample-data.puter.site/tts_example.mp3';
            const audio = await puter.ai.speech2speech(sampleUrl, {
                voice: '21m00Tcm4TlvDq8ikWAM',
                outputFormat: 'opus_48000_64',
            });
            audio.play();
        })();
    </script>
</body>
</html>

Convert a recording stored as a file

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <input type="file" id="input" accept="audio/*" />
    <button id="convert">Change voice</button>
    <audio id="player" controls></audio>
    <script>
        document.getElementById('convert').onclick = async () => {
            const file = document.getElementById('input').files[0];
            if (!file) return alert('Pick an audio file first.');

            const audio = await puter.ai.speech2speech(file, {
                voice: '21m00Tcm4TlvDq8ikWAM', // Rachel sample voice
                model: 'eleven_multilingual_sts_v2',
                output_format: 'mp3_44100_128',
                removeBackgroundNoise: true,
            });

            // Route playback through the visible player rather than the detached element.
            const player = document.getElementById('player');
            player.src = audio.src;
            player.play();
        };
    </script>
</body>
</html>

Develop with test mode

<html>
<body>
    <script src="https://js.puter.com/v2/"></script>
    <script>
        (async () => {
            const preview = await puter.ai.speech2speech('~/any-file.wav', true);
            console.log('Sample audio URL:', preview.src);
        })();
    </script>
</body>
</html>