Creating text-to-speech content

The Yoto Labs Content API lets you create audio content directly from text. Instead of uploading pre-recorded audio files, you provide text, which is converted to speech via the ElevenLabs API and automatically added to your Yoto library.

Create a text-to-speech playlist

Define your content object as usual, but set the track's type to elevenlabs and put the text to be spoken in the trackUrl property:

const chapters = [
  {
    key: 'chapter1',
    title: 'Chapter 1',
    tracks: [
      {
        key: 'track1',
        title: 'The Friendly Dragon',
        trackUrl:
          'Once upon a time, in a magical forest, there lived a friendly dragon who loved to read books.',
        type: 'elevenlabs',
        display: {
          icon16x16: 'yoto:#ZuVmuvnoFiI4el6pBPvq0ofcgQ18HjrCmdPEE7GCnP8',
        },
      },
    ],
    display: {
      icon16x16: 'yoto:#AjJaUh665wfnb72_y5uQ3M0w3JtobwIVfGua_A_j6i8',
    },
  },
];

const content = {
  title: 'My Audio Story',
  content: { chapters },
  metadata: {
    title: 'My Audio Story',
    description: 'A story about a friendly dragon',
  },
};
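
For longer stories, writing each chapter by hand gets repetitive. The following is a minimal sketch, not part of the Labs API, that builds the same chapter structure from plain title and text pairs. The helper name buildTtsChapters and the one-track-per-chapter layout are assumptions made for illustration, and the display icons are left out for brevity.

```javascript
// Hypothetical helper (not part of the Labs API): turns { title, text } pairs
// into chapters, each holding a single text-to-speech track.
function buildTtsChapters(sections) {
  return sections.map((section, index) => {
    const n = index + 1;
    return {
      key: `chapter${n}`,
      title: section.title,
      tracks: [
        {
          key: `track${n}`,
          title: section.title,
          // For elevenlabs tracks, trackUrl carries the text to be spoken.
          trackUrl: section.text,
          type: 'elevenlabs',
        },
      ],
    };
  });
}

const storyChapters = buildTtsChapters([
  { title: 'The Friendly Dragon', text: 'Once upon a time, there lived a friendly dragon.' },
  { title: 'The Library', text: 'Every morning, the dragon flew to the village library.' },
]);
```

The resulting array can be passed as the chapters value of the content object.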

Send your content to the Labs API

const response = await fetch(
  'https://labs.api.yotoplay.com/content/job?voiceId=JBFqnCBsd6RMkjVDRZzb',
  {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${accessToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(content),
  }
);

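// Fail loudly on HTTP errors instead of assuming the job was created
if (!response.ok) {
  throw new Error(`Job request failed with status ${response.status}`);
}

const { job } = await response.json();
console.log('Text-to-speech job created successfully!');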

This creates an asynchronous processing job. The job object contains:

jobId: A unique identifier for tracking the job
status: The current status (queued, processing, completed, or failed)
progress: Object showing total, completed, and failed track counts

You can check the job's status using its jobId. Once the job completes, the content is created automatically in your Yoto library.
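
Waiting for a job to finish can be sketched as a simple polling loop. This guide does not show the status endpoint, so the sketch below takes a caller-supplied fetchJob function that is assumed to return the current job object for a jobId. The names waitForJob and fetchJob, the polling interval, and the attempt limit are all hypothetical choices; only the status values come from the list above.

```javascript
// Hypothetical helper: polls until the job reaches a terminal status.
// `fetchJob` is supplied by the caller and must resolve to the current job
// object ({ jobId, status, progress }) for the given jobId.
async function waitForJob(fetchJob, jobId, { intervalMs = 2000, maxAttempts = 30 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await fetchJob(jobId);

    if (job.status === 'completed') {
      return job;
    }
    if (job.status === 'failed') {
      throw new Error(`Job ${jobId} failed`);
    }

    // Still queued or processing: wait before asking again.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} attempts`);
}
```

Passing the fetch logic in as a function keeps the loop independent of how the status is actually retrieved, and makes it easy to exercise with a stub.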

Voice IDs

You can set a voice ID in two ways:

Query parameter: Add ?voiceId={elevenLabsVoiceId} to the request URL to set a default voice for all tracks
Per track: Set the voiceId property on an individual track to override the default
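
The two options can be combined, as in the sketch below. The default voice ID is the one used in the request example in this guide; 'your-dragon-voice-id' is a placeholder rather than a real ElevenLabs voice, and the helper name buildJobUrl is an assumption made for illustration.

```javascript
// Voice ID taken from the request example in this guide.
const DEFAULT_VOICE_ID = 'JBFqnCBsd6RMkjVDRZzb';

// Hypothetical helper: builds the job URL with the default voice as a query parameter.
function buildJobUrl(voiceId) {
  const url = new URL('https://labs.api.yotoplay.com/content/job');
  url.searchParams.set('voiceId', voiceId);
  return url.toString();
}

const tracks = [
  {
    // No voiceId here, so this track uses the default from the query parameter.
    key: 'track1',
    title: 'Narrator',
    trackUrl: 'Once upon a time, in a magical forest, there lived a friendly dragon.',
    type: 'elevenlabs',
  },
  {
    // Per-track override; replace the placeholder with a real voice ID.
    key: 'track2',
    title: 'Dragon',
    trackUrl: 'Hello! Would you like to read a book with me?',
    type: 'elevenlabs',
    voiceId: 'your-dragon-voice-id',
  },
];

const jobUrl = buildJobUrl(DEFAULT_VOICE_ID);
```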

Updating existing content

To update an existing card, add the cardId property to your content object:

const content = {
  title: 'My Audio Story',
  content: { chapters },
  metadata: {
    title: 'My Audio Story',
    description: 'A story about a friendly dragon',
  },
};

// Specify a cardId to update an existing card
content.cardId = 'your-card-id-here';

Then submit it to the same endpoint. The Labs API will update the existing card instead of creating a new one.

Here’s the complete code:

export const createTextToSpeechPlaylist = async ({
  title = 'My Audio Story',
  accessToken,
  cardId,
}) => {
  const chapters = [
    {
      key: 'chapter1',
      title: 'Chapter 1',
      tracks: [
        {
          key: 'track1',
          title: 'The Friendly Dragon',
          trackUrl:
            'Once upon a time, in a magical forest, there lived a friendly dragon who loved to read books.',
          type: 'elevenlabs',
          display: {
            icon16x16: 'yoto:#ZuVmuvnoFiI4el6pBPvq0ofcgQ18HjrCmdPEE7GCnP8',
          },
        },
      ],
      display: {
        icon16x16: 'yoto:#AjJaUh665wfnb72_y5uQ3M0w3JtobwIVfGua_A_j6i8',
      },
    },
  ];

  const content = {
    title,
    content: { chapters },
    metadata: {
      title,
      description: 'A story about a friendly dragon',
    },
  };

  if (cardId) {
    // specify a cardId to update an existing card
    content.cardId = cardId;
  }

  const jobResponse = await fetch(
    'https://labs.api.yotoplay.com/content/job?voiceId=JBFqnCBsd6RMkjVDRZzb',
    {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${accessToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(content),
    }
  );

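  // Fail loudly on HTTP errors instead of assuming the job was created
  if (!jobResponse.ok) {
    throw new Error(`Job request failed with status ${jobResponse.status}`);
  }

  const { job } = await jobResponse.json();

  console.log('Text-to-speech job created successfully!');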

  return job;
};