TextToSpeechLongAudioSynthesizeClient AudioConfig does not honour `speakingRate` nor any settings #4148

nickaws · 2023-04-03T04:37:03Z

Environment details

which product (packages/*): @google-cloud/text-to-speech
OS: Ventura 13.0.1
Node.js version: v19.8.1
npm version: 9.5.1
google-cloud-node version: 4.2.1

Steps to reproduce

// Copyright 2023 Google LLC
(from google examples)


'use strict';

function main() {

    const input = { text: "Hey there! I'd be delighted to share with you the latest advancements in car navigation systems. There's so much to talk about, so buckle up and let's get started!" }
    /**
     *  Required. The configuration of the synthesized audio.
     */
    const audioConfig = {
        audioEncoding: "MP3",
        speakingRate: "0.5"
    }
    /**
     *  Specifies a Cloud Storage URI for the synthesis results. Must be
     *  specified in the format: `gs://bucket_name/object_name`, and the bucket
     *  must already exist.
     */
    // const outputGcsUri = 'abc123'
    /**
     *  The desired voice of the synthesized audio.
     */
    const voice = {
        name: "en-US-Wavenet-J",
        languageCode: 'en-US'
    }

    // Imports the Texttospeech library
    const { TextToSpeechLongAudioSynthesizeClient } = require('@google-cloud/text-to-speech').v1;

    // Instantiates a client
    const texttospeechClient = new TextToSpeechLongAudioSynthesizeClient();

    async function callSynthesizeLongAudio() {
        // Construct request
        const request = {
            voice,
            input,
            audioConfig,
            outputGcsUri: "gs://somebucket/file.mp3"
        };

        // Run request
        const [operation] = await texttospeechClient.synthesizeLongAudio(request);
        const [response] = await operation.promise();
        console.log(response);
    }

    callSynthesizeLongAudio();
}

process.on('unhandledRejection', err => {
    console.error(err.message);
    process.exitCode = 1;
});
main();

Neither speaking rate nor pitch is honoured.

danielbankhead · 2023-04-08T02:27:04Z

Hey @nickaws,

speakingRate and pitch should be numbers, not strings:

google-cloud-node/packages/google-cloud-texttospeech/protos/protos.d.ts

Lines 786 to 790 in 94ea195

    /** AudioConfig speakingRate. */
    public speakingRate: number;

    /** AudioConfig pitch. */
    public pitch: number;
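
To illustrate the point about types, here is a minimal sketch of a guard that converts string values such as "0.5" to numbers before the request is built. `normalizeAudioConfig` and `NUMERIC_FIELDS` are hypothetical names for this example and are not part of @google-cloud/text-to-speech.

```javascript
'use strict';

// Hypothetical helper (not part of the client library): makes sure the
// numeric AudioConfig fields are real numbers rather than strings.
const NUMERIC_FIELDS = ['speakingRate', 'pitch'];

function normalizeAudioConfig(audioConfig) {
  const normalized = { ...audioConfig };
  for (const field of NUMERIC_FIELDS) {
    const raw = normalized[field];
    if (raw === undefined) continue; // field not set, nothing to check

    // Accept numbers as-is and convert non-empty numeric strings.
    const isNonEmptyString = typeof raw === 'string' && raw.trim() !== '';
    const value =
      typeof raw === 'number' ? raw : isNonEmptyString ? Number(raw) : NaN;

    if (!Number.isFinite(value)) {
      throw new TypeError(
        `${field} must be a finite number, got ${JSON.stringify(raw)}`
      );
    }
    normalized[field] = value;
  }
  return normalized;
}

// The config from the original report, with speakingRate given as a string.
const fixedConfig = normalizeAudioConfig({
  audioEncoding: 'MP3',
  speakingRate: '0.5',
});
console.log(fixedConfig); // { audioEncoding: 'MP3', speakingRate: 0.5 }
```

Throwing on values that cannot be converted surfaces the mistake locally instead of leaving it to the API to reject or ignore the field.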

nickaws · 2023-04-10T16:20:24Z

Hi @danielbankhead! This actually made no difference, and it even throws "Request contains an invalid argument." with the code above (which worked sometimes).

danielbankhead · 2023-04-10T17:33:52Z

@nickaws to clarify, the request should look like this:

const [operation] = await texttospeechClient.synthesizeLongAudio({
  voice: {
    name: "en-US-Wavenet-J",
    languageCode: 'en-US'
  },
  input,
  audioConfig: {
    audioEncoding: "MP3",
    speakingRate: 0.5,
    // pitch: 1,
  },
  outputGcsUri: "gs://somebucket/file.mp3"
});

Does this request not work for you?

> which worked sometimes

I'm not sure how a request parameter would fail sometimes; are there any additional details here?
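
One way to gather such details is to log the status code, the error details, and the exact request for every failed call, so that failing and succeeding runs can be compared. The sketch below is only an illustration: `describeCallFailure` and `synthesizeWithDiagnostics` are hypothetical helpers, and `client` is assumed to be a `TextToSpeechLongAudioSynthesizeClient` instance.

```javascript
'use strict';

// A small subset of gRPC status codes, used only for readable logging.
const GRPC_STATUS_NAMES = {
  3: 'INVALID_ARGUMENT',
  5: 'NOT_FOUND',
  7: 'PERMISSION_DENIED',
  14: 'UNAVAILABLE',
};

// Hypothetical helper: summarizes a failed call together with the request
// fields most likely to matter when comparing runs.
function describeCallFailure(err, request) {
  return {
    code: err.code,
    status: GRPC_STATUS_NAMES[err.code] || 'NOT_IN_THIS_TABLE',
    details: err.details || err.message,
    outputGcsUri: request.outputGcsUri,
    audioConfig: request.audioConfig,
    loggedAt: new Date().toISOString(),
  };
}

// Hypothetical wrapper around the long-running call used in this thread.
// It logs a summary of any failure and then rethrows the original error.
async function synthesizeWithDiagnostics(client, request) {
  try {
    const [operation] = await client.synthesizeLongAudio(request);
    const [response] = await operation.promise();
    return response;
  } catch (err) {
    console.error(JSON.stringify(describeCallFailure(err, request), null, 2));
    throw err;
  }
}
```

Running the same request several times through such a wrapper would show whether the failures correlate with a particular output name or config value.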

nickolivera · 2023-04-10T17:44:12Z

@danielbankhead I am very sorry for the cryptic message.

The code I posted when I opened the ticket generated an mp3 but crashed with "Request contains an invalid argument." Right now, the same code with the new request works only sometimes (it does generate the mp3, but only sometimes; I am varying the name of the mp3 file, of course) and sometimes fails with the error below. This is using the latest SDK.

[
  {
    voice: { name: 'en-US-Wavenet-J', languageCode: 'en-US' },
    input: {
      text: "Hey there! I'd be delighted to share with you the latest advancements in car navigation systems. There's so much to talk about, so buckle up and let's get started!"
    },
    audioConfig: { audioEncoding: 'MP3', speakingRate: 0.5 },
    outputGcsUri: 'gs://xxx/file2.mp3'
  },
  Metadata {
    internalRepr: Map(2) {
      'x-goog-api-client' => [Array],
      'x-goog-request-params' => [Array]
    },
    options: {}
  },
  { deadline: 2023-04-10T19:06:05.200Z },
  [Function (anonymous)]
]
3 INVALID_ARGUMENT: Request contains an invalid argument.

    at callErrorFromStatus (/Users/nick/Development/text-to-speech-node/node_modules/@grpc/grpc-js/build/src/call.js:31:19)
    at Object.onReceiveStatus (/Users/nick/Development/text-to-speech-node/node_modules/@grpc/grpc-js/build/src/client.js:192:76)
    at Object.onReceiveStatus (/Users/nick/Development/text-to-speech-node/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:360:141)
    at Object.onReceiveStatus (/Users/nick/Development/text-to-speech-node/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:323:181)
    at /Users/nick/Development/text-to-speech-node/node_modules/@grpc/grpc-js/build/src/resolving-call.js:94:78
    at process.processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
    at ServiceClientImpl.makeUnaryRequest (/Users/nick/Development/text-to-speech-node/node_modules/@grpc/grpc-js/build/src/client.js:160:34)
    at ServiceClientImpl.<anonymous> (/Users/nick/Development/text-to-speech-node/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)
    at /Users/nick/Development/text-to-speech-node/node_modules/@google-cloud/text-to-speech/build/src/v1/text_to_speech_long_audio_synthesize_client.js:200:29
    at /Users/nick/Development/text-to-speech-node/node_modules/google-gax/build/src/normalCalls/timeout.js:44:16
    at LongrunningApiCaller._wrapOperation (/Users/nick/Development/text-to-speech-node/node_modules/google-gax/build/src/longRunningCalls/longRunningApiCaller.js:55:16)
    at /Users/nick/Development/text-to-speech-node/node_modules/google-gax/build/src/longRunningCalls/longRunningApiCaller.js:46:25
    at OngoingCall.call (/Users/nick/Development/text-to-speech-node/node_modules/google-gax/build/src/call.js:67:27)
    at LongrunningApiCaller.call (/Users/nick/Development/text-to-speech-node/node_modules/google-gax/build/src/longRunningCalls/longRunningApiCaller.js:45:19)
    at /Users/nick/Development/text-to-speech-node/node_modules/google-gax/build/src/createApiCall.js:84:30
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
  code: 3,
  details: 'Request contains an invalid argument.',
  metadata: Metadata { internalRepr: Map(0) {}, options: {} }

We were also looking at https://github.com/googleapis/google-api-nodejs-client/blob/main/discovery/texttospeech-v1.json#L175 and it seems there's a parameter missing, but the funny thing is that sometimes it does generate the mp3.
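
If the missing parameter is the `parent` field of the long audio request (an assumption; this thread does not confirm that it explains the failures), a request that includes it could be assembled as sketched here. `buildLongAudioRequest` is a hypothetical helper, and the project, location, and bucket values are placeholders.

```javascript
'use strict';

// Hypothetical helper: assembles a long audio synthesis request that also
// carries a `parent` of the form projects/*/locations/*.
function buildLongAudioRequest({
  projectId,
  location,
  text,
  voice,
  audioConfig,
  outputGcsUri,
}) {
  // The output URI must have the form gs://bucket_name/object_name.
  if (!/^gs:\/\/[^/]+\/.+/.test(outputGcsUri)) {
    throw new Error(
      `outputGcsUri must look like gs://bucket_name/object_name, got ${outputGcsUri}`
    );
  }
  return {
    parent: `projects/${projectId}/locations/${location}`,
    input: { text },
    voice,
    audioConfig,
    outputGcsUri,
  };
}

// Placeholder values for illustration only.
const exampleRequest = buildLongAudioRequest({
  projectId: 'my-project',
  location: 'my-location',
  text: 'Hey there!',
  voice: { name: 'en-US-Wavenet-J', languageCode: 'en-US' },
  audioConfig: { audioEncoding: 'MP3', speakingRate: 0.5 },
  outputGcsUri: 'gs://somebucket/file.mp3',
});
console.log(exampleRequest.parent); // projects/my-project/locations/my-location
```

The resulting object can be passed to `synthesizeLongAudio` in place of the inline request used in the code below.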

Full code

nick@nick text-to-speech-node % node -v
v19.8.1
nick@nick text-to-speech-node % cat package.json| grep speech
"@google-cloud/text-to-speech": "^4.2.1"


'use strict';

function main() {

    const input = { text: "Hey there! I'd be delighted to share with you the latest advancements in car navigation systems. There's so much to talk about, so buckle up and let's get started!" }

    const audioConfig = {
        audioEncoding: "MP3",
        // speakingRate: 1.0
    }

    const voice = {
        name: "en-US-Wavenet-J",
        languageCode: 'en-US'
    }

    // Imports the Texttospeech library
    const { TextToSpeechLongAudioSynthesizeClient } = require('@google-cloud/text-to-speech').v1;

    // Instantiates a client
    const texttospeechClient = new TextToSpeechLongAudioSynthesizeClient();

    async function callSynthesizeLongAudio() {
        // Construct request
        const request = {

            voice,
            input,
            audioConfig,
            outputGcsUri: "gs://xxx/file32222.mp3"
        };


        // Run request
        // const [operation] = await texttospeechClient.synthesizeLongAudio(request);
        const [operation] = await texttospeechClient.synthesizeLongAudio({
            voice: {
                name: "en-US-Wavenet-J",
                languageCode: 'en-US'
            },
            input,
            audioConfig: {
                audioEncoding: "MP3",
                speakingRate: 0.5,
                // pitch: 1,
            },
            outputGcsUri: "gs://xxx/filexxxxxxxx2.mp3"
        });


        const [response] = await operation.promise();
        console.log(response);
    }

    callSynthesizeLongAudio();
}

process.on('unhandledRejection', err => {
    console.error(err.message);
    process.exitCode = 1;
});
main();

danielbankhead self-assigned this Apr 8, 2023

danielbankhead added type: question Request for information or clarification. Not an issue. priority: p3 Desirable enhancement or fix. May not be included in next release. labels Apr 8, 2023

danielbankhead closed this as completed Apr 8, 2023

danielbankhead assigned sofisl and unassigned danielbankhead Apr 10, 2023
