WeChat mini program text-to-speech (TTS) integration with multiple platforms (iFLYTEK, AISpeech, Baidu)

Keywords: Front-end Spring SDK Session JSON

Mini program features

  1. Text to speech
  2. Multiple platforms and multiple speakers
  3. Adjustable speech speed
  4. Audio download
  5. No advertisements

Mini program code

Online speech synthesis services integrated

  1. AISpeech DUI platform (more than 40 free speakers)
  2. iFLYTEK open platform (5 free speakers)
  3. Baidu Speech (4 free speakers to choose from)

Mini program screenshot

Main server code

// Assumes an Egg.js application, as the this.ctx.service usage suggests
const { Controller } = require('egg')

class TTSController extends Controller {
  async tts () {
    const params = this.ctx.query
    let result = null
    // Call a different service depending on the plat parameter;
    // any value other than 'xf' or 'baidu' falls back to AISpeech
    if (params.plat === 'xf') {
      result = await this.ctx.service.xftts.getTts(params)
    } else if (params.plat === 'baidu') {
      result = await this.ctx.service.baidutts.getTts(params)
    } else {
      result = await this.ctx.service.aispeechtts.getTts(params)
    }
    // Set the response type so that the client receives an audio stream
    this.ctx.response.type = 'audio/mpeg'
    this.ctx.body = result
  }
}

module.exports = TTSController
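
The platform selection in the controller can also be written as a lookup table, which makes the fallback to AISpeech explicit. The following is only a minimal sketch and not the author's code: `PLAT_TO_SERVICE`, `pickTtsService` and `stubServices` are hypothetical names, and the stub object merely stands in for `this.ctx.service`.

```javascript
// Hypothetical sketch: map the `plat` query parameter to a service name.
// The service names mirror the ones used in the controller.
const PLAT_TO_SERVICE = {
  xf: 'xftts',
  baidu: 'baidutts',
  aispeech: 'aispeechtts'
}

// Return the service for `plat`; unknown or missing values fall back
// to AISpeech, just as the final else branch of the controller does.
function pickTtsService (services, plat) {
  const known = Object.prototype.hasOwnProperty.call(PLAT_TO_SERVICE, plat)
  const name = known ? PLAT_TO_SERVICE[plat] : 'aispeechtts'
  return services[name]
}

// Stub services standing in for this.ctx.service (assumption for illustration)
const stubServices = {
  xftts: { getTts: async () => 'xf-audio' },
  baidutts: { getTts: async () => 'baidu-audio' },
  aispeechtts: { getTts: async () => 'aispeech-audio' }
}

console.log(pickTtsService(stubServices, 'baidu') === stubServices.baidutts)
```

Inside the controller this would reduce the branch to something like `pickTtsService(this.ctx.service, params.plat).getTts(params)`.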

Mini program client template code (using mpvue)

<template>
  <div class="container">
    <div class="preview">
      <textarea :class="textAreaFocus? 'focus' : ''" 
      auto-height @focus="bindTextAreaFocus" 
      @blur="bindTextAreaBlur" placeholder="Please enter text" 
      v-model="text"  maxlength="256"/>
    </div>
    <div class="setting">
      <picker @change="bindPlatChange" v-model="platIndex" range-key="name" :range="platArr">
        <div class="item">
          <div class="label">Select platform</div>
          <div class="value voice">
            {{platArr[platIndex].name}}
          </div>
        </div>
      </picker>
      <picker @change="bindPickerChange" v-model="index" range-key="name" :range="array">
        <div class="item">
          <div class="label">Choose speaker</div>
          <div class="value voice">
            {{array[index].name}}
          </div>
        </div>
      </picker>
      <div class="item speed">
        <div class="label">Speech speed</div>
        <div class="value">
          <slider @change="onSpeedChange" :value="speedObj.default" :step='speedObj.step' activeColor="#6F8FFF" :min="speedObj.min" :max="speedObj.max" show-value />
        </div>
      </div>
    </div>
    <div style="height: 140rpx;">
      <div class="btn-group">
        <div class="item"><button @click="audioPlay" type="main">Play synthesized speech</button> </div>
        <div class="item"> <button @click="audioDownload" type="submain">Copy download link</button> </div>
      </div>
    </div>
    <div class="desc">
      Note: TTS is short for text-to-speech.
      <contact-button 
        type="default-light"
        session-from="weapp">Contact customer service
      </contact-button>
    </div>
  </div>
</template>

Script code

<script>
import voiceIdArray from './voiceIdArray'

export default {

  data () {
    return {
      array: voiceIdArray.aispeech,
      platArr: [{id: 'xf', name: 'iFLYTEK'}, {id: 'aispeech', name: 'AISpeech'}, {id: 'baidu', name: 'Baidu'}],
      platIndex: 1,
      index: 26,
      text: `The spring breeze blows all over the land, and the spring breeze blows all over the land.\n The Chinese people are really striving, really striving, the people are really striving.\n The world is crazy. Mice are bridesmaids for cats.\n Qidelong, Qidong strong.\n Zideron's thumping is thumping.`,
      voiceId: 'lili1f_diantai',
      speed: 1,
      textAreaFocus: false,
      audioCtx: null,
      ttsServer: 'https://tts.server.com',
      audioSrc: '',
      downloadUrl: '',
      xfSpeedObj: {
        min: 0,
        max: 100,
        default: 50,
        step: 1
      },
      aispeechSpeedObj: {
        min: 0.7,
        max: 2,
        default: 1,
        step: 0.1
      },
      baiduSpeedObj: {
        min: 0,
        max: 9,
        default: 5,
        step: 1
      },
      speedObj: {}
    }
  },
  watch: {
    platIndex (newVal, oldVal) {
      if (newVal === 2) {
        this.array = voiceIdArray.baidu
        this.index = 0
        this.speedObj = this.baiduSpeedObj
      }
      if (newVal === 1) {
        this.array = voiceIdArray.aispeech
        this.index = 26
        this.speedObj = this.aispeechSpeedObj
      }
      if (newVal === 0) {
        this.array = voiceIdArray.xf
        this.index = 0
        this.speedObj = this.xfSpeedObj
      }
    }
  },
  onShareAppMessage () {
    return {
      title: 'Text to speech service, multiple speakers available'
    }
  },
  methods: {
    onSpeedChange (e) {
      this.speedObj.default = e.target.value
    },
    bindPlatChange (e) {
      this.platIndex = e.target.value * 1
    },
    bindPickerChange (e) {
      this.index = e.target.value
    },
    getAudioSrc () {
      if (this.text === '') {
        return false
      }
      const speed = this.speedObj.default
      const voiceId = this.array[this.index].id
      const plat = this.platArr[this.platIndex].id
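      // Escape the text with encodeURIComponent: encodeURI leaves characters such as & and + untouched
      return `${this.ttsServer}/tts?plat=${plat}&voiceId=${voiceId}&speed=${speed}&text=${encodeURIComponent(this.text)}`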
    },
    getDownloadUrl () {
      const plat = this.platArr[this.platIndex].id
      const voiceId = this.array[this.index].id
      wx.showLoading({
        title: 'Loading'
      })
      wx.request({
        url: `${this.ttsServer}/getdownloadurl`,
        data: {
          plat: plat,
          voiceId: voiceId,
          speed: this.speedObj.default,
          text: this.text
        },
        header: {
          'content-type': 'application/json' // Default value
        },
        success (res) {
          wx.hideLoading()
          wx.setClipboardData({
            data: res.data.short_url,
            success (res) {
              wx.showToast({
                title: 'Link copied. Please download it in a browser (download does not work on iOS)',
                icon: 'none',
                duration: 3000
              })
            }
          })
        },
        fail () {
          // Hide the loading indicator if the request fails
          wx.hideLoading()
        }
      })
    },
    audioPlay () {
      // getAudioSrc returns false when the text is empty, so check before assigning
      const src = this.getAudioSrc()
      if (!src) {
        wx.showToast({
          title: 'Please enter text first',
          icon: 'none',
          duration: 2000
        })
        return false
      }
      this.audioCtx.src = src
      wx.showLoading({
        title: 'Loading'
      })
      this.audioCtx.play()
    },
    audioDownload () {
      this.getDownloadUrl()
    },
    bindTextAreaBlur (e) {
      this.textAreaFocus = false
      this.text = e.target.value
    },
    bindTextAreaFocus () {
      this.textAreaFocus = true
    }
  },

  created () {
    this.speedObj = this.aispeechSpeedObj
  },
  mounted () {
    this.audioCtx = wx.createInnerAudioContext()
    this.audioCtx.onEnded((res) => {
      wx.hideLoading()
    })
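    this.audioCtx.onPlay((res) => {
      wx.hideLoading()
    })
    // Hide the loading indicator if the audio fails to load or play
    this.audioCtx.onError((res) => {
      wx.hideLoading()
    })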
    wx.showShareMenu({
      withShareTicket: true
    })
  }
}
</script>
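
Because the text comes from user input, characters such as `&`, `+` or `?` have to be escaped before they go into the query string, which `encodeURIComponent` does for each value. The following is a minimal sketch of building the `/tts` URL that way. `buildTtsUrl` is a hypothetical helper, and the parameter values are only illustrative.

```javascript
// Hypothetical helper: build the /tts request URL with every key and value escaped.
function buildTtsUrl (server, params) {
  const query = Object.keys(params)
    .map(key => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join('&')
  return `${server}/tts?${query}`
}

// Illustrative values; the voiceId here is a placeholder, not a real speaker ID
const url = buildTtsUrl('https://tts.server.com', {
  plat: 'baidu',
  voiceId: '0',
  speed: 5,
  text: 'Q&A: 1+1=2?'
})
console.log(url)
// https://tts.server.com/tts?plat=baidu&voiceId=0&speed=5&text=Q%26A%3A%201%2B1%3D2%3F
```

Escaping each value separately keeps an `&` inside the text from being read by the server as the start of a new parameter.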

Of the three integrations, Baidu's was the most convenient because it offers an SDK that can be used directly. iFLYTEK's was the most troublesome, since the parameter encryption has to be implemented by hand. AISpeech DUI does not provide an SDK, but its documentation describes the integration steps in detail, so it was also quick and easy.
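
To give an idea of what the hand-written parameter handling for iFLYTEK involves, here is a rough sketch in Node.js. It assumes the header-based checksum scheme of iFLYTEK's REST TTS API from around that time (`X-CheckSum` as the MD5 of API key, timestamp and Base64-encoded parameters). This is an assumption and should be checked against the current iFLYTEK documentation. `buildXfHeaders` and the parameter values are hypothetical and are not taken from the author's `xftts` service.

```javascript
const crypto = require('crypto')

// Build request headers for a checksum-signed call.
// ASSUMPTION: X-CheckSum = MD5(apiKey + curTime + base64(JSON params));
// verify this against the current iFLYTEK documentation before use.
function buildXfHeaders (appId, apiKey, ttsParams, now = Date.now()) {
  // Timestamp in seconds, sent as a string
  const curTime = Math.floor(now / 1000).toString()
  // Synthesis parameters as Base64-encoded JSON
  const param = Buffer.from(JSON.stringify(ttsParams)).toString('base64')
  const checkSum = crypto
    .createHash('md5')
    .update(apiKey + curTime + param)
    .digest('hex')
  return {
    'X-Appid': appId,
    'X-CurTime': curTime,
    'X-Param': param,
    'X-CheckSum': checkSum
  }
}

// Placeholder credentials and parameters, for illustration only
const headers = buildXfHeaders('your-app-id', 'your-api-key', { speed: '50' })
console.log(Object.keys(headers))
```

With an SDK, as in Baidu's case, this step is handled by the library, which is the difference in effort described above.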

One problem remains unsolved: audio cannot be downloaded directly inside the mini program. The only workaround is to provide a link that users open in a browser to download the file themselves (on iPhone there seems to be no solution).

Posted by RW on Sun, 08 Dec 2019 04:19:19 -0800