Jingdong Cloud provides a wide range of AI APIs, all exposed over HTTP, so users can easily integrate Jingdong Cloud's AI capabilities into their own systems. In this article I show how to write a small amount of code that uses the Jingdong Cloud speech synthesis API to read text aloud in a web page. The result has low latency, works on mainstream devices, has a pleasant tone, and can switch between male and female voices.
Final effect
In the end, you open the link in WeChat and tap the play button to have the text read aloud.
Introduction to the API
The Jingdong Cloud AI API follows a RESTful interface style and provides SDKs for Java and Python. The SDK wraps the request parameters and calls the API to fetch the data.
To reduce the response time seen by the caller, the speech synthesis API synthesizes audio in segments. The back-end logic therefore calls the API once per segment and writes the audio data back to the front end as a data stream.
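The segment-by-segment flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the SDK's actual interface: the class name, the `Segment` holder and the `fetch` function are hypothetical stand-ins for "one API call per sequence number". The rule that a negative index marks the last segment is taken from the back-end code in this article.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.function.IntFunction;

class SegmentedStreamSketch {

    /** Hypothetical holder for one synthesized segment; a negative index marks the last one. */
    static final class Segment {
        final int index;
        final String audioBase64;

        Segment(int index, String audioBase64) {
            this.index = index;
            this.audioBase64 = audioBase64;
        }
    }

    /**
     * Requests segments with sequence ids 1, 2, 3, ... and writes each decoded
     * segment to the output as soon as it arrives.
     * Returns the number of segments written.
     */
    static int streamSegments(IntFunction<Segment> fetch, OutputStream out) throws IOException {
        int sequenceId = 1;
        while (true) {
            Segment segment = fetch.apply(sequenceId);
            out.write(Base64.getDecoder().decode(segment.audioBase64));
            out.flush(); // push the data on without waiting for later segments
            if (segment.index < 0) {
                return sequenceId; // last segment reached
            }
            sequenceId++;
        }
    }

    public static void main(String[] args) throws IOException {
        // Fake "API" for the demo: three segments, the third one flagged as last.
        String[] chunks = {"seg1-", "seg2-", "seg3"};
        IntFunction<Segment> fakeApi = id -> new Segment(
                id == chunks.length ? -1 : id,
                Base64.getEncoder().encodeToString(chunks[id - 1].getBytes(StandardCharsets.UTF_8)));

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int count = streamSegments(fakeApi, out);
        System.out.println(count + " segments: " + out.toString("UTF-8"));
    }
}
```

Flushing after every segment is what lets the browser start playing before the whole text has been synthesized.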
Getting AK/SK
To call the Jingdong Cloud API, you need to obtain an AK/SK (access key and secret key) and use it together with the SDK.
Go to Jingdong Cloud Console - Account Management - Access Key Management, then create and obtain an Access Key.
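Once the keys exist, one common way to keep them out of the source code is to read them from the environment at startup. The sketch below only illustrates that idea. The variable names `JDCLOUD_APP_KEY` and `JDCLOUD_SECRET_KEY` and the helper `requireNonBlank` are assumptions made for this example, not names defined by Jingdong Cloud or its SDK.

```java
class CredentialsSketch {

    /** Returns the trimmed value if it is present and non-blank, otherwise fails fast. */
    static String requireNonBlank(String name, String value) {
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing credential: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Hypothetical environment variable names, chosen only for this example.
        String appKey = System.getenv("JDCLOUD_APP_KEY");
        String secretKey = System.getenv("JDCLOUD_SECRET_KEY");
        try {
            requireNonBlank("JDCLOUD_APP_KEY", appKey);
            requireNonBlank("JDCLOUD_SECRET_KEY", secretKey);
            System.out.println("Credentials loaded");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing at startup with a clear message is usually easier to diagnose than an authentication error returned by the first API call.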
Back-end audio stream synthesis
The back-end source code is given below. It implements a controller with a single GET request method, and the parameter-wrapping logic is factored out into separate methods, so the structure of the code is simple and easy to follow. The code uses fastjson to process parameters and references the Jingdong Cloud SDK. Everything else is the JDK's own API, so there are very few dependencies.
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import com.wxapi.WxApiCall.WxApiCall;
import com.wxapi.model.RequestModel;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestHeader;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

@Controller
public class TTSControllerExample {
    // API url, app key and secret key
    private static final String url = "https://aiapi.jdcloud.com/jdai/tts";
    private static final String appKey = "";
    private static final String secretKey = "";

    @GetMapping("/tts/stream/example")
    public void ttsStream(
            @RequestHeader(value = "Range", required = false) String range,
            HttpServletRequest req,
            HttpServletResponse resp) {

        // Safari's first confirmation request carries the header "Range: bytes=0-1";
        // write back 1 byte of data to prevent errors
        if ("bytes=0-1".equals(range)) {
            try {
                byte[] temp = new byte[1]; // one-byte placeholder body
                resp.setHeader("Content-Type", "audio/mp3");
                OutputStream out = resp.getOutputStream();
                out.write(temp);
            } catch (IOException e) {
                e.printStackTrace();
            }
            return;
        }
        // Wrap the input parameters
        Map queryMap = processQueryParam(req);
        String text = req.getParameter("text");
        // Build the request message for the API call
        RequestModel requestModel = getBaseRequestModel(queryMap, text);
        try {
            // Write the audio data back to the front end
            writeTtsStream(resp, requestModel);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /**
     * Wraps the front-end input parameters in a request object for the API call,
     * and sets url, appKey and secretKey.
     * @param queryMap query parameters for the API call
     * @param bodyStr  request body (the text to synthesize)
     * @return the populated request object
     */
    private RequestModel getBaseRequestModel(Map queryMap, String bodyStr) {
        RequestModel requestModel = new RequestModel();
        requestModel.setGwUrl(url);
        requestModel.setAppkey(appKey);
        requestModel.setSecretKey(secretKey);
        requestModel.setQueryParams(queryMap);
        requestModel.setBodyStr(bodyStr);
        return requestModel;
    }

    /**
     * Streaming API calls require an incrementing sequenceId; this method sets
     * the sequenceId on the request object.
     * @param sequenceId   sequence number of the segment to request
     * @param requestModel request object to update
     * @return the same request object
     */
    private RequestModel changeSequenceId(int sequenceId, RequestModel requestModel) {
        requestModel.getQueryParams().put("Sequence-Id", sequenceId);
        return requestModel;
    }

    /**
     * Wraps the parameters of the incoming request as the queryMap of the
     * request object used for the API call.
     * @param req incoming HTTP request
     * @return query parameters for the API call
     */
    private Map processQueryParam(HttpServletRequest req) {
        String reqid = req.getParameter("reqid");
        int tim = Integer.parseInt(req.getParameter("tim"));
        String sp = req.getParameter("sp");

        JSONObject parameters = new JSONObject(8);
        parameters.put("tim", tim);
        parameters.put("sr", 24000);
        parameters.put("sp", sp);
        parameters.put("vol", 2.0);
        parameters.put("tte", 0);
        parameters.put("aue", 3);

        JSONObject property = new JSONObject(4);
        property.put("platform", "Linux");
        property.put("version", "1.0.0");
        property.put("parameters", parameters);

        Map<String, Object> queryMap = new HashMap<>();
        // Access parameters
        queryMap.put("Service-Type", "synthesis");
        queryMap.put("Request-Id", reqid);
        queryMap.put("Protocol", 1);
        queryMap.put("Net-State", 1);
        queryMap.put("Applicator", 1);
        queryMap.put("Property", property.toJSONString());

        return queryMap;
    }

    /**
     * Calls the API in a loop and writes the audio data back to the response object.
     * @param resp         HTTP response to write the audio to
     * @param requestModel request object for the API call
     * @throws IOException if writing to the response fails
     */
    public void writeTtsStream(HttpServletResponse resp, RequestModel requestModel) throws IOException {
        // The sequenceId of the audio segments increases from 1
        int sequenceId = 1;
        changeSequenceId(sequenceId, requestModel);
        // Set the content type of the response header to audio/mp3
        resp.setHeader("Content-Type", "audio/mp3");
        // SDK object that performs the API request
        WxApiCall call = new WxApiCall();
        // Get the output stream used to write the audio
        OutputStream out = resp.getOutputStream();
        call.setModel(requestModel);
        // Parse the returned message to get the status
        String response = call.request();
        JSONObject jsonObject = JSON.parseObject(response);
        JSONObject data = jsonObject.getJSONObject("result");
        // Check the first request; if an error occurred, write a 500 error code back to the front end
        if (data.getIntValue("status") != 0) {
            resp.sendError(500, data.getString("message"));
            return;
        }
        // Push the actual audio data
        String audio = data.getString("audio");
        byte[] part = Base64.getDecoder().decode(audio);
        out.write(part);
        out.flush();
        // Check whether synthesis is finished: each request has its own index,
        // and index < 0 marks the last package
        if (data.getIntValue("index") < 0) {
            return;
        }
        // Loop over the rest of the audio
        while (data.getIntValue("index") >= 0) {
            // Increment the sequenceId
            sequenceId = sequenceId + 1;
            changeSequenceId(sequenceId, requestModel);
            // Request new audio data from the API
            call.setModel(requestModel);
            response = call.request();
            jsonObject = JSON.parseObject(response);
            data = jsonObject.getJSONObject("result");
            audio = data.getString("audio");
            part = Base64.getDecoder().decode(audio);
            // Write back the new audio data
            out.write(part);
            out.flush();
        }
    }
}

Front-end audio playback
The front-end code below is the script part of a Vue single-file component. Because playback uses the HTML5 audio element, howler.js is included for compatibility (npm install howler). The main logic builds a URL from the configured parameters and the text to be read aloud, then calls the howler.js API to play it.

<script>
import {Howl, Howler} from 'howler'
export default {
  data() {
    return {
      news: { // News content
        ......
      },
      role: 1, // 0 female voice, 1 male voice
      speed: 1, // Playback speed
      curIndex: -1, // Position of the playing paragraph among all paragraphs; used only for the UI, not for streaming playback
      sound: null, // The only variable on the page that points to the howler instance
      status: 'empty' // load, pause, stop, empty; used only for the UI, not for streaming playback
    }
  },
  methods: {
    generateUUID () { // Generate a uuid
      let d = Date.now()
      return 'xxxxxxxx-xxxx-4xxx-yxxx-xxxxxxxxxxxx'.replace(/[xy]/g, c => {
        let r = (d + Math.random() * 16) % 16 | 0
        d = Math.floor(d / 16)
        return (c === 'x' ? r : (r & 0x3) | 0x8).toString(16)
      })
    },
    audioSrc (txt) { // Generate the link used to fetch the audio
      // encodeURIComponent also escapes '&', '=' and '#', which would otherwise break the query string
      let content = encodeURIComponent(txt)
      return `http://neuhubdemo.jd.com/api/tts/streamv2?reqid=${
        this.generateUUID() // Request id
      }&text=${
        content // Encoded text content
      }&tim=${
        this.role // Male or female voice
      }&sp=${
        this.speed // Playback speed
      }`
    },
    /**
     * Fetches and plays the streaming audio for a piece of text.
     *
     * Using howler solves the compatibility problems of some mobile browsers (e.g. UC).
     * To solve the compatibility problems of WeChat and Safari on iOS,
     * the back end must additionally handle the request header {range: bytes=0-1}.
     * @param {String} txt Text to be converted to audio
     */
    howlerPlay(txt) {
      if (this.sound) {
        this.sound.unload() // If sound has a value, destroy the original object
      }
      let self = this
      this.status = 'load'
      this.sound = new Howl({
        src: `${this.audioSrc(txt)}`,
        html5: true, // Required! A live stream can only be played through HTML5 Audio.
        format: ['mp3', 'aac'],
        // The onplay, onpause and onend callbacks below only update the UI state
        onplay() {
          self.status = 'pause'
        },
        onpause: function() {
          self.status = 'stop'
        },
        onend: function() {
          self.status = 'stop'
        }
      });
      this.sound.play()
    },
    // Handle user interaction
    play (txt, index) {
      if (this.curIndex === index) {
        if (this.status === 'stop') {
          this.sound.play()
        } else {
          this.sound.pause()
        }
      } else {
        this.curIndex = index
        this.howlerPlay(txt)
      }
    }
  }
}
</script>
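The front-end comments above rely on the back end recognizing the "Range: bytes=0-1" probe. As a rough illustration of what that header means, the sketch below parses a single closed byte range in Java. In HTTP range requests both positions are inclusive, so bytes=0-1 covers two bytes. The controller in this article does not send a real partial response and simply answers the probe with a short placeholder body. The class and method names here are hypothetical, and open-ended ranges such as "bytes=0-" and multi-range headers are deliberately not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RangeHeaderSketch {

    // Matches only a single closed range such as "bytes=0-1".
    private static final Pattern SINGLE_RANGE = Pattern.compile("bytes=(\\d{1,18})-(\\d{1,18})");

    /** Returns {start, end} (both inclusive), or null if the header is absent or not a single closed range. */
    static long[] parse(String header) {
        if (header == null) {
            return null;
        }
        Matcher m = SINGLE_RANGE.matcher(header.trim());
        if (!m.matches()) {
            return null;
        }
        long start = Long.parseLong(m.group(1));
        long end = Long.parseLong(m.group(2));
        return start <= end ? new long[] {start, end} : null;
    }

    /** True for the small probe request described in the article ("Range: bytes=0-1"). */
    static boolean isProbe(String header) {
        long[] range = parse(header);
        return range != null && range[0] == 0 && range[1] == 1;
    }

    public static void main(String[] args) {
        long[] range = parse("bytes=0-1");
        long length = range[1] - range[0] + 1; // inclusive on both ends
        System.out.println("probe=" + isProbe("bytes=0-1") + ", bytes requested=" + length);
    }
}
```

Compared with the exact string comparison used in the controller, parsing the header tolerates surrounding whitespace. Whether Safari or WeChat ever sends such variants is not something this article establishes.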