Background
I have always been interested in speech synthesis, and I have long wanted to synthesize some content for myself, for example turning novels or my downloaded eBooks into audio I can listen to.
Speech synthesis system
Strictly speaking, this is a tool built on top of speech synthesis. Many vendors now expose speech synthesis through APIs, so the development difficulty is greatly reduced: a few API calls are enough to build your own speech synthesis tool. As the saying goes, a sparrow may be small, but it has all its organs. Viewed more broadly, this is a small speech synthesis system.
Preparation
First, we need the following:
- Anaconda
- Python 3.7
- Visual Studio Code
Steps
Here we use the WebAPI of the Xunfei (iFLYTEK) Open Platform.
First, go to the console and create an application.
Once it is created, click the application to open its details page.
Click Speech Synthesis on the left, then open Online Speech Synthesis (streaming version) one level down.
In the upper right, we need to get 3 things:
- APPID
- APISecret
- APIKey
Code implementation
Okay, let's move on to the code implementation. First, we'll install the two libraries we need.
    pip install websocket-client
    pip install playsound
Next, we define a class named play with four methods:
    class play:
        def __init__(self):          # Initialization
            pass

        def play_sound(self):        # Play the generated audio
            pass

        def select_vcn(self, *arg):  # Drop-down selection: set the voice
            pass

        def xfyun_tts(self):         # Perform speech synthesis
            pass
Here, you need to fill in the appid, appkey, and appsecret that you just obtained from the Xunfei Open Platform console.
    import tkinter as tk
    from tkinter import ttk

    class play:
        def __init__(self):
            self.APP_ID = 'xxx'      # Fill in your APPID
            self.API_KEY = 'xxx'     # Fill in your APIKey
            self.SECRET_KEY = 'xxx'  # Fill in your APISecret

            # Widget attribute names (root, lb, tt, cb) are illustrative
            self.root = tk.Tk()                          # Create the main window
            self.root.title("Speech Synthesis System")   # Window title
            self.root.geometry("600x550")                # Window size
            self.root.resizable(0, 0)                    # Fixed window size
            # self.root.resizable(width=True, height=True)  # Allow resizing (both default to True)

            self.lb = tk.Label(self.root, text='Please select a voice')  # Label
            self.tt = tk.Text(self.root, width=77, height=30)            # Multi-line text box
            self.cb = ttk.Combobox(self.root, width=12)                  # Drop-down list box
            # Contents of the drop-down list box
            self.cb['values'] = ("Sweet Female Voice - Xiao Yan",
                                 "Kindly Male Voice - Hsu Ku.",
                                 "Knowledgeable Female Voice - Pimmi",
                                 "Lovely Children's Voices - Xiao Bao Xu",
                                 "Kindly Female Voice - Julia Jr.")
            self.cb.current(0)  # Select the first item by default
            self.cb.bind("<<ComboboxSelected>>", self.select_vcn)

            self.tk_tts_file = tk.Label(self.root, text='Generate filename')
            self.b1 = tk.Button(self.root, text='Perform speech synthesis', width=10, height=1,
                                command=self.xfyun_tts)   # Button
            self.tk_play = tk.Button(self.root, text='Play', width=10, height=1,
                                     command=self.play_sound)  # Button

            # Positions of the individual widgets
            self.tk_tts_file.place(x=30, y=500)
            self.b1.place(x=300, y=500)
            self.tk_play.place(x=400, y=500)
            self.lb.place(x=30, y=30)
            self.cb.place(x=154, y=30)
            self.tt.place(x=30, y=60)
            self.root.mainloop()
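When an item is selected in the drop-down list, set the corresponding voice. The strings compared here must match the entries of the drop-down list exactly:
        def select_vcn(self, *arg):
            # self.cb is the drop-down list box; self.vcn holds the selected voice ID
            if self.cb.get() == "Sweet Female Voice - Xiao Yan":
                self.vcn = "xiaoyan"
            elif self.cb.get() == "Kindly Male Voice - Hsu Ku.":
                self.vcn = "aisjiuxu"
            elif self.cb.get() == "Knowledgeable Female Voice - Pimmi":
                self.vcn = "aisxping"
            elif self.cb.get() == "Lovely Children's Voices - Xiao Bao Xu":
                self.vcn = "aisbabyxu"
            elif self.cb.get() == "Kindly Female Voice - Julia Jr.":
                self.vcn = "aisjinger"
            print(self.vcn)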
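The if/elif chain can also be written as a dictionary lookup. The following is a minimal sketch rather than the article's own code: `VOICE_IDS` and `label_to_vcn` are hypothetical names, the labels are copied from the drop-down list, and falling back to "xiaoyan" for an unknown label is an assumption.

```python
# Hypothetical helper: map drop-down labels to Xunfei voice IDs (vcn).
VOICE_IDS = {
    "Sweet Female Voice - Xiao Yan": "xiaoyan",
    "Kindly Male Voice - Hsu Ku.": "aisjiuxu",
    "Knowledgeable Female Voice - Pimmi": "aisxping",
    "Lovely Children's Voices - Xiao Bao Xu": "aisbabyxu",
    "Kindly Female Voice - Julia Jr.": "aisjinger",
}


def label_to_vcn(label, default="xiaoyan"):
    """Return the voice ID for a label, or the default if the label is unknown."""
    return VOICE_IDS.get(label, default)


if __name__ == "__main__":
    print(label_to_vcn("Kindly Male Voice - Hsu Ku."))
```

With such a table, the body of select_vcn would reduce to a single lookup on the selected label, and adding a voice means adding one dictionary entry.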
Next, we modify the Python demo provided by Xunfei to make it easier to use:
    # -*- coding:utf-8 -*-
    #
    # author: iflytek
    #
    # This demo was tested on Windows + Python 3.7.
    # Third-party libraries and versions installed when this demo ran successfully:
    #   cffi==1.12.3
    #   gevent==1.4.0
    #   greenlet==0.4.15
    #   pycparser==2.19
    #   six==1.12.0
    #   websocket==0.2.1
    #   websocket-client==0.56.0
    # To synthesize minority languages, send text in that language, use a matching
    # speaker (vcn), set tte=unicode, and change the text encoding accordingly.
    # Error code link: /document/error-code (be sure to check it when an error code is returned)
    # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #
    import websocket
    import hashlib
    import base64
    import hmac
    import json
    from urllib.parse import urlencode
    import time
    import ssl
    from wsgiref.handlers import format_date_time
    from datetime import datetime
    from time import mktime
    import _thread as thread
    import os
    import wave

    STATUS_FIRST_FRAME = 0     # First frame
    STATUS_CONTINUE_FRAME = 1  # Intermediate frame
    STATUS_LAST_FRAME = 2      # Last frame

    # Intermediate PCM file. This must be a file path, not a directory;
    # the name demo.pcm is a placeholder.
    PCM_PATH = "./demo.pcm"


    class Ws_Param(object):
        # Initialization
        def __init__(self):
            pass

        def set_tts_params(self, text, vcn):
            if text != "":
                self.Text = text
            if vcn != "":
                self.vcn = vcn
            # Business parameters (business); more options are listed on the official website
            self.BusinessArgs = {"bgs": 1, "aue": "raw", "auf": "audio/L16;rate=16000",
                                 "vcn": self.vcn, "tte": "utf8"}
            # For minority languages use the following line instead, where "unicode"
            # means little-endian UTF-16, i.e. "UTF-16LE"
            # self.Data = {"status": 2, "text": str(base64.b64encode(self.Text.encode('utf-16')), "UTF8")}
            self.Data = {"status": 2, "text": str(base64.b64encode(self.Text.encode('utf-8')), "UTF8")}

        def set_params(self, appid, apiSecret, apiKey):
            if appid != "":
                self.APPID = appid
                # Public parameters (common)
                self.CommonArgs = {"app_id": self.APPID}
            if apiKey != "":
                self.APIKey = apiKey
            if apiSecret != "":
                self.APISecret = apiSecret

        # Generate the URL
        def create_url(self):
            # The API host is missing between 'wss://' and '/v2/tts' (and in the two
            # empty host strings below); take it from the Xunfei documentation.
            url = 'wss:///v2/tts'
            # Generate a timestamp in RFC 1123 format
            now = datetime.now()
            date = format_date_time(mktime(now.timetuple()))

            # Build the string to sign
            signature_origin = "host: " + "" + "\n"
            signature_origin += "date: " + date + "\n"
            signature_origin += "GET " + "/v2/tts " + "HTTP/1.1"
            # Sign it with HMAC-SHA256
            signature_sha = hmac.new(self.APISecret.encode('utf-8'), signature_origin.encode('utf-8'),
                                     digestmod=hashlib.sha256).digest()
            signature_sha = base64.b64encode(signature_sha).decode(encoding='utf-8')

            authorization_origin = "api_key=\"%s\", algorithm=\"%s\", headers=\"%s\", signature=\"%s\"" % (
                self.APIKey, "hmac-sha256", "host date request-line", signature_sha)
            authorization = base64.b64encode(authorization_origin.encode('utf-8')).decode(encoding='utf-8')
            # Combine the authentication parameters into a dictionary
            v = {
                "authorization": authorization,
                "date": date,
                "host": ""
            }
            url = url + '?' + urlencode(v)
            return url


    def on_message(ws, message):
        try:
            # print(message)
            try:
                message = json.loads(message)
            except Exception as e:
                print("111", e)
            code = message["code"]
            sid = message["sid"]
            audio = message["data"]["audio"]
            audio = base64.b64decode(audio)
            status = message["data"]["status"]
            print(code, sid, status)
            if status == 2:
                print("ws is closed")
                ws.close()
            if code != 0:
                errMsg = message["message"]
                print("sid:%s call error:%s code is:%s" % (sid, errMsg, code))
            else:
                # Append each audio chunk to the PCM file
                with open(PCM_PATH, 'ab') as f:
                    f.write(audio)
        except Exception as e:
            print("receive msg,but parse exception:", e)


    # Handle a websocket error
    def on_error(ws, error):
        print("### error:", error)


    # Handle the websocket closing.
    # *args absorbs the status code and message that newer websocket-client versions pass.
    def on_close(ws, *args):
        print("### closed ###")


    # Handle the websocket connection being established
    def on_open(ws):
        def run(*args):
            d = {"common": wsParam.CommonArgs,
                 "business": wsParam.BusinessArgs,
                 "data": wsParam.Data,
                 }
            d = json.dumps(d)
            print("------> start sending text data")
            ws.send(d)
            # Remove the PCM file left over from a previous run
            if os.path.exists(PCM_PATH):
                os.remove(PCM_PATH)

        thread.start_new_thread(run, ())


    def text2pcm(appid, apiSecret, apiKey, text, vcn, fname):
        wsParam.set_params(appid, apiSecret, apiKey)
        wsParam.set_tts_params(text, vcn)
        websocket.enableTrace(False)
        wsUrl = wsParam.create_url()
        ws = websocket.WebSocketApp(wsUrl, on_message=on_message, on_error=on_error, on_close=on_close)
        ws.on_open = on_open
        ws.run_forever(sslopt={"cert_reqs": ssl.CERT_NONE})
        # Wrap the received PCM data in a WAV file
        pcm2wav(PCM_PATH, fname)


    def pcm2wav(fname, dstname):
        with open(fname, 'rb') as pcmfile:
            pcmdata = pcmfile.read()
            print(len(pcmdata))
        with wave.open(dstname, "wb") as wavfile:
            # 1 channel, 2 bytes per sample, 16000 Hz, no compression
            wavfile.setparams((1, 2, 16000, 0, 'NONE', 'NONE'))
            wavfile.writeframes(pcmdata)


    wsParam = Ws_Param()
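The last step, pcm2wav, works because raw PCM carries no header: the wave module only has to be told the channel count, sample width, and sample rate. The sketch below checks that step in isolation. It assumes 16 kHz, mono, 16-bit audio, as requested by the auf parameter; the sine tone is a stand-in for synthesized speech, and `make_tone`, `pcm_bytes_to_wav`, and `wav_info` are hypothetical names.

```python
import math
import os
import struct
import tempfile
import wave

RATE = 16000  # Samples per second, matching "audio/L16;rate=16000"


def make_tone(freq=440.0, seconds=0.5, rate=RATE):
    """Return raw 16-bit little-endian mono PCM for a sine tone (stand-in for TTS audio)."""
    n = int(seconds * rate)
    samples = (int(12000 * math.sin(2 * math.pi * freq * i / rate)) for i in range(n))
    return b"".join(struct.pack("<h", s) for s in samples)


def pcm_bytes_to_wav(pcmdata, dstname, rate=RATE):
    """Wrap raw PCM bytes in a WAV container: 1 channel, 2 bytes per sample."""
    with wave.open(dstname, "wb") as wavfile:
        wavfile.setnchannels(1)
        wavfile.setsampwidth(2)
        wavfile.setframerate(rate)
        wavfile.writeframes(pcmdata)


def wav_info(path):
    """Read back (channels, sample width, rate, frame count) from a WAV file."""
    with wave.open(path, "rb") as w:
        return w.getnchannels(), w.getsampwidth(), w.getframerate(), w.getnframes()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "tone.wav")
    pcm_bytes_to_wav(make_tone(), path)
    print(wav_info(path))
```

If the sample rate or width written here did not match what the service actually returned, the file would still be valid WAV but would play at the wrong speed or as noise, which is why these values should follow the auf setting.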
With that, a working speech synthesis system is complete.
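One part of the listing worth isolating is the authentication done in create_url. The sketch below restates it as a pure function so the signed URL can be inspected without opening a connection. It is an illustration under assumptions, not Xunfei's reference code: `build_signed_url` is a hypothetical name, and the host `tts.example.com`, the key, and the secret are placeholders.

```python
import base64
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse
from wsgiref.handlers import format_date_time


def build_signed_url(host, path, api_key, api_secret, timestamp):
    """Build a wss:// URL whose query string carries an HMAC-SHA256 signature.

    `timestamp` is seconds since the epoch; passing it in keeps the result reproducible.
    """
    date = format_date_time(timestamp)  # RFC 1123 date string in GMT
    # String to sign: host line, date line, request line
    signature_origin = "host: %s\ndate: %s\nGET %s HTTP/1.1" % (host, date, path)
    digest = hmac.new(api_secret.encode("utf-8"),
                      signature_origin.encode("utf-8"),
                      digestmod=hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("utf-8")
    authorization_origin = (
        'api_key="%s", algorithm="hmac-sha256", '
        'headers="host date request-line", signature="%s"' % (api_key, signature)
    )
    authorization = base64.b64encode(authorization_origin.encode("utf-8")).decode("utf-8")
    query = urlencode({"authorization": authorization, "date": date, "host": host})
    return "wss://%s%s?%s" % (host, path, query)


if __name__ == "__main__":
    # Placeholder host and credentials, for illustration only
    url = build_signed_url("tts.example.com", "/v2/tts", "my-key", "my-secret", 0)
    params = parse_qs(urlparse(url).query)
    print(params["date"][0])
    print(base64.b64decode(params["authorization"][0]).decode("utf-8"))
```

Because the timestamp is an argument, the same inputs always produce the same URL, which makes it easier to compare a locally generated signature against what the server rejects.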
Cloud computing and cloud services are developing rapidly, and the major vendors offer a wealth of resources that greatly lower the barrier to AI development. Remarkably, you can build a speech synthesis tool quickly without understanding the principles of speech synthesis.
That concludes this introduction to writing a speech synthesis system in Python.