SoFunction
Updated on 2025-03-02

Implementing text-to-speech in Python with gTTS

First, install the third-party Python library: pip install gTTS

gTTS (Google Text-to-Speech) is an interface to Google Translate's text-to-speech API that provides an easy way to generate natural-sounding speech. gTTS supports multiple languages and dialects, which makes it widely used in multilingual applications.
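To see which language codes are available, the library's language table can be printed. The following is a minimal sketch: format_languages and supported_languages are hypothetical helper names, and it assumes gTTS is installed so that gtts.lang.tts_langs() can be imported.

```python
def format_languages(langs):
    """Turn a {code: name} mapping into sorted 'code: name' lines."""
    return [f"{code}: {name}" for code, name in sorted(langs.items())]


def supported_languages():
    """Return the language table that gTTS ships with."""
    # Imported lazily so the formatter above works without gTTS installed
    from gtts.lang import tts_langs
    return tts_langs()
```

Calling print("\n".join(format_languages(supported_languages()))) lists every code that can be passed as the lang argument.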

# Import the gTTS library for text-to-speech conversion
from gtts import gTTS
import os


# Define a function for text-to-speech conversion
def text_to_speech(text, lang='zh-cn'):  # Chinese is the default language
    # Create a speech object with gTTS from the text and the language code
    tts = gTTS(text=text, lang=lang)
    # File name for the audio file, which is saved in the current directory
    filename = 'speech.mp3'
    # Save the audio file
    tts.save(filename)
    # Return the saved file name for later use
    return filename


# Example text (a Chinese sentence in the original)
text = "Hello everyone, I'm a programmer"
# Call text_to_speech to convert the text to speech, specifying Chinese
filename = text_to_speech(text, 'zh-cn')
# Print the saved file path to confirm that the file was generated
print(f"Generated speech saved to {filename}")
# Open the file with the default player (the start command is Windows-only)
os.system("start speech.mp3")
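The start command used to play the file exists only in the Windows shell. A rough cross-platform sketch follows; open_command and play are hypothetical names, and it assumes the open command on macOS and xdg-open on Linux desktops are available.

```python
import os
import subprocess
import sys


def open_command(path, platform=sys.platform):
    """Return the command that opens path in the default app, or None on Windows."""
    if platform.startswith("win"):
        return None  # Windows has os.startfile, so no external command is needed
    if platform == "darwin":
        return ["open", path]  # macOS
    return ["xdg-open", path]  # assumption: a Linux desktop with xdg-utils


def play(path):
    """Open an audio file with the system's default player."""
    cmd = open_command(path)
    if cmd is None:
        os.startfile(path)  # only exists on Windows
    else:
        subprocess.run(cmd, check=False)
```

With this helper, play("speech.mp3") can replace the os.system call.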

To convert a longer text, write it to a file, put the file in the current directory, and convert it to speech with gTTS:

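# Import the gTTS library
from gtts import gTTS
import os

# Text to convert (put the name of your text file between the quotes)
with open("", "r") as f:
    text = f.read()
# Create a gTTS object, specifying the text and the language
tts = gTTS(text, lang='zh')

# Save as an audio file
tts.save("output.mp3")

# Play the audio file (the start command is Windows-only)
os.system("start output.mp3")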
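Text files containing Chinese are not always saved as UTF-8, and an empty file gives gTTS nothing to speak. The sketch below reads a file more defensively; read_text is a hypothetical helper, and the list of encodings to try (UTF-8 first, then GBK) is an assumption that may not match your files.

```python
def read_text(path, encodings=("utf-8-sig", "gbk")):
    """Read a text file, trying each encoding in turn, and reject empty input."""
    last_error = None
    for enc in encodings:
        try:
            with open(path, "r", encoding=enc) as f:
                text = f.read().strip()
        except UnicodeDecodeError as exc:
            last_error = exc  # wrong encoding, try the next one
            continue
        if not text:
            raise ValueError(f"{path} contains no text to convert")
        return text
    # None of the encodings could decode the file
    raise last_error
```

The returned string can be passed straight to gTTS(text, lang='zh').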

Some problems encountered:

gtts.tts.gTTSError: Failed to connect. Probable cause: Unknown

Error explanation:

This error is raised by the gTTS library and usually indicates a failure while connecting to the online text-to-speech service. The library could not determine the specific cause; it may be a network problem, an unavailable service, a wrong service address, or something else.

Solution:

  • Check the network connection: make sure your device can reach the internet.
  • Service status: check whether the online text-to-speech service (here, Google's) is operating normally.
  • Update the library: make sure your gTTS library is the latest version; it can be updated through pip.
  • Proxy settings: if you are using a proxy, make sure its settings are correct.
  • Service address: check whether the gTTS library is using the correct service address.

In my analysis, a network problem is the most likely cause. Trying again a few times often helps.
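Retrying by hand can be automated. The following is a minimal sketch, not part of gTTS itself: call_with_retry and text_to_mp3 are hypothetical names, and the attempt count and delay are arbitrary defaults.

```python
import time


def call_with_retry(func, attempts=3, delay=2.0, retry_on=(Exception,)):
    """Call func() until it succeeds or the attempts are used up."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on as exc:
            if attempt == attempts:
                raise  # out of attempts: let the last error propagate
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay}s")
            time.sleep(delay)


def text_to_mp3(text, path="speech.mp3", lang="en"):
    """Save text as speech, retrying when gTTS cannot reach the service."""
    # Imported lazily; assumes gTTS is installed
    from gtts import gTTS, gTTSError
    tts = gTTS(text=text, lang=lang)
    call_with_retry(lambda: tts.save(path), retry_on=(gTTSError,))
    return path
```

Only gTTSError is retried in text_to_mp3, so programming mistakes such as an invalid language code still fail immediately.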

Additional methods

1. pyttsx3 module

Reference documentation:/en/latest/

Advantages:

1. Fully offline text-to-speech conversion, with a choice among the voices installed on the system;

2. Control over the speaking rate and the volume;

3. Saving the spoken audio to a file;

4. A simple, powerful, and intuitive API.

Before use, you need to install: pip3 install pyttsx3

Basic use

import pyttsx3
engine = pyttsx3.init()
engine.say("I will speak this text")
engine.runAndWait()

Read directly

import pyttsx3
pyttsx3.speak("I will speak this text")

Change voice, rate, and volume

import pyttsx3
engine = pyttsx3.init()  # object creation

""" RATE"""
rate = engine.getProperty('rate')   # get the current speaking rate
print(rate)                         # print the current speaking rate
engine.setProperty('rate', 125)     # set a new speaking rate


"""VOLUME"""
volume = engine.getProperty('volume')   # get the current volume level (min=0 and max=1)
print(volume)                           # print the current volume level
engine.setProperty('volume', 1.0)       # set the volume level between 0 and 1

"""VOICE"""
voices = engine.getProperty('voices')       # get the list of installed voices
#engine.setProperty('voice', voices[0].id)  # changing the index changes the voice; 0 for male
engine.setProperty('voice', voices[1].id)   # changing the index changes the voice; 1 for female

engine.say("Hello World!")
engine.say('My current speaking rate is ' + str(rate))
engine.runAndWait()
engine.stop()


"""Saving Voice to a file"""
# On linux make sure that 'espeak' and 'ffmpeg' are installed
engine.save_to_file('Hello World', 'test.mp3')
engine.runAndWait()
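Selecting a voice by a fixed index such as voices[1] fails when only one voice is installed, and the order of voices differs between systems. Here is a sketch that picks a voice by name instead; pick_voice and speak_with_voice are hypothetical helpers, and the name to search for (for example "Zira" on some Windows systems) is an assumption about what is installed.

```python
def pick_voice(voices, keyword, default_index=0):
    """Return the id of the first voice whose name or id contains keyword."""
    keyword = keyword.lower()
    for voice in voices:
        name = (voice.name or "").lower()
        if keyword in name or keyword in voice.id.lower():
            return voice.id
    # No match: fall back to a default voice, or None if no voices exist
    return voices[default_index].id if voices else None


def speak_with_voice(text, keyword):
    """Speak text with the first installed voice that matches keyword."""
    import pyttsx3  # imported lazily; assumes pyttsx3 is installed
    engine = pyttsx3.init()
    voice_id = pick_voice(engine.getProperty("voices"), keyword)
    if voice_id is not None:
        engine.setProperty("voice", voice_id)
    engine.say(text)
    engine.runAndWait()
```

If no voice matches, the first installed voice is used, so the call still speaks instead of raising IndexError.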

2. baidu-aip

This method generates audio files through Baidu's open developer platform, where you first need to apply for a speech synthesis account. An example:

# Install the baidu-aip module (pip install baidu-aip) and import it
from aip import AipSpeech

""" Your APPID AK SK """
APP_ID = 'Your App ID'
API_KEY = 'Your Api Key'
SECRET_KEY = 'Your Secret Key'
client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)  # Configure the Baidu speech client

text = 'Hello'  # Text to synthesize
lang = 'zh'     # Language code
# Options for a personalized voice:
#   spd: speech speed, 0-9, default 5 (medium)
#   pit: pitch, 0-9, default 5 (medium)
#   vol: volume, 0-15, default 5 (medium)
#   per: speaker, 0 = female voice, 1 = male voice,
#        3 = emotional synthesis (Du Xiaoyao), 4 = emotional synthesis (Du Yaya);
#        the default is the standard female voice
res = client.synthesis(text, lang, 1, options={'spd': 5, 'pit': 5, 'vol': 5, 'per': 0})
# On failure the SDK returns a dict with error details instead of audio bytes
if not isinstance(res, dict):
    with open('XX.mp3', 'wb') as f:  # Open a file stream
        f.write(res)                 # Write to the file

3. pywin32

This library wraps the Windows DLLs and COM interfaces; it is very powerful and can do many things. In testing, however, its support for Chinese was not very good.

Install it first: pip install pywin32

# -*- encoding: utf-8 -*-
from win32com import client

# Configure the client interface (SAPI.SpVoice is the Windows speech COM object)
speaker = client.Dispatch("SAPI.SpVoice")

speaker.Speak("hello")

4. speech

This is another powerful speech module. It depends on pywin32 and is best suited to voice-activated programs.

Install and import: pip install speech

import speech
# Generate audio
speech.say('hello')
