Chinese name: iFLYTEK InterPhonic 5.0 speech synthesis system
English name: InterPhonic 5.0
Version: 5.0
Developer and publisher: iFLYTEK
Language: Simplified Chinese
Shared at: Bitter Melon Sweet Blog, www.kuguagantian.com

System Introduction

InterPhonic is a series of speech synthesis systems from iFLYTEK that read mixed Chinese and English text aloud. Its main functions are continuous speech synthesis for mixed Chinese-English text, a development interface for calling the synthesis service, and other features related to speech synthesis.

Technical characteristics

·Original intelligent text pre-processing technology;
·Original statistical model of corpus information;
·A corpus design method that keeps the front end and back end consistent, plus an automatic corpus construction method;
·High-precision prosody model based on variable-length prosody templates, guided by perceptual (auditory) quantification;
·Highly robust intelligent text analysis and processing technology;
·Corpus pruning technology that minimizes the loss in perceived listening quality;
·A multilingual speech synthesis framework that separates language-specific knowledge from system modeling methods;
·Customized speech synthesis technology for specific applications.

Functional characteristics

1. High-quality voice - converts input text in real time into smooth, clear, natural and expressive speech;
2. Multilingual services - an integrated multilingual synthesis engine provides speech synthesis for Chinese, mixed Chinese and English, English, and Cantonese;
3. High-precision text analysis - intelligently analyzes and handles unknown words (such as place names), polyphonic characters, special symbols (such as punctuation and numbers), prosodic phrases and similar items in the text;
4. Multiple character sets - accepts input in GB2312, GBK, Big5, Unicode and UTF-8, as plain text or as text with CSSML tags and other formats;
5. Multiple output formats - outputs speech data as linear WAV at various sampling rates, A-law/μ-law WAV, VOX and other formats;
6. Flexible interfaces - a standard interface, a simplified interface, a COM interface and a SAPI interface ease system integration in different environments;
7. Voice adjustment - the development interface allows dynamic adjustment of synthesis parameters such as volume, speed and pitch;
8. Configuration and management tools - the synthesis engine provides tools for unified configuration and management, covering global parameters, user dictionaries, user rules and customized resource packages;
9. Effect optimization - the synthesis engine offers several ways to tune the synthesis result for the actual application environment, chiefly customized resource packages and CSSML;
10. Consistent access mode - remote synthesis services can be accessed in client/server mode through the same development interface as local calls, making access fully transparent;
11. Dynamic load balancing - a load balancing module dynamically allocates the resources of multiple synthesis servers, transparently to users;
12. Background sound and prerecorded audio - the system also supports background sound and prerecorded audio to meet application and personalization needs in different settings.
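
The character sets named in item 4 can be illustrated with a small sketch. It only shows how text in those encodings might be normalized to one internal string type before synthesis, using Python's built-in codecs. The function name `to_text`, the `CODECS` table, and the reading of "Unicode" as UTF-16 are assumptions made for illustration; none of this is the InterPhonic API.

```python
# Python codec names for the character sets listed in item 4.
# Assumption: "Unicode" is taken to mean UTF-16, as is common on Windows.
CODECS = {
    "GB2312": "gb2312",
    "GBK": "gbk",
    "Big5": "big5",
    "Unicode": "utf-16",
    "UTF-8": "utf-8",
}


def to_text(raw: bytes, charset: str) -> str:
    """Decode raw input bytes in a named character set into a Python str."""
    try:
        codec = CODECS[charset]
    except KeyError:
        raise ValueError(f"unsupported character set: {charset}") from None
    return raw.decode(codec)


if __name__ == "__main__":
    simplified = "语音合成"   # "speech synthesis" in Simplified Chinese
    traditional = "語音合成"  # the same phrase in Traditional Chinese

    # The same text arrives as different byte sequences per character set,
    # but decodes to the same string.
    for name in ("GB2312", "GBK", "Unicode", "UTF-8"):
        raw = simplified.encode(CODECS[name])
        print(name, len(raw), "bytes ->", to_text(raw, name))

    raw_big5 = traditional.encode(CODECS["Big5"])
    print("Big5", len(raw_big5), "bytes ->", to_text(raw_big5, "Big5"))
```

Decoding everything to one string type up front means later stages, such as text analysis, never have to care which character set the caller used.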

Voice database index

No. | Voice | Voice type and style | Supported languages | Supported sampling rates
1 | Xiaojing | Middle-aged female voice, gentle tone, mild and calm style | Chinese and English | 6k/8k/11k/16k
2 | Xiaoyan | Young female voice, clear and crisp tone, relaxed and lively style | Chinese and English | 6k/8k/11k/16k
3 | Xiaomei | Young female voice, clear and crisp tone, friendly and pleasant style | Cantonese; mixed Cantonese and English | 6k/8k/11k/16k
4 | Xiaoyu | Middle-aged male voice, rich and mellow tone, calm and soft style | Mixed Chinese and English; pure English | 6k/8k/11k/16k
5 | Sherri | Young female voice, smooth tone, gentle and fluent style | English | 6k/8k/11k/16k
6 | Xiaoqian | Young female voice, sweet tone, lively style | Chinese and English | 6k/8k/11k/16k
7 | Xiaolin | Young female voice, clear and crisp tone, friendly and pleasant style | Taiwanese; Chinese and English | 6k/8k/11k/16k
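
To give a feel for the sampling rates in the table, the sketch below writes a one-second test tone as linear WAV at each rate with Python's standard `wave` module and reports the raw data rate. The exact rates in Hz (in particular 11025 for "11k"), the 16-bit mono format, and the function names are assumptions for illustration. The source does not specify them, and this is not how InterPhonic itself produces audio.

```python
import io
import math
import struct
import wave

# Assumed mapping of the table's labels to exact rates in Hz.
SAMPLE_RATES_HZ = {"6k": 6000, "8k": 8000, "11k": 11025, "16k": 16000}

SAMPLE_WIDTH_BYTES = 2  # assumed 16-bit linear PCM
CHANNELS = 1            # assumed mono


def pcm_bytes_per_second(rate_hz: int) -> int:
    """Raw data rate of uncompressed linear PCM at the assumed format."""
    return rate_hz * SAMPLE_WIDTH_BYTES * CHANNELS


def tone_wav_bytes(rate_hz: int, seconds: float = 1.0, freq_hz: float = 440.0) -> bytes:
    """Return a complete WAV file (header plus samples) holding a sine tone."""
    n_frames = int(rate_hz * seconds)
    frames = b"".join(
        struct.pack(
            "<h",  # little-endian signed 16-bit sample
            int(0.5 * 32767 * math.sin(2 * math.pi * freq_hz * i / rate_hz)),
        )
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(rate_hz)
        wav.writeframes(frames)
    return buf.getvalue()


if __name__ == "__main__":
    for label, rate in SAMPLE_RATES_HZ.items():
        data = tone_wav_bytes(rate)
        print(f"{label}: {pcm_bytes_per_second(rate)} bytes/s of PCM, "
              f"{len(data)} bytes for a 1 s WAV file")
```

Under these assumptions the data rate grows linearly with the sampling rate, so a voice database recorded at 16k needs roughly twice the storage of the same recordings at 8k.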

This software is a serious piece of technology. It turns text into speech, and the synthesized sound quality is comparable to a real person reading aloud, approaching the level of a professional announcer. Many text-to-speech programs exist, but the small, lightweight ones generally rely on machine-synthesized voices or on voice databases read over the network. This software instead ships with several high-quality 16k voice databases recorded from real speakers, which is why it is 7 GB in size.

This version is packaged as a green portable edition using program virtualization technology. It needs no installation and runs directly from a portable hard drive or USB flash drive. The crack is already integrated, and the Xiaoyan, Xiaomei, Xiaoyu, Sherri, Xiaoqian and Xiaolin voice databases are integrated and installed. The Xiaojing voice database could not be found on the Internet, so it is not included. Xiaoyan and Xiaoyu are the most commonly used voice databases, and also the two with the best sound quality.

This version fixes the original program's difficulty running on Windows 7 and Windows 8, on both 32-bit and 64-bit systems. One problem remains unsolved for now: on Windows 7 and later, the CSSML editor cannot open the audio device (it works normally under XP).

Installing and cracking the original iFLYTEK speech synthesis system is tedious, to the point that dedicated installation tutorials exist online. There used to be a Yunlong green version of this software, but it supported Windows 7 poorly and included few voice databases. This green portable version was made to simplify setup, extend the life of the software, and let it run normally on newer systems.

Features of this edition

By @liziwen

The original green portable version was released seven years ago and does not run properly on Windows 10. This remake fully supports Windows 10.

Screenshots of the iFLYTEK speech synthesis system V5.0

