Customer Joy · Intelligent customer service voice engine

The intelligent customer service voice engine from Baidu Smart Cloud Customer Joy uses Baidu's original end-to-end modeling technology, which integrates the acoustic and language models, to provide speech recognition and online speech synthesis for call centers, intelligent customer service, and similar settings. It can be used for intelligent outbound calls, voice IVR, voice robots, customer service dialogue assistance, and voice quality inspection, helping enterprises integrate call center voice capabilities more quickly and efficiently.

Introduction to Speech Recognition

Leading technology

Uses SMLTA, a streaming multi-layer truncated attention model, the same technology that powers Baidu Search and Xiaodu smart speakers.

High precision

Models the speech-to-text mapping directly, which improves recognition accuracy for whole sentences and for mixed Chinese-English speech.

High performance

A high-availability speech recognition transport protocol enables full-duplex streaming interaction.

Introduction to speech synthesis

Leading technology

Built on MELRNN+SUBRNN, the algorithm is stable, and its fast inference saves system resources.

Smooth and lifelike

Using an autoregressive model, the generated speech is highly faithful and natural, with controllable syllable duration.

Rich voice library

A variety of voice libraries are available to suit the different application scenarios of a call center.

Application scenarios

  • Human-computer interaction
  • Dialogue analysis

Human-computer interaction

Meets the voice technology requirements of scenarios such as intelligent voice customer service, intelligent IVR, intelligent outbound calls, and human-machine interaction.

We can provide

  • Full-duplex streaming interaction: audio is transmitted as a stream with low latency and high concurrency, and a single unified model handles both real-time and offline recognition.
  • Voice model training platform: the language model training platform can be privately deployed, and customers can customize and optimize the model with hot words and proper nouns to improve word-level accuracy.
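
As a rough illustration of the streaming transmission mode described above, the sketch below splits raw PCM audio into short fixed-duration frames, which is the kind of chunking a client would typically do before sending audio over a streaming connection. It does not use or represent the product's actual protocol: the function name `pcm_frames`, the 8 kHz 16-bit mono format, and the 160 ms frame length are all assumptions chosen for the example.

```python
from typing import Iterator


def pcm_frames(
    pcm: bytes,
    sample_rate: int = 8000,   # assumed telephony sample rate (Hz)
    sample_width: int = 2,     # assumed 16-bit samples (bytes per sample)
    frame_ms: int = 160,       # assumed frame duration (milliseconds)
) -> Iterator[bytes]:
    """Yield consecutive fixed-duration chunks of raw mono PCM audio.

    The final chunk may be shorter than the others.
    """
    frame_bytes = sample_rate * sample_width * frame_ms // 1000
    for start in range(0, len(pcm), frame_bytes):
        yield pcm[start:start + frame_bytes]


# Hypothetical input: one second of silence at 8 kHz, 16-bit mono.
audio = bytes(8000 * 2)
frames = list(pcm_frames(audio))
```

In a real client each frame would be sent as soon as it is captured, while recognition results are read back on the same connection; that simultaneous send and receive is what makes the interaction full duplex.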

Custom requirements

If you are interested in a project partnership or have custom requirements, please feel free to contact us.