Baidu provides a high-quality long speech recognition service. For conferences, teaching, or media interviews, it quickly and accurately converts long recordings into text, which simplifies follow-up work such as copying and editing and makes work and daily life more convenient.
Application scenarios
Meeting minutes
Convert meeting audio into text to simplify later word processing and content archiving, and to save the labor and time spent on taking minutes.
Real-time subtitles
Generate subtitles in real time for live broadcasts, videos, live speeches, and other audio, making content easier to follow and improving the user experience.
Voice notes
Notes, summaries, and similar content can be dictated by voice and converted into text, greatly improving the user's input efficiency.
Technical features
User-defined recognition vocabulary
Developers can upload their own vocabulary and train their own recognition model. The more corpus data submitted, the more noticeable the improvement in recognition accuracy.
Deep semantic analysis
It supports semantic understanding in up to 35 vertical domains, such as transportation, social networking, and entertainment. It also supports custom instruction sets and question-and-answer pairs.