Long speech recognition
Call the service through the SDK to convert long, continuous speech into text.
Function introduction
Baidu's long speech recognition service converts long recordings into text quickly and accurately. For conferences, teaching, or media interviews, the resulting text makes follow-up work such as copying and editing easier.
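Long recordings are usually passed to a recognizer in small pieces rather than as one large buffer. The sketch below only illustrates that chunking step; it does not call the Baidu SDK. The function name `iter_pcm_chunks`, the 16 kHz sample rate, the 16-bit sample width, and the 160 ms chunk length are all assumptions chosen for illustration, not values taken from the service documentation.

```python
from typing import Iterator


def iter_pcm_chunks(
    pcm: bytes,
    sample_rate: int = 16000,   # assumed: 16 kHz mono audio
    sample_width: int = 2,      # assumed: 16-bit samples (2 bytes)
    chunk_ms: int = 160,        # assumed: 160 ms per chunk
) -> Iterator[bytes]:
    """Yield fixed-duration slices of raw mono PCM audio.

    The last slice may be shorter than the others.
    """
    chunk_bytes = sample_rate * sample_width * chunk_ms // 1000
    if chunk_bytes <= 0:
        raise ValueError("chunk size must be positive")
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


# One second of silence at the assumed format: 16000 samples * 2 bytes.
one_second = bytes(32000)
chunks = list(iter_pcm_chunks(one_second))
```

Each yielded chunk would then be handed to whatever streaming call the SDK actually provides; that call is deliberately left out here.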
Application scenarios
Meeting minutes
Convert meeting audio into text to simplify later editing and archiving, and to reduce the labor and time spent on taking minutes.
Real-time subtitles
Generate subtitles in real time for live broadcasts, videos, live speeches, and other audio, making content easier to follow and improving the user experience.
Voice notes
Dictate notes, summaries, and similar content and have them recorded as text, greatly improving input efficiency.
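For the subtitle scenario above, recognized text needs timing information before a player can display it. The following is a minimal sketch that turns timed segments into SubRip (SRT) text. It assumes the recognizer returns `(start_ms, end_ms, text)` tuples; that tuple shape and the function names are hypothetical and not part of the Baidu service.

```python
from typing import List, Tuple


def format_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments: List[Tuple[int, int, str]]) -> str:
    """Build SRT text from (start_ms, end_ms, text) segments (assumed shape)."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}\n"
            f"{text.strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


srt_text = to_srt([(0, 1500, "Hello"), (1500, 3000, " world ")])
```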
Technical features
Custom recognition thesaurus
Developers can upload their own thesaurus to train a custom recognition model. The more corpus data is submitted, the more noticeable the improvement in recognition accuracy.
Deep semantic analysis
Supports semantic understanding in up to 35 vertical domains, such as transportation, social networking, and entertainment. Custom instruction sets and question-and-answer pairs can also be configured.
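To give a rough idea of what a question-and-answer pair means in practice, the sketch below matches a recognized transcript against a small table of custom pairs. It is a client-side illustration only: the service's own matching is not described here, and `normalize`, `build_qa_index`, and `answer_for` are hypothetical names introduced for this example.

```python
import re
from typing import Dict, Optional


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def build_qa_index(pairs: Dict[str, str]) -> Dict[str, str]:
    """Index custom question/answer pairs by normalized question."""
    return {normalize(question): answer for question, answer in pairs.items()}


def answer_for(transcript: str, index: Dict[str, str]) -> Optional[str]:
    """Return the configured answer for an exact (normalized) match, else None."""
    return index.get(normalize(transcript))


# Hypothetical pair, for illustration only.
qa_index = build_qa_index({"What time is the meeting?": "10 am"})
reply = answer_for("  What time is THE meeting?! ", qa_index)
```

Exact matching after normalization is the simplest possible design; a real deployment would likely need fuzzier matching to cope with recognition errors.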