Long speech recognition - Baidu AI open platform

Function introduction

Baidu provides a high-quality long speech recognition service. For conferences, teaching, or media interviews, it quickly and accurately converts long audio recordings into text, making follow-up work such as transcription and editing easier.

Application scenarios

Meeting minutes

Convert meeting audio into text for later editing and archiving, saving the labor and time spent on taking minutes.

Real-time subtitles

Generate real-time subtitles for live streams, videos, live speeches, and other audio, making content easier to follow and improving the user experience.

Voice notes

Notes, summaries, and similar content can be dictated and transcribed into text, greatly improving the user's input efficiency.

Technical features

User-defined recognition thesaurus upload

Developers can upload their own thesaurus and train their own recognition model. The more corpus data is submitted, the more noticeable the improvement in recognition accuracy.

Deep semantic analysis

Supports semantic understanding in up to 35 vertical domains, such as transportation, social networking, and entertainment. Custom instruction sets and question-answer pairs can also be configured.

Frequently asked questions:

What is the daily call limit for the speech recognition and speech synthesis interfaces, and how do I apply to increase it?

QQ support groups

Baidu Voice: 588369236

Text recognition: 1055623827

Custom template OCR: 1055402721

Face recognition: 692450852

Human body analysis: 860337848

Image audit: 375765194

Image recognition: 312156782

EasyDL: 614951239

Image search: 1067276154

Video analysis: 632473158

Baidu AR: 472081119

Natural language: 1051436514

Text review: 983259607

UNIT: 1074410189

Baidu Translate: 214857706

Image effect enhancement: 1092338829

Data intelligence: 650596829

Knowledge graph: 655854786

DuerOS: 604592023

Baidu AI open platform: 224994340

Intelligent writing: 697034823

EdgeBoard: 1060623352

Voice self-training platform: 686267521

Far-field voice development kit: 210093204
