We provide high-quality application scenarios
Self-service process
1. Design task types according to your needs (each scenario provides a rich combination of question types)
2. Prepare the data to be processed and upload it according to the guidance
3. Set the specific task configuration and publish it to the test platform for annotation
4. After annotation and review finish on the test platform, retrieve the result data with one click
Self-service scenarios
Text processing · Picture processing · Speech processing · Web page processing
In the era of big data, turning your raw data into effective data is what matters most. Large volumes of product reviews, search keywords, and entity words need to be de-duplicated and classified. With this scenario you can quickly create text classification tasks. Click Preview to view the annotation question types this scenario provides.
For example, if you have a large number of product reviews to process and need to filter out clearly positive and clearly negative ones for analysis, you can configure multiple options for each review in this scenario (such as clearly negative, clearly positive, neutral, advertising), and your original data will be returned classified and labeled according to those options.
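As a minimal sketch of working with such a task's results (the platform's actual export format is not documented here, so the record shape and option names below are assumptions), suppose each labeled review comes back as a (text, chosen option) pair; the retrieved data can then be tallied per option:

```python
# Hypothetical result format: one (review_text, chosen_option) pair per record.
from collections import Counter

# The option set configured when creating the task (an assumed example).
OPTIONS = ["clearly positive", "clearly negative", "neutral", "advertising"]

def tally_labels(records):
    """Count how many reviews fell under each configured option."""
    counts = Counter(label for _, label in records)
    unknown = set(counts) - set(OPTIONS)
    if unknown:
        # Every returned label should be one of the configured options.
        raise ValueError(f"unexpected labels: {unknown}")
    return counts

results = [
    ("Great quality, fast shipping", "clearly positive"),
    ("Broke after two days", "clearly negative"),
    ("Buy now at half price!", "advertising"),
    ("It is okay", "neutral"),
    ("Love it", "clearly positive"),
]
print(tally_labels(results)["clearly positive"])  # 2
```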
If you need to extract text that meets certain requirements from web pages, articles, or text collections, or even to write new text (such as web page summaries), choose this scenario.
For example, if you have a large number of web page links and need three content keywords extracted for each page from the information it displays, you can configure a web page display area and multiple text input boxes in this scenario. The data you eventually retrieve is the full set of web page links, each paired one-to-one with its keyword set.
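A small sketch of checking such retrieved data (the mapping shape and the three-keyword requirement follow the example above; the real export schema is an assumption): each link maps to the keywords typed into the task's input boxes, and links with missing or empty keywords can be flagged for re-annotation:

```python
# Hypothetical result format: web page link -> list of extracted keywords.
def validate_keywords(results, expected=3):
    """Return links whose keyword set is incomplete or contains blanks."""
    return [url for url, kws in results.items()
            if len(kws) != expected or any(not k.strip() for k in kws)]

results = {
    "https://example.com/a": ["phones", "reviews", "prices"],
    "https://example.com/b": ["travel", "", "hotels"],  # blank keyword
}
print(validate_keywords(results))  # ['https://example.com/b']
```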
In text processing, manual correction is always more accurate. This is especially true for Chinese, a language with flexible grammar and nuanced word meanings, where manual correction is an essential step. Select this scenario to create tasks that correct lexical form and word usage.
If you are doing natural language processing and some machine-analyzed lexical results need manual verification, you can use this scenario: describe your lexical rules clearly, display your original data in a specific style, and the crowdsourcing platform will return the corrected text to you.
Like text, large volumes of image data also need classification, filtering out effective images, and so on. This scenario is designed around very simple image-screening questions, which makes image classification more efficient. You only need to provide image links and filter criteria to quickly create and publish tasks.
If you are an e-commerce platform developer with a large number of images to classify by keyword, this scenario provides a very efficient filtering method. You only need to provide a collection of image links and the corresponding category keywords to filter by, and you can quickly retrieve the classified image data.
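As a hedged sketch of consuming the classified image data (the pairing of each link with its confirmed keyword is an assumed format, not the platform's documented schema), the retrieved answers can be grouped by category:

```python
# Hypothetical result format: (image_link, confirmed_category_keyword) pairs.
from collections import defaultdict

def group_by_category(answers):
    """Group image links under the category keyword a worker confirmed."""
    groups = defaultdict(list)
    for link, keyword in answers:
        groups[keyword].append(link)
    return dict(groups)

answers = [
    ("https://img.example.com/1.jpg", "dress"),
    ("https://img.example.com/2.jpg", "shoes"),
    ("https://img.example.com/3.jpg", "dress"),
]
print(len(group_by_category(answers)["dress"]))  # 2
```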
Labeling pictures with multi-dimensional information is a common need: marking taste and cooking method for food pictures, or decor, style, and other attributes for clothing pictures. The picture labeling scenario meets this need well.
If your e-commerce platform has a large amount of image data that needs multi-dimensional labeling, this scenario will serve you well.
Tasks such as Baidu's unmanned vehicle traffic element recognition and picture character recognition require large amounts of raw recognition data. This scenario supports identifying and annotating entity content in images, and provides a simple box-selection annotation tool that makes data retrieval more efficient and accurate.
If you work on image recognition, this scenario will quickly provide you with fairly accurate box annotation data, such as face boxes and text boxes in pictures.
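A minimal sketch of one box annotation record, assuming pixel coordinates in (x, y, width, height) form with an entity label (the platform's actual annotation schema is not specified in this document, so these field names are illustrative):

```python
# Hypothetical box annotation record from a box-selection task.
def box_area(box):
    """Pixel area of one annotation box; width and height must be positive."""
    x, y, w, h = box["x"], box["y"], box["w"], box["h"]
    if w <= 0 or h <= 0:
        raise ValueError("degenerate box")
    return w * h

annotation = {"image": "https://img.example.com/face.jpg",
              "label": "face", "x": 40, "y": 25, "w": 120, "h": 160}
print(box_area(annotation))  # 19200
```

A sanity check like this (non-positive width or height) is a cheap way to catch malformed boxes before feeding the data to a training pipeline.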
Collecting images of specific objects and entities, or collecting image information based on geographical location, can both be done through this scenario, whether you need a large volume of image data or images collected under specific conditions.
If you are developing LBS-related services and need a large number of location-based pictures, such as shopping malls, stores, and scenic spots, this scenario is the best choice.
Machines cannot yet recognize speech content completely and accurately, but humans can, and optimizing speech recognition also requires manual intervention. This scenario turns audio content into high-quality text through manual transcription.
If you are developing speech recognition and have a large number of recordings to transcribe, this scenario provides manual transcription without language restrictions, which largely guarantees the correctness of the transcripts.
If your speech recognition system already produces output that needs verification, or the audio data you collect needs a round of cleaning to improve its quality, you can use the voice filtering scenario.
If you have collected a large amount of speech with specified content, this scenario lets you filter clips by validity (for example, whether the content is clear and whether it matches the specified content) or by the attributes of the speaker (for example, male voice, female voice, child, elderly).
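As a hedged sketch of post-processing such filtering results (the record fields below, a validity flag plus a speaker attribute, are assumptions about the export format), valid clips can be selected and optionally restricted to one speaker type:

```python
# Hypothetical voice-filtering result: one record per audio clip.
def keep_valid(records, voice_type=None):
    """Keep clips marked valid, optionally restricted to one speaker type."""
    return [r["clip"] for r in records
            if r["valid"] and (voice_type is None or r["speaker"] == voice_type)]

records = [
    {"clip": "a.wav", "valid": True,  "speaker": "female"},
    {"clip": "b.wav", "valid": False, "speaker": "male"},   # unclear audio
    {"clip": "c.wav", "valid": True,  "speaker": "male"},
]
print(keep_valid(records, voice_type="male"))  # ['c.wav']
```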
Unlike image collection, voice collection is more difficult, especially for special content and specific-language data. The crowdsourcing platform solves these problems: you only need to define the collection requirements, and our voice collection scenario can quickly support them.
If your speech technology R&D requires a large number of recordings under specific conditions (such as a female voice reading specified content) or ordinary recordings, this scenario can fully meet your needs.
This scenario supports embedding pages in an iframe. You can publish a web page relevance-filtering task by providing web page links and the related queries. Task types this scenario handles include judging search result relevance, judging how well page content matches a query, and comparing and selecting among web search results.
If you are a search engine developer and need manual judgment of how well your product's search results match a query, you can embed the search page display in this scenario along with matching-degree options (such as very consistent, generally consistent, not very consistent), and annotators will classify results according to the judgment rules you set.
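When several annotators judge the same query/page pair, their answers must be aggregated into one label. A minimal sketch, assuming the matching-degree options above and a simple majority-vote policy (the platform's actual aggregation rules are not described in this document):

```python
# Hypothetical aggregation of per-annotator relevance judgments.
from collections import Counter

# The matching-degree options configured for the task (from the example above).
OPTIONS = ["very consistent", "generally consistent", "not very consistent"]

def majority_label(votes):
    """Most common judgment; ties resolve to the earliest-cast option."""
    if any(v not in OPTIONS for v in votes):
        raise ValueError("vote outside configured options")
    return Counter(votes).most_common(1)[0][0]

votes = ["very consistent", "generally consistent", "very consistent"]
print(majority_label(votes))  # very consistent
```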