Core requirements
Fangtianxia's property reviews cover 65,000 new housing projects in 658 cities nationwide, and buyers generate more than 500,000 review interactions per day on average. At this volume, reviewing the content and identifying genuine, high-quality reviews used to require substantial manpower: typically at least one person per city. Completing content review more efficiently through technical means is therefore an important requirement for Fangtianxia. In addition, Fangtianxia has long worked on analyzing the content of property reviews, but its earlier analysis relied on keywords that operators split out manually. The keywords were few in number, narrow in what they described, and could not be updated dynamically, so they supported only a basic classification of review content.
Solution
For the content review of property reviews, Fangtianxia has pushed hard toward a technology-driven review process. The early stage mainly included: automatic deduplication, which effectively prevents similar content from piling up; keyword filtering, which automatically filters out and deletes content containing prohibited words; and OCR image filtering, which automatically filters out and deletes prohibited images. On this basis, Fangtianxia introduced Baidu's natural language sentiment analysis technology to automatically identify high-quality content and mark it as featured. When content is automatically featured and classified, it can be distinguished by sentiment polarity.
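The two text-based early-stage filters can be illustrated with a minimal sketch. It is not Fangtianxia's implementation: the function names, the character-shingle Jaccard measure used for deduplication, the 0.8 threshold, and the banned-word list are all assumptions chosen for illustration, and OCR image filtering is left out.

```python
from typing import Iterable, List, Set


def shingles(text: str, k: int = 3) -> Set[str]:
    """Return the set of k-character shingles of a lowercased, whitespace-free text."""
    normalized = "".join(text.lower().split())
    if len(normalized) <= k:
        return {normalized}
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two shingle sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def filter_reviews(
    reviews: Iterable[str],
    banned_words: Iterable[str],
    dup_threshold: float = 0.8,  # assumed cut-off, not taken from the source
) -> List[str]:
    """Drop reviews containing a banned word, then drop near-duplicates."""
    banned = [word.lower() for word in banned_words]
    kept: List[str] = []
    kept_shingles: List[Set[str]] = []
    for review in reviews:
        lowered = review.lower()
        # Keyword filtering: discard content containing any prohibited word.
        if any(word in lowered for word in banned):
            continue
        # Deduplication: discard content too similar to an already kept review.
        current = shingles(review)
        if any(jaccard(current, seen) >= dup_threshold for seen in kept_shingles):
            continue
        kept.append(review)
        kept_shingles.append(current)
    return kept
```

Comparing each review against every kept review is quadratic, which is fine for a sketch; at the daily volumes described here, an index such as MinHash with locality-sensitive hashing would be a more plausible choice.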
After introducing Baidu's natural language processing technology, Fangtianxia built a set of review tags for each property. The tag words are updated dynamically per property and give buyers and developers a visual summary of the property's reputation among users.
Building review tags: Baidu's comment opinion extraction technology is used to extract "short tags" and "long tags" from each review, grouped by property. The extracted tags are then clustered and merged using short text similarity technology. During this process, sentiment analysis is also run to obtain the sentiment polarity of each review, and finally the tag keywords are displayed on the front page in order of weight.
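The merge-and-rank step can be sketched as follows. This is only an illustration under stated assumptions: it does not call any Baidu service, the input is a hypothetical list of already extracted (tag, polarity) pairs, Python's `difflib.SequenceMatcher` stands in for the short text similarity service, the 0.75 threshold is arbitrary, and "weight" is assumed to be the number of merged tags, which the source does not define.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Tuple, Union

Cluster = Dict[str, Union[str, int]]


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1]; not Baidu's short text similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_tags(
    tags: List[Tuple[str, str]],
    threshold: float = 0.75,  # assumed cut-off, not taken from the source
) -> List[Cluster]:
    """Greedily merge similar tags that share a polarity, then rank by weight.

    Each tag joins the first existing cluster with the same polarity whose
    representative tag is similar enough; otherwise it starts a new cluster.
    """
    clusters: List[Cluster] = []
    for text, polarity in tags:
        for cluster in clusters:
            same_polarity = cluster["polarity"] == polarity
            if same_polarity and similarity(text, str(cluster["tag"])) >= threshold:
                cluster["weight"] = int(cluster["weight"]) + 1
                break
        else:
            # No sufficiently similar cluster: this tag becomes a representative.
            clusters.append({"tag": text, "polarity": polarity, "weight": 1})
    # Heaviest clusters first, mirroring display "in order of weight".
    return sorted(clusters, key=lambda c: int(c["weight"]), reverse=True)
```

Keeping polarity as part of the merge condition prevents a positive and a negative tag with similar wording from collapsing into one entry, which matches the idea of distinguishing tags by sentiment polarity.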