-
Import the re module: Before using regular expressions, you need to import the re module. -
Use Chinese character set: In regular expressions, Chinese character set can be used to match Chinese characters. The Chinese character set is represented by [ u4E00 -- u9FA5], where u4E00 represents the Unicode code of the first Chinese character "one", and u9FA5 represents the Unicode code of the last Chinese character "".
import re pattern = re. compile ( r'[\u4e00-\u9fa5]+' ) text = 'Hello World! Hello, the world! ' result = pattern.findall(text) print (result) #['Hello ',' World ']