How AI Can Assist the Study of Ancient Characters

Source: Guangming Daily
2022-11-04 09:04

Original title: How Artificial Intelligence Assists Ancient Character Research

Author: Li Chuntao (Professor at the Institute of Ancient Books, School of Archaeology, Jilin University, and leader of the project "Research, Development, and Construction of a Software System for Artificial Intelligence Recognition of Ancient Characters")

AI has entered many aspects of people's lives, work, and study, and scientific research is no exception. Although paleography, the study of ancient characters, is a traditional discipline, it is well suited to AI technology. Compared with many fields in the humanities, the decipherment of ancient characters is more objective, its conclusions admit a single correct answer, and the research process follows definite rules. Of all the humanities it is the closest to the natural sciences, and this matches the way AI technology works.

So which AI technologies can assist the study of ancient characters? What work must first be done by hand? How far has the academic community come so far? This article addresses these questions.

In deciphering ancient characters, AI can help in several ways. Tang Lan, a renowned paleographer, summarized the methods of deciphering ancient characters as form comparison, extrapolation, radical analysis, and historical textual research. AI technologies such as image recognition, natural language processing, and knowledge graphs correspond to these methods.

The form comparison method compares ancient character forms across different materials and uses characters that have already been deciphered to identify forms that have not. It requires researchers to memorize a large number of ancient character forms so that they can connect different materials. AI recognition relies on deep learning and computer vision: given enough ancient character images to train a model, recognition becomes possible. As for memory, the human brain can retain only a small fraction of the millions of characters and character forms, whereas a trained model can cover all of them and is far more capable in this respect. Its recognition and recommendation results give experts useful hints.
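The idea of recommending known characters for an unknown form can be illustrated with a minimal sketch. It is not the deep learning approach the article describes: it substitutes a simple overlap score on tiny hand-made bitmaps for a trained vision model, and the labels, bitmaps, and function names are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A glyph is a tiny bitmap given as rows of text; '#' marks ink.
Glyph = Tuple[str, ...]


def ink(glyph: Glyph) -> Set[Tuple[int, int]]:
    """Return the set of (row, column) cells that contain ink."""
    return {
        (r, c)
        for r, row in enumerate(glyph)
        for c, ch in enumerate(row)
        if ch == "#"
    }


def similarity(a: Glyph, b: Glyph) -> float:
    """Jaccard overlap of inked cells: 1.0 for identical shapes, 0.0 for disjoint ones."""
    ia, ib = ink(a), ink(b)
    if not ia and not ib:
        return 1.0
    return len(ia & ib) / len(ia | ib)


def recommend(unknown: Glyph, known: Dict[str, Glyph], top_k: int = 3) -> List[Tuple[str, float]]:
    """Rank the known forms by similarity to the unknown form, best first."""
    scored = [(label, similarity(unknown, glyph)) for label, glyph in known.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical "deciphered" forms (toy shapes, not real ancient characters).
KNOWN: Dict[str, Glyph] = {
    "cross": (".#.", "###", ".#."),
    "bar": ("...", "###", "..."),
    "box": ("###", "#.#", "###"),
}

# A damaged "cross" whose lower stroke is missing.
UNKNOWN: Glyph = (".#.", "###", "...")

RANKED = recommend(UNKNOWN, KNOWN)
```

Here `RANKED` lists "cross" first, "bar" second, and "box" last. A real system would compare learned image features rather than raw pixels, but the output has the same shape: a ranked list of candidates for the expert to check.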

The extrapolation method compares unearthed materials with records in the received literature to establish the meaning of a passage, and then deciphers the unknown form from that context. With the adoption of deep learning, and especially the widespread use in natural language processing of BERT, the pre-trained model released by Google in 2018, the field has advanced rapidly in recent years, with marked improvements in named entity recognition, semantic relation extraction, and related tasks. With sufficient training data, a model can match the ability of ordinary readers and even of experts.

To take a simple example, suppose that the two forms "△ ▽" in the sentence "We boiled △ ▽ to eat for the festival" are undeciphered or damaged characters. A trained deep learning model can propose candidates that fit, such as "dumplings", "tangyuan", and "zongzi". Because of the constraint imposed by "festival", ordinary boiled food such as "porridge" will not be recommended; because of the constraint imposed by "boiled", food that is not boiled, such as "mooncakes", will not be recommended either. The model can capture the meaning of the key words in the sentence.

For the sentence in this example, an ordinary reader could also judge correctly. Ancient written materials are a different matter, because most people are not familiar with ancient Chinese, and even professional scholars cannot memorize a large body of ancient texts. We can therefore train a language model on the transcriptions of unearthed documents and on the records of ancient books, and then use it during research to narrow the range of candidates for a character to be deciphered, and sometimes even to pin down the correct answer, which is a great help to experts.
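The festival example can be reproduced in miniature. The sketch below is not BERT: it replaces the pre-trained language model with plain co-occurrence counts over a six-sentence corpus invented for illustration, so it shows only how context words constrain the candidates. All data and function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Iterable, List

# Invented toy corpus; each sentence is already split into words by spaces.
CORPUS = [
    "festival we boiled dumplings",
    "festival we boiled tangyuan",
    "festival we boiled zongzi",
    "festival we baked mooncakes",
    "morning we boiled porridge",
    "morning we boiled porridge again",
]

CANDIDATES = ["dumplings", "tangyuan", "zongzi", "mooncakes", "porridge"]


def build_cooccurrence(sentences: Iterable[str]) -> DefaultDict[str, Counter]:
    """Count how often each word appears in the same sentence as every other word."""
    table: DefaultDict[str, Counter] = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for word in words:
            for other in words:
                if other != word:
                    table[word][other] += 1
    return table


def suggest(keywords: List[str], candidates: List[str],
            table: DefaultDict[str, Counter]) -> List[str]:
    """Return the candidates compatible with the largest number of context keywords."""
    scored = [
        (sum(1 for k in keywords if table[c][k] > 0), c)
        for c in candidates
    ]
    best = max(score for score, _ in scored)
    return sorted(c for score, c in scored if score == best)


TABLE = build_cooccurrence(CORPUS)

# Fill the gap in "We boiled [gap] to eat for the festival".
SUGGESTIONS = suggest(["festival", "boiled"], CANDIDATES, TABLE)
```

`SUGGESTIONS` contains "dumplings", "tangyuan", and "zongzi". "porridge" is dropped because it never occurs with "festival", and "mooncakes" because it never occurs with "boiled". A pre-trained model generalizes far beyond literal co-occurrence, but the filtering effect of the two key words is the same.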

The radical analysis method deciphers ancient characters by analyzing and identifying their radicals, that is, their components. The historical textual research method deciphers them according to the features of character forms in different periods and the rules by which those forms evolved. Knowledge graph technology bears on both methods. A knowledge graph is a network that describes the relationships between entities, and it can integrate knowledge about ancient characters at several levels, such as components, glyphs, and word meanings. The graph can link characters by radical, showing the characters that share a radical together with their corresponding forms. It also has the potential to uncover the rules of character evolution and thereby help experts. In short, AI can assist experts in ancient character research along several dimensions.
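Linking characters by shared radical can be sketched with a handful of subject-relation-object triples. The example uses modern characters and a single hypothetical relation name, `has_component`, purely for illustration. A real knowledge graph for paleography would also store period-specific glyph forms and word meanings.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)

# Toy graph: modern characters and their components.
TRIPLES: List[Triple] = [
    ("江", "has_component", "氵"), ("江", "has_component", "工"),
    ("河", "has_component", "氵"), ("河", "has_component", "可"),
    ("沐", "has_component", "氵"), ("沐", "has_component", "木"),
    ("松", "has_component", "木"), ("松", "has_component", "公"),
    ("林", "has_component", "木"),
]


def index_by_component(triples: List[Triple]) -> DefaultDict[str, Set[str]]:
    """Map each component to the set of characters that contain it."""
    index: DefaultDict[str, Set[str]] = defaultdict(set)
    for subject, relation, obj in triples:
        if relation == "has_component":
            index[obj].add(subject)
    return index


def sharing_components(char: str, triples: List[Triple]) -> Dict[str, List[str]]:
    """For each component of `char`, list the other characters that share it."""
    related: Dict[str, List[str]] = {}
    for component, chars in index_by_component(triples).items():
        if char in chars:
            others = sorted(chars - {char})
            if others:
                related[component] = others
    return related


RELATED = sharing_components("沐", TRIPLES)
```

For 沐, `RELATED` groups 江 and 河 under the component 氵, and 松 and 林 under 木. Adding further relations to the same triples, for example one that links a character to a dated glyph form, would let the graph support the historical comparison described above.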

Of course, with the publication of the Warring States bamboo slips, the methods of deciphering ancient characters have also changed. It has become especially important to identify the "word" that a character represents by resolving phonetic loan relationships, in which one character is written in place of another with the same or a similar sound. This requires experts to annotate phonetic loans carefully so that, after repeated training, the model acquires a feel for loan usage. The model is powerless, however, when it meets a usage that has never been attested before. Phonologists therefore need to step in and study the rules governing phonetic loans, so that the model grasps both the "instances" and the "rules" that a valid loan requires.
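Combining "instances" with "rules" can be pictured as a two-step check: look the pair up among the annotated examples first, and fall back on a phonological rule when the pair has never been seen. The sketch below uses placeholder characters and sound classes, and its rule (identical initial class and rhyme group) is a deliberate oversimplification assumed for illustration, not the actual conditions phonologists apply.

```python
from typing import Dict, NamedTuple, Set, Tuple


class Reading(NamedTuple):
    initial: str  # placeholder initial-consonant class
    rhyme: str    # placeholder rhyme group


# Annotated loan pairs (written character, intended character): all placeholders.
ATTESTED: Set[Tuple[str, str]] = {("char_A", "char_B")}

# Placeholder phonological data for each character.
READINGS: Dict[str, Reading] = {
    "char_A": Reading("k", "R1"),
    "char_B": Reading("k", "R1"),
    "char_C": Reading("k", "R1"),
    "char_D": Reading("m", "R2"),
}


def loan_status(written: str, intended: str) -> str:
    """Classify a proposed loan by instance lookup first, then by the simplified rule."""
    if (written, intended) in ATTESTED:
        return "attested"
    a, b = READINGS.get(written), READINGS.get(intended)
    if a is not None and b is not None and a == b:
        return "plausible by rule"
    return "unsupported"


STATUS_SEEN = loan_status("char_A", "char_B")    # found among the annotated instances
STATUS_UNSEEN = loan_status("char_C", "char_A")  # never annotated, but the rule allows it
STATUS_BAD = loan_status("char_D", "char_A")     # neither attested nor allowed by the rule
```

The second case is the one the paragraph is concerned with: a model trained only on instances has nothing to say about an unattested pair, while the rule still lets it be proposed as a candidate for an expert to judge.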

The author believes that the combination of AI and paleography falls into three stages. The first stage is manual modeling. Paleographers must prepare the basic data, including interpreting the source materials, cropping plate images, decomposing glyphs, and labeling attributes; computer scientists then use these data to implement the functions. This stage is the most time-consuming, laborious, and difficult. In the second stage, AI assists experts with the technologies already implemented. This may take the form of time-saving and labor-saving data retrieval, helpful guidance on the direction of thinking, or intelligent recommendation of research results. At this stage, experts also correct the shortcomings of the technology. The third stage is independent judgment by AI, which integrates all of the above to answer a question on its own. Current research in this field appears to be in the first stage.

The discussion above has focused on AI and the decipherment of ancient characters. In fact, AI can help researchers in other ways as well, such as rejoining oracle bone fragments, grouping and classifying oracle bone inscriptions, and dating bronzes and bamboo slips. It is foreseeable that AI will assist the study of ancient characters in still more areas. Paleography is often described as a discipline that is both ancient and young. With the arrival of AI, its "youth" also shows in its research methods, which can be combined with the latest technology. New materials and new methods can be expected to keep paleography young and full of vitality.

Editor in charge: Wang Yan
