Let X and Y be two random variables with joint distribution p(x, y) and marginal distributions p(x) and p(y). Their mutual information I(X; Y) is the relative entropy (KL divergence) between the joint distribution p(x, y) and the product of the marginal distributions p(x)p(y) [2], i.e.

I(X; Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}
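This definition can be evaluated directly from a joint probability table. A minimal sketch in Python, using a hypothetical 2×2 joint distribution (the numbers are illustrative, not from the text):

```python
import math

# Hypothetical joint distribution p(x, y) for binary X and Y.
joint = {
    (0, 0): 0.4, (0, 1): 0.1,
    (1, 0): 0.1, (1, 1): 0.4,
}

# Marginal distributions p(x) and p(y).
px = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}
py = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1)}

# I(X; Y) = sum_{x,y} p(x,y) * log( p(x,y) / (p(x) p(y)) ), here in bits.
mi = sum(p * math.log2(p / (px[x] * py[y]))
         for (x, y), p in joint.items() if p > 0)
print(round(mi, 4))  # ≈ 0.2781 bits
```

Terms with p(x, y) = 0 contribute nothing to the sum and are skipped to avoid log(0).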
[Figure: diagram relating H(X), H(Y), and I(X; Y)]

Properties of mutual information
For any random variables X and Y, the mutual information I(X; Y) satisfies:

- Symmetry: I(X; Y) = I(Y; X).
- Nonnegativity: I(X; Y) ≥ 0, with equality if and only if X and Y are independent.
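The nonnegativity property and the independence condition can be checked numerically. A small sketch with two hypothetical distributions, one independent and one perfectly correlated:

```python
import math

def mutual_information(joint):
    """I(X; Y) in bits, given a joint distribution as a dict {(x, y): p}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# Independent variables: p(x, y) = p(x) p(y), so I(X; Y) = 0.
independent = {(x, y): 0.5 * 0.5 for x in (0, 1) for y in (0, 1)}
print(abs(mutual_information(independent)) < 1e-12)  # True

# Perfectly correlated fair bits: I(X; Y) = H(X) = 1 bit.
correlated = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(correlated))  # 1.0
```

In every case the result stays at or above zero, as the property requires.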
Average mutual information is defined not in terms of two particular messages but from the overall perspective of the random variables X and Y, observing the problem in an average sense; this is why average mutual information can never be negative. In other words, when extracting information about one event from another, the worst case is zero: knowing one event never increases the uncertainty of the other.
If X → Y → Z forms a Markov chain, then I(X; Y) ≥ I(X; Z) (the data processing inequality).
The traditional mutual information between a word t and a category Ci is defined as:

MI(t, C_i) = \log \frac{P(t, C_i)}{P(t)\, P(C_i)}
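In text classification this quantity is commonly estimated from document counts. A sketch under that assumption (the function name and the counts are hypothetical; a, b, c denote documents in/outside the category containing or lacking the term):

```python
import math

def term_category_mi(a, b, c, n):
    """Estimate MI(t, Ci) from document counts:
    a = docs in Ci containing t, b = docs outside Ci containing t,
    c = docs in Ci not containing t, n = total number of docs.
    MI(t, Ci) = log( P(t, Ci) / (P(t) * P(Ci)) )
              ~ log( a * n / ((a + c) * (a + b)) )."""
    return math.log((a * n) / ((a + c) * (a + b)))

# Hypothetical counts: t appears in 80 of the 100 docs of Ci,
# but only in 20 of the 900 docs outside Ci.
score = term_category_mi(a=80, b=20, c=20, n=1000)
print(score > 0)  # True: t is strongly associated with Ci
```

A positive score means t occurs in Ci more often than independence would predict, which is exactly the feature-selection assumption described below.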
Mutual information is a common model-analysis method in computational linguistics; it measures the mutual dependence between two objects. In filtering problems it is used to measure the discriminative power of a feature. The definition of mutual information is closely related to that of cross entropy [2]. Mutual information was originally a concept in information theory, used to express the relationship between pieces of information; it is a measure of the statistical correlation of two random variables. Feature extraction based on mutual information rests on the following assumption: terms that occur with high frequency in a given category but with low frequency in the other categories have more mutual information with that category. Mutual information is therefore usually used as the measure of association between feature words and categories: if a feature word belongs to a category, their mutual information is largest. Because this method makes no assumptions about the nature of the relationship between feature words and categories, it is well suited to matching features and categories in text classification [2]. Mutual information is closely related to the multinomial log-likelihood ratio test and to Pearson's chi-squared test [3].

Several definitions of information itself are relevant here:

- Information is the marking of matter, energy, information, and their attributes (the inverse Wiener definition of information).
- Information is an increase in certainty (the inverse Shannon definition of information).
- Information is the set of identifications of the phenomena of things and their attributes.
Meaning of mutual information
Mutual information in information theory arises as follows. Generally, a channel contains noise and interference. The source sends a message x; after it passes through the channel, the receiver may only receive a version y deformed by the interference. After the destination receives y, it infers the probability that the source sent x; this process is described by the posterior probability p(x | y). Correspondingly, the probability p(x) that the source sends x is called the prior probability. We define the logarithm of the ratio of the posterior probability of x to its prior probability as the mutual information of y about x (mutual information for short) [4]:

I(x; y) = \log \frac{p(x \mid y)}{p(x)}

According to the chain rule of entropy,

H(X, Y) = H(X) + H(Y \mid X) = H(Y) + H(X \mid Y)
Therefore,

H(X) - H(X \mid Y) = H(Y) - H(Y \mid X)
This difference is called the mutual information of X and Y and is denoted I(X; Y).
Expanding according to the definition of entropy, we get:

I(X; Y) = H(X) - H(X \mid Y) = H(X) + H(Y) - H(X, Y) = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)}
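The identity I(X; Y) = H(X) + H(Y) − H(X, Y) can be checked numerically against the relative-entropy form of the definition. A sketch using a hypothetical joint distribution:

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a distribution {outcome: p}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Hypothetical joint distribution for binary X and Y.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
px, py = {}, {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p
    py[y] = py.get(y, 0.0) + p

# Entropy form: I(X; Y) = H(X) + H(Y) - H(X, Y).
mi_via_entropy = entropy(px) + entropy(py) - entropy(joint)

# Relative-entropy form: sum p(x,y) log( p(x,y) / (p(x) p(y)) ).
mi_direct = sum(p * math.log2(p / (px[x] * py[y]))
                for (x, y), p in joint.items() if p > 0)

print(abs(mi_via_entropy - mi_direct) < 1e-9)  # True: the two forms agree
```

Both computations yield the same value, confirming that the expansion above is consistent with the definition given at the start of the article.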