Variance is a measure of the dispersion of a random variable or a set of data, used in both probability theory and statistics. In probability theory, variance measures the degree of deviation between a random variable and its mathematical expectation (i.e., its mean). In statistics, the variance (sample variance) is the average of the squared differences between each sample value and the mean of all sample values. In many practical problems, studying variance, that is, the degree of deviation, is of great significance.
Variance is a measure of the difference between the source data and the expected value.
The term "variance" was first introduced by Ronald Fisher in his paper "The Correlation Between Relatives on the Supposition of Mendelian Inheritance" [1].
Definition
Variance has different definitions and formulas in statistical description and probability distribution.
In statistical description, variance measures the difference between each variable (observation) and the population mean. The plain sum of deviations from the mean is always zero, and the sum of squared deviations grows with the sample size, so statistics uses the average of the squared deviations from the mean to describe the degree of variation of a variable. The population variance is calculated as:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where $\sigma^2$ is the population variance, $X$ is a variable, $\mu$ is the population mean, and $N$ is the total number of cases.
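As a minimal sketch of the population formula described above (the average squared deviation from the population mean), the following Python snippet computes it directly and compares it with the standard-library function `statistics.pvariance`. The data values and the helper name `population_variance` are made up for illustration and do not come from the source.

```python
from statistics import pvariance

def population_variance(values):
    """Mean of squared deviations from the population mean (divides by N)."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
sigma2 = population_variance(data)
library_sigma2 = pvariance(data)  # standard-library equivalent
```

Both computations divide by the total number of cases N, so they should agree on any data set.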
In practical work, when the population mean is difficult to obtain, sample statistics are used in place of the population parameters. After correction, the sample variance is calculated as [2]:
$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
where $S^2$ is the sample variance, $X$ is a variable, $\bar{X}$ is the sample mean, and $n$ is the number of samples.
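The corrected sample variance can be sketched in the same way. The snippet below divides by n - 1 instead of n and compares the result with the standard-library function `statistics.variance`, which applies the same correction. The sample values and the helper name `sample_variance` are assumptions chosen for illustration.

```python
import math
from statistics import variance

def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

sample = [3, 5, 7, 9]  # hypothetical sample
s2 = sample_variance(sample)
library_s2 = variance(sample)  # standard-library equivalent
```

For this sample the sum of squared deviations is 20 and n - 1 is 3, so both values are 20/3.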
In a probability distribution, let $X$ be a discrete random variable. If $E((X - E(X))^2)$ exists, then $E((X - E(X))^2)$ is called the variance of $X$, denoted $D(X)$, $\mathrm{Var}(X)$, or $DX$, where $E(X)$ is the expected value of $X$ and $X$ is the value of the variable [1]. In the formula, $E$ is the abbreviation of expected value, so the expression means the expected value of "the square of the difference between the value of a random variable and its expected value" [2]. The variance of a discrete random variable is calculated as:
$$D(X) = E\big((X - E(X))^2\big) = E(X^2) - (E(X))^2$$
Here $D(X)$ is called the variance of the variable $X$, and $\sqrt{D(X)}$ is called the standard deviation (or mean square deviation) of $X$. The standard deviation has the same dimension as $X$. The standard deviation is a statistic used to measure the dispersion of a group of data [3].
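For a discrete random variable, the definition can be evaluated exactly from the probability mass function. The sketch below uses a fair six-sided die as an assumed example (it is not in the source) and exact rational arithmetic, then takes the square root to obtain the standard deviation. The helper name `discrete_variance` is hypothetical.

```python
import math
from fractions import Fraction

def discrete_variance(pmf):
    """pmf maps each value x to its probability P(X = x)."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

var_die = discrete_variance(die)          # E((X - E(X))^2)
second_moment = sum(x * x * p for x, p in die.items())
mean_die = sum(x * p for x, p in die.items())
std_die = math.sqrt(var_die)              # same unit as X
```

Here `var_die` equals `second_moment - mean_die ** 2`, which is the identity D(X) = E(X²) - (E(X))² evaluated on this distribution.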
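For a continuous random variable $X$ with probability density function $f(x)$, the variance is calculated as [2]:
$$D(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$$
where $\mu = E(X)$.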
Variance describes the degree of dispersion of the values of a random variable about its mathematical expectation. (The greater the standard deviation and variance, the greater the degree of dispersion.) If the values of $X$ are relatively concentrated, the variance $D(X)$ is small; if the values of $X$ are scattered, the variance $D(X)$ is large. Therefore, $D(X)$ is a quantity that characterizes the dispersion of the values of $X$; it is a measure of that dispersion.
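To illustrate how variance tracks dispersion for a continuous random variable, the sketch below approximates the integral of (x - E(X))² times the density with a simple midpoint rule. The two uniform densities, the helper name `continuous_variance`, and the step count are assumptions chosen for illustration; the midpoint rule is only one of many possible numerical methods.

```python
def continuous_variance(pdf, a, b, steps=100_000):
    """Approximate D(X) for a density supported on [a, b] (midpoint rule)."""
    h = (b - a) / steps
    mids = [a + (i + 0.5) * h for i in range(steps)]
    mean = sum(x * pdf(x) for x in mids) * h
    return sum((x - mean) ** 2 * pdf(x) for x in mids) * h

# Uniform on [0, 1]: values concentrated in a short interval.
narrow = continuous_variance(lambda x: 1.0, 0.0, 1.0)
# Uniform on [0, 4]: values spread over a longer interval.
wide = continuous_variance(lambda x: 0.25, 0.0, 4.0)
```

A uniform distribution on an interval of length L has variance L²/12, so `narrow` should be close to 1/12 and `wide` close to 4/3: the more scattered distribution has the larger variance.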
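Note that $D(X) = 0$ does not imply that $X$ is identically equal to a constant: when $X$ is continuous, $X$ may take values different from the constant at any finite number of points.
5. $D(aX + bY) = a^2 D(X) + b^2 D(Y) + 2ab\,\mathrm{Cov}(X, Y)$.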
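Proof
1. For a constant $C$, $D(C) = E\{[C - E(C)]^2\} = 0$.
2. For a constant $C$, $D(CX) = E\{[CX - E(CX)]^2\} = C^2 E\{[X - E(X)]^2\} = C^2 D(X)$, and $D(X + C) = E\{[X + C - E(X + C)]^2\} = E\{[X - E(X)]^2\} = D(X)$.
3. $D(X + Y) = E\{[(X - E(X)) + (Y - E(Y))]^2\} = D(X) + D(Y) + 2E\{[X - E(X)][Y - E(Y)]\}$.
The third term on the right-hand side of the above formula is $2E\{[X - E(X)][Y - E(Y)]\} = 2[E(XY) - E(X)E(Y)]$.
If $X$ and $Y$ are independent of each other, then by the properties of mathematical expectation $E(XY) = E(X)E(Y)$, so this term is 0 and $D(X + Y) = D(X) + D(Y)$.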
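4. Sufficiency: if $P\{X = E(X)\} = 1$, then $P\{X^2 = [E(X)]^2\} = 1$, so $E(X^2) = [E(X)]^2$ and $D(X) = E(X^2) - [E(X)]^2 = 0$.
Necessity: argue by contradiction. A probability can never be greater than 1, so it is only necessary to consider whether $P\{X = E(X)\}$ is equal to or less than 1. If it were less than 1, $X$ would differ from $E(X)$ with positive probability, which would make $E\{[X - E(X)]^2\} > 0$ and contradict $D(X) = 0$.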
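The scaling and additivity rules discussed in the proof can be checked exactly on a small example. The sketch below assumes two independent fair dice (an illustration that is not in the source) and builds the distribution of their sum from the product of the individual probabilities. The helper name `variance` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def variance(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}

# Distribution of X + Y for independent dice: P(X=x, Y=y) = P(X=x) * P(Y=y).
total = {}
for (x, px), (y, py) in product(die.items(), repeat=2):
    total[x + y] = total.get(x + y, 0) + px * py

var_x = variance(die)
var_sum = variance(total)                                  # D(X + Y)
var_scaled = variance({3 * x: p for x, p in die.items()})  # D(3X)
var_shifted = variance({x + 10: p for x, p in die.items()})  # D(X + 10)
```

For independent X and Y the covariance term vanishes, so `var_sum` equals `2 * var_x`; multiplying by a constant scales the variance by its square, and adding a constant leaves it unchanged.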
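Finding the mathematical expectation and variance of the normal distribution
Let $X \sim N(\mu, \sigma^2)$; find $E(X)$ and $D(X)$.
Let $Z = \frac{X - \mu}{\sigma}$. Since $Z \sim N(0, 1)$, we have $X = \sigma Z + \mu$ with the known values $E(Z) = 0$ and $D(Z) = 1$, and thus
$$E(X) = E(\sigma Z + \mu) = \sigma E(Z) + \mu = \mu, \qquad D(X) = D(\sigma Z + \mu) = \sigma^2 D(Z) = \sigma^2.$$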
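The standardization argument can also be checked by simulation. The sketch below draws standard normal values Z, forms X = σZ + μ, and estimates the mean and variance of X. The parameter values, the seed, and the sample size are arbitrary assumptions, and a simulation only approximates the exact result.

```python
import random

mu, sigma = 10.0, 2.0      # hypothetical parameters
rng = random.Random(0)     # fixed seed so the run is repeatable

z = [rng.gauss(0.0, 1.0) for _ in range(100_000)]  # Z ~ N(0, 1)
x = [sigma * zi + mu for zi in z]                  # X = sigma * Z + mu

mean_x = sum(x) / len(x)
var_x = sum((xi - mean_x) ** 2 for xi in x) / len(x)
```

With a sample this large, `mean_x` should be close to μ and `var_x` close to σ², up to sampling error.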
Example
The true length of a part is known to be a. Two instruments, A and B, are each used to measure it 10 times. The measurement results X are represented by points on a coordinate axis, as shown in Figure 1:
Measurement results of instrument A: see Figure 1
Measurement results of instrument B: all equal to a
The mean of the measurement results of both instruments is a. However, if we use these results to evaluate the two instruments, we will clearly judge instrument B to perform better, because its measurement results are concentrated around the mean.
It is therefore necessary to study the degree of deviation between a random variable and its mean. How should this deviation be measured? It is easy to see that $E[|X - E[X]|]$ measures the deviation of a random variable from its mean $E(X)$. However, because this expression contains an absolute value, it is inconvenient to compute. Instead, the quantity $E[(X - E[X])^2]$ is normally used; this numerical characteristic is the variance.
Figure 1 Measurement Results
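The comparison of the two instruments can be made concrete with numbers. The readings below are hypothetical (the source gives them only in a figure); they are chosen so that both instruments have the same mean while instrument A is scattered and instrument B always reads the true length. The sketch computes both the mean absolute deviation and the variance.

```python
def mean(values):
    return sum(values) / len(values)

def mean_absolute_deviation(values):
    """Average of |x - mean|, the quantity E[|X - E[X]|] for equally likely data."""
    m = mean(values)
    return mean([abs(x - m) for x in values])

def variance(values):
    """Average of (x - mean)^2, the quantity E[(X - E[X])^2] for equally likely data."""
    m = mean(values)
    return mean([(x - m) ** 2 for x in values])

# Hypothetical readings in units of 0.1 mm; the true length a is 100.
instrument_a = [97, 98, 99, 99, 100, 100, 101, 101, 102, 103]
instrument_b = [100] * 10

mad_a, var_a = mean_absolute_deviation(instrument_a), variance(instrument_a)
mad_b, var_b = mean_absolute_deviation(instrument_b), variance(instrument_b)
```

Both measures rank instrument B as the less dispersed one; the variance avoids the absolute value and is therefore easier to manipulate algebraically.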
Formula
Variance is the average of the squared differences between the actual values and the expected value, and the standard deviation is the arithmetic square root of the variance. [5] In actual calculation, the following formula is used to calculate the variance.
Variance is the average of the squared differences between each data value and the mean, that is,
$$s^2 = \frac{1}{n}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2\right]$$
where $\bar{x}$ is the sample mean, $n$ is the number of samples, $x_i$ is an individual value, and $s^2$ is the variance.
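When $\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is used as an estimate of the variance of $X$, its mathematical expectation turns out to be not the variance of $X$ but $\frac{n-1}{n}$ times the variance of $X$. The mathematical expectation of $\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is the variance of $X$, so as an estimate of the variance of $X$ it is "unbiased". For this reason we always use $\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ to estimate the variance of $X$.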
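The bias of the divide-by-n estimator can be verified exactly on a toy case. The sketch below assumes a small hypothetical population whose values are equally likely, enumerates every ordered sample of size n drawn with replacement, and averages both estimators over all samples using exact fractions. The population, sample size, and helper names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]   # hypothetical values, each equally likely
n = 3                       # sample size

def mean(values):
    return Fraction(sum(values)) / len(values)

def sum_sq_dev(values):
    """Sum of squared deviations from the mean of `values`."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values)

true_var = sum_sq_dev(population) / len(population)

# Every ordered sample of size n drawn with replacement is equally likely.
samples = list(product(population, repeat=n))
expect_div_n = mean([sum_sq_dev(s) / n for s in samples])
expect_div_n_minus_1 = mean([sum_sq_dev(s) / (n - 1) for s in samples])
```

The average of the divide-by-n estimator comes out as (n - 1)/n times the true variance, while the divide-by-(n - 1) estimator averages to the true variance exactly.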
Variance is the degree of deviation from the center. It is used to measure the fluctuation of a batch of data (that is, how far the data deviate from their mean), is called the variance of this group of data, and is denoted $S^2$. For the same sample size, the larger the variance, the greater the fluctuation of the data and the less stable they are.
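The formula can be further derived as:
$$S^2 = \frac{x_1^2 + x_2^2 + \cdots + x_n^2}{n} - \bar{x}^2$$
where the $x_i$ are the values in this group of data, $\bar{x}$ is their mean, and $n$ is an integer greater than 0.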
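The derived form (mean of the squares minus the square of the mean) can be compared against the definition on a small data set. The data values and function names below are hypothetical; exact fractions are used so the two results can be compared without rounding error.

```python
from fractions import Fraction

def variance_by_definition(values):
    """Average squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / n

def variance_by_shortcut(values):
    """Mean of the squares minus the square of the mean."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return Fraction(sum(x * x for x in values), n) - mean ** 2

data = [3, 7, 7, 19]  # hypothetical data
by_definition = variance_by_definition(data)
by_shortcut = variance_by_shortcut(data)
```

With floating-point data the shortcut form can lose precision when the mean is large relative to the spread, so the definition-based form is often preferred numerically.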
Statistical significance
When the data are relatively scattered (that is, they fluctuate widely around the mean), the sum of the squared differences between each data value and the mean is large, and the variance is large. When the data are relatively concentrated, the sum of the squared differences between each data value and the mean is small, and the variance is small. Therefore, the greater the variance, the greater the fluctuation of the data; the smaller the variance, the smaller the fluctuation of the data. [6]
The average of the squared differences between the data in a sample and the sample mean is called the sample variance; the arithmetic square root of the sample variance is called the sample standard deviation. Both the sample variance and the sample standard deviation measure the fluctuation of a sample. The larger the sample variance or sample standard deviation, the greater the fluctuation of the sample data.
Variance and standard deviation are the most important and most commonly used indicators of dispersion. Variance is the average of the squared deviations of each variable value from the mean, and it is the most important measure of the dispersion of numerical data. The standard deviation is the arithmetic square root of the variance and is denoted $S$. The corresponding formula for the variance is:
$$S^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
The difference between the standard deviation and the variance is that the standard deviation is expressed in the same unit as the variable, which makes it easier to interpret than the variance. For this reason the standard deviation is used more often in analysis.
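The point about units can be seen by changing the unit of measurement. In the sketch below, hypothetical heights are expressed first in centimetres and then in metres; the standard-library functions `statistics.pstdev` and `statistics.pvariance` are used for the standard deviation and the variance.

```python
from statistics import pstdev, pvariance

heights_cm = [150, 160, 170, 180, 190]     # hypothetical, in centimetres
heights_m = [h / 100 for h in heights_cm]  # the same data in metres

sd_cm, var_cm = pstdev(heights_cm), pvariance(heights_cm)  # cm and cm^2
sd_m, var_m = pstdev(heights_m), pvariance(heights_m)      # m and m^2
```

Dividing the data by 100 divides the standard deviation by 100 but the variance by 10,000, because the variance is measured in squared units while the standard deviation shares the unit of the data.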
Recent developments
Variance not only expresses how far the sample deviates from its mean, but also reveals the degree of mutual fluctuation within the sample. Equivalently, variance can be understood as the expectation of the mutual fluctuation between sample values. This conclusion holds, of course, at the level of second-order statistical moments. [7]
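One common way to make the "mutual fluctuation" reading precise, offered here as an assumption about what the source intends, is the identity that the divide-by-n variance equals half the average squared difference over all ordered pairs of values. The sketch below checks that identity exactly on hypothetical data; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Average squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / n

def half_mean_squared_pairwise_difference(values):
    """(1 / (2 n^2)) * sum over all ordered pairs (i, j) of (x_i - x_j)^2."""
    n = len(values)
    total = sum((a - b) ** 2 for a, b in product(values, repeat=2))
    return Fraction(total, 2 * n * n)

data = [1, 2, 4, 9]  # hypothetical data
around_mean = variance(data)
between_pairs = half_mean_squared_pairwise_difference(data)
```

The pairwise form never refers to the mean, which is why variance can be read as a measure of how much the values differ from one another.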