sampling survey

[chōu yàng diào chá]
Mathematical concept
A sampling survey is a non-comprehensive survey: some units are selected from all the survey objects and investigated, and the results are used to estimate and infer the characteristics of the whole population. Although a sampling survey is not a comprehensive survey, its purpose is to obtain information that reflects the overall situation, so it can also serve the role of a comprehensive survey. By the way the sample is selected, sampling surveys are divided into two types: probability sampling and non-probability sampling. Probability sampling selects the sample according to the random principle, based on probability theory and mathematical statistics, estimates and infers certain characteristics of the population quantitatively, and controls the possible error in a probabilistic sense. Probability sampling is customarily called sampling survey.
Chinese name
sampling survey
Foreign name
sampling survey
A non-comprehensive survey
Applied discipline
Mathematics, statistics

Brief introduction of sampling survey

A sampling survey is a kind of non-comprehensive survey. It is a statistical analysis method that draws part of the actual data from the population according to the principle of randomness and uses probability estimation to calculate the corresponding quantitative indicators of the population from the sample data. Although a sampling survey is not a comprehensive survey, its purpose is to obtain information that reflects the situation of the population, so it can also serve the role of a comprehensive survey.
According to the sampling method, sampling surveys are divided into two types: probability sampling and non-probability sampling. Probability sampling is based on the principles of probability theory and mathematical statistics: it selects the sample from the population under study according to the random principle, estimates and infers certain characteristics of the population quantitatively, and controls the possible error of the inference in a probabilistic sense. In China, probability sampling is traditionally called sampling survey. [1]


A sampling survey selects some individuals from the population under study as a sample, surveys them, and infers the numerical characteristics of the population from the results. It is economical, effective, adaptable and accurate.
A sampling survey is a statistical survey method that infers population totals for the characteristic under study from the results of surveying only part of the population; it belongs to the category of non-comprehensive surveys. Based on scientific principles and calculation, it draws some sample units from a population made up of many units, investigates and observes them, and uses the data obtained on the surveyed characteristic to represent and infer the population.
Like other surveys, sampling surveys are subject to errors and biases. There are generally two kinds of error in a sampling survey: working error (also called registration error or survey error) and representative error (also called sampling error). A sampling survey can keep the representative error within an allowable range through sampling design, calculation and a series of scientific methods. In addition, because fewer units are surveyed, the sample is strongly representative and fewer investigators are involved, the working error is smaller than in a comprehensive survey. Especially when the population contains a very large number of units, the results of a sampling survey are often more accurate than those of a comprehensive survey. The results of a well-designed sampling survey can therefore be highly reliable.
Sampling survey data can be used to represent and estimate the population because a sampling survey has characteristics that other non-comprehensive surveys lack, mainly:
(1) The survey sample is drawn according to the random principle, so every unit in the population has an equal chance of being selected. This helps ensure that the selected units are evenly distributed in the population, avoids selection bias, and makes the sample highly representative.
(2) All sample units together are treated as a "delegation", and the whole "delegation" represents the population, rather than individual selected units representing the whole.
(3) The number of survey samples is determined by scientific calculation according to the required survey error, so the sample size has a reliable basis.
(4) The sampling error can be calculated before the survey from the number of survey samples and the degree of difference between the units in the population, and can be kept within the allowable range, so the accuracy of the survey results is high.
Based on the above characteristics, the sampling survey is recognized as the most complete and scientifically grounded of the non-comprehensive survey methods for estimating and representing the population. [1]
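Points (3) and (4) above say that the sample size is calculated from the allowable error and the variation between units. A minimal Python sketch of one common textbook formula for estimating a mean, n0 = (z * sigma / e)^2, with an optional finite population correction, could look like this. The function name, the default z = 1.96 (roughly 95% confidence) and the example numbers are illustrative assumptions, not taken from the source.

```python
import math


def required_sample_size(std_dev, margin, z=1.96, population_size=None):
    """Sample size needed to estimate a population mean within +/- margin.

    std_dev         -- assumed standard deviation between population units
    margin          -- allowable error (same unit as the measured variable)
    z               -- standard normal quantile for the confidence level
    population_size -- if given, apply the finite population correction
    """
    n0 = (z * std_dev / margin) ** 2
    if population_size is not None:
        # Correction for sampling without replacement from a finite population.
        n0 = n0 / (1 + n0 / population_size)
    return math.ceil(n0)


# Hypothetical example: standard deviation 10, allowable error 2.
print(required_sample_size(10, 2))
print(required_sample_size(10, 2, population_size=1000))
```

A larger spread between units or a smaller allowable error both push the required sample size up, which is the trade-off points (3) and (4) describe.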


The general steps of a sampling survey are as follows:
(1) Define the population
(2) Develop the sampling frame
(3) Conduct the sampling survey and estimate the population
(4) Divide the population
(5) Determine the sample size
(6) Determine the sampling method
(7) Determine the reliability and validity of the survey [1]

Scope of application

First, things that cannot be comprehensively surveyed. Some measurements or tests are destructive, so a comprehensive survey is impossible. Examples are shock-resistance tests of televisions and lifetime tests of light bulbs.
Second, things that could be comprehensively surveyed in theory but not in practice. Examples are finding out how many trees there are in a forest, or the living conditions of employees' families.
Third, the sampling survey method can be used for quality control in industrial production processes.
Fourth, sampling inference can be used to test a hypothesis about a population, to judge whether the hypothesis is true and to decide whether to accept or reject it. [1]



Probability sampling

1. Random sampling - simple random sampling method
This is the simplest one-step sampling method. Sampling units are drawn from the population in such a way that every possible sample has the same probability of being selected. To draw the sample, the units of the sampling population are numbered 1 to n, random numbers between 1 and n are obtained from a random number table or a dedicated computer program, and the units whose numbers match the random numbers form the sample.
This sampling method is simple and its error analysis is easy, but it requires a larger sample size, and it is suited to situations where the differences between individuals are small.
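The numbering-and-random-number procedure just described can be sketched in Python with the standard library. The function name and the `seed` parameter are illustrative assumptions; `random.Random.sample` draws without replacement, so every subset of the requested size is equally likely.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw unit numbers from 1..population_size without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(range(1, population_size + 1), sample_size)


# Hypothetical example: 10 units out of a population numbered 1 to 100.
print(simple_random_sample(100, 10, seed=1))
```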
2. Random sampling - systematic sampling
This method, also known as sequential sampling, starts from a randomly chosen point in the population and takes a unit at fixed intervals ("every k-th unit"). Its advantages are that the sample is well spread across the population, it is theoretically well founded, and the population estimate is easy to calculate.
3. Random sampling - stratified sampling
The population is divided into several homogeneous, non-overlapping strata (layers) according to certain characteristics, and a probability sample is then drawn independently from each stratum. Stratified sampling uses auxiliary information to form the strata; each stratum should be internally homogeneous, and the differences between strata should be as large as possible. Stratified sampling can improve the representativeness of the sample and the precision of the population estimate, and the sampling plan is relatively convenient to operate and manage. However, the sampling frame is more complex, the cost is higher, and the error analysis is also more complex. The method is suited to populations that are complex, large, and made up of individuals that differ widely.
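As an illustration of drawing independently from each stratum, here is a minimal Python sketch that uses proportional allocation, meaning each stratum contributes in proportion to its share of the population. Proportional allocation is only one possible allocation rule and is an assumption here; the source does not name one. The function and stratum names are hypothetical.

```python
import random


def stratified_sample(strata, sample_size, seed=None):
    """Proportionally allocated stratified sample.

    strata -- dict mapping a stratum label to the list of its units
    Returns a dict mapping each label to the units drawn from that stratum.
    Because of rounding, the total may differ slightly from sample_size.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for label, units in strata.items():
        # At least one unit per stratum, never more than the stratum holds.
        k = max(1, round(sample_size * len(units) / total))
        k = min(k, len(units))
        sample[label] = rng.sample(units, k)  # independent draw per stratum
    return sample


# Hypothetical population: 600 urban and 400 rural households.
strata = {"urban": list(range(600)), "rural": list(range(600, 1000))}
drawn = stratified_sample(strata, 50, seed=0)
print({label: len(units) for label, units in drawn.items()})
```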
4. Random sampling - cluster sampling
In cluster sampling, the units of the population are first grouped into clusters, either by their nature or as the survey requires; in a traffic survey, for example, they can be grouped by geographical characteristics. Clusters are then selected at random, and all units in the selected clusters are surveyed. Because the sample is relatively concentrated, the cost of the survey is reduced. For example, in a survey of residents' travel, residents can be grouped by residential area and some areas then selected at random as the sample. The advantage of this method is that it is simple to organize; the disadvantage is that the sample is less representative.
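The residential-area example can be sketched in Python as follows. The cluster names and the function name are hypothetical; the point is only that whole clusters are drawn at random and every unit inside a chosen cluster is surveyed.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and return every unit inside them.

    clusters -- dict mapping a cluster name to the list of its units
    Returns (chosen cluster names, all units in those clusters).
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical residential areas with numbered households.
areas = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}
print(cluster_sample(areas, 2, seed=0))
```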
5. Random sampling - multi-stage sampling method
Multi-stage sampling draws the sample in two or more consecutive stages and is generally an unequal-probability method. The sampling units are hierarchical, and the units at each stage differ in structure. The sample of a multi-stage design is concentrated, which saves time and money, but the survey is complex to organize and the population estimate is complex to calculate.
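A two-stage design is the simplest case of multi-stage sampling. The following Python sketch assumes simple random sampling at both stages, which is an illustrative choice rather than something the source specifies; the names are hypothetical.

```python
import random


def two_stage_sample(clusters, n_primary, n_secondary, seed=None):
    """Stage 1: draw primary units (clusters). Stage 2: draw units within each.

    clusters -- dict mapping a primary unit name to the list of its units
    Returns a dict mapping each chosen primary unit to its sampled units.
    """
    rng = random.Random(seed)
    primary = rng.sample(sorted(clusters), n_primary)
    result = {}
    for name in primary:
        units = clusters[name]
        # Small primary units are taken in full.
        result[name] = rng.sample(units, min(n_secondary, len(units)))
    return result


# Hypothetical districts containing numbered households.
districts = {"A": list(range(10)), "B": list(range(10, 13)), "C": list(range(20, 40))}
print(two_stage_sample(districts, 2, 5, seed=3))
```

Because primary units of different sizes contribute the same number of second-stage units here, individual households end up with unequal selection probabilities, which is one way the "unequal probability" character arises.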
6. Random sampling - equidistant sampling
Equidistant sampling, also known as systematic sampling or mechanical sampling, first arranges the units of the population in a certain order, determines the sampling interval from the required sample size, then chooses a starting point at random and takes one unit at each interval.
According to how the population units are arranged, equidistant sampling falls into three categories: ordering by a related characteristic, ordering by an unrelated characteristic, and ordering by natural state, which lies between the other two.
According to how the sampling is carried out, equidistant sampling falls into three categories: linear equidistant sampling, symmetric equidistant sampling and circular equidistant sampling.
The main advantage of equidistant sampling is that it is simple and easy to carry out. When something is known about the structure of the population, that information can be used to order the units before sampling, which improves sampling efficiency.
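Two of the variants named above, linear and circular equidistant sampling, can be sketched in Python as follows. The sketch assumes the units are already ordered and numbered 1 to N and uses the integer interval k = N // n; the function names are hypothetical. Symmetric equidistant sampling is not shown.

```python
import random


def linear_systematic_sample(population_size, sample_size, seed=None):
    """Random start within the first interval, then every k-th unit."""
    rng = random.Random(seed)
    k = population_size // sample_size      # sampling interval
    start = rng.randint(1, k)               # random start in 1..k
    return [start + i * k for i in range(sample_size)]


def circular_systematic_sample(population_size, sample_size, seed=None):
    """Random start anywhere, then every k-th unit, wrapping past the end."""
    rng = random.Random(seed)
    k = population_size // sample_size
    start = rng.randint(1, population_size)  # random start in 1..N
    return [(start - 1 + i * k) % population_size + 1 for i in range(sample_size)]


# Hypothetical example: 10 units from a list of 103.
print(linear_systematic_sample(103, 10, seed=2))
print(circular_systematic_sample(103, 10, seed=2))
```

The circular form treats the list as a ring, which gives every unit a chance of selection even when N is not an exact multiple of n.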
7. Random sampling - double sampling
Double sampling, also known as two-phase sampling, is a method in which samples are drawn twice. First a preliminary sample is drawn and some simple items are collected to obtain information about the population; then, on that basis, a further sample is drawn. In practice, double sampling can be extended to multiple sampling.
8. Random sampling - probability sampling proportional to size
Probability proportional to size sampling (PPS sampling) uses auxiliary information so that each unit is selected with a probability proportional to its size. Sample selection methods include the Hansen-Hurwitz method and the Lahiri method.
The main advantage of PPS sampling is that the auxiliary information reduces sampling error. The main disadvantages are the high demands it places on the auxiliary information and the complexity of variance estimation.
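To make "probability proportional to size" concrete, the following Python sketch draws a with-replacement PPS sample by the acceptance-rejection scheme usually attributed to Lahiri, and then estimates a population total with the estimator usually attributed to Hansen and Hurwitz. This is a sketch of those methods as commonly described in textbooks, not a definitive implementation; the function names and the toy data are hypothetical.

```python
import random


def lahiri_pps_sample(sizes, sample_size, seed=None):
    """Draw unit indices with replacement, probability proportional to size.

    Each trial picks a unit uniformly and accepts it with probability
    size / max(sizes); rejected trials are simply repeated.
    """
    rng = random.Random(seed)
    max_size = max(sizes)
    chosen = []
    while len(chosen) < sample_size:
        i = rng.randrange(len(sizes))
        if rng.random() * max_size < sizes[i]:
            chosen.append(i)
    return chosen


def hansen_hurwitz_total(values, sizes, chosen):
    """Estimate the population total as the mean of y_i / p_i, p_i = size_i / total size."""
    total_size = sum(sizes)
    return sum(values[i] * total_size / sizes[i] for i in chosen) / len(chosen)


# Hypothetical data: the survey variable is exactly twice the size measure.
sizes = [10, 20, 30, 40]
values = [20, 40, 60, 80]
chosen = lahiri_pps_sample(sizes, 5, seed=7)
print(chosen, hansen_hurwitz_total(values, sizes, chosen))
```

In the toy data the survey variable is exactly proportional to size, so the estimate equals the true total for any sample; this is the extreme case of auxiliary information reducing sampling error.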
9. Non-random sampling - haphazard sampling
Survey units are chosen haphazardly (unlike random sampling, this does not guarantee that every unit has the same chance of being selected), as in surveys of customers at a counter or roadside intercept surveys.
10. Non-random sampling - key-point sampling
Only the key units are surveyed: units that are few in number but carry great weight in the population (their values account for a large share of the population total).
11. Non-random sampling - typical sampling
Select a number of representative units for research.
12. Non-random sampling - quota sampling
After the population has been classified and the sample size fixed, survey units are selected from each class of the population according to its quota. [1]

Non-probability sampling

Non-probability sampling is a method in which the investigator draws the sample according to convenience or subjective judgment.
Because the sample is not drawn strictly according to the principle of random sampling, the law of large numbers no longer applies. The sampling error therefore cannot be determined, and it is impossible to state correctly how well the sample statistics fit the population. Although the results of such a survey can still describe the nature and characteristics of the population to some extent, they cannot be used to infer the population quantitatively. [1]

Common terms

In sampling survey, the commonly used terms are:
1. Population
The population is the whole set of subjects to be studied. It is the collection of all subjects to be investigated for a given research purpose, and the subjects that make up the population are called population units.
2. Individual
An individual is each object of investigation in the population.
3. Sample
A sample is a part of a population. It is a collection of population units selected from the population according to a certain procedure.
4. Sample size
The number of individuals in the sample is called the sample size (sample capacity).
5. Sampling frame
The sampling frame is the frame used to represent the population and from which the sample is selected. It usually takes the form of a list or a map of all units in the population.
The sampling frame is an essential part of a sampling survey and has a considerable influence on inferences about the population.
6. Sampling ratio
The sampling ratio is the ratio of the number of sample units drawn to the number of population units.
In a sampling survey, how representative the sample is and how trustworthy the final estimate is depend first of all on the quality of the sampling frame.
7. Confidence
Confidence is also called reliability, confidence level or confidence coefficient. When a population parameter is estimated from a sample, the conclusion is always uncertain because of the randomness of the sample. A probabilistic statement is therefore used, namely the interval estimation method of mathematical statistics: what is the probability that the estimate lies within a given allowable error of the population parameter? That probability is called the confidence.
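The interval estimation idea can be illustrated with a short Python sketch that builds a confidence interval for a population mean. It assumes the normal approximation and ignores the finite population correction, both of which are simplifying assumptions; the function name and the data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(sample)
    # Two-sided quantile, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(sample) / math.sqrt(n)
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Hypothetical measurements from eight sampled units.
data = [48, 52, 50, 49, 51, 53, 47, 50]
print(mean_confidence_interval(data, 0.95))
print(mean_confidence_interval(data, 0.99))
```

Raising the confidence widens the interval: a stronger probability statement needs a larger allowable error for the same sample.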
8. Sampling error
In a sampling survey, an estimate computed from the sample is usually used to estimate a characteristic of the population, and an error arises when the two differ. Because the estimate changes with the sample selected, it often differs from the population value even when every observation is completely correct. This difference is caused purely by sampling, so it is called sampling error.
9. Bias
Bias, also called deviation, usually refers to the errors in a sampling survey that arise from causes other than sampling error.
10. Mean square error
When a sampling survey is used to estimate a population indicator, a sampling method must be adopted and an appropriate estimator chosen. Once the sampling method and the estimator are fixed, the mean of the squared deviations between the estimates and the population indicator is the mean square error. [1]
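For a very small population, the mean square error can be computed exactly by listing every possible sample. The Python sketch below assumes simple random sampling without replacement, so that all samples are equally likely, and takes the population mean as the indicator being estimated; the function name and the five-unit population are hypothetical.

```python
from itertools import combinations
from statistics import mean


def mean_square_error(population, sample_size, estimator=mean):
    """Average squared deviation of the estimator from the population mean,
    taken over every possible sample of the given size."""
    target = mean(population)
    estimates = [estimator(s) for s in combinations(population, sample_size)]
    return mean([(e - target) ** 2 for e in estimates])


# Hypothetical population of five units, samples of size two.
population = [2, 4, 6, 8, 10]
print(mean_square_error(population, 2))                  # sample mean as estimator
print(mean_square_error(population, 2, estimator=max))   # a deliberately biased estimator
```

The sample mean is unbiased here, so its mean square error consists of sampling variance only, while the sample maximum systematically overshoots and its mean square error is larger.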