Non-Sampling Risk
Non-sampling risk is the risk of error that arises during data collection, processing, and interpretation and that is unrelated to how the sample was selected from the population. Sampling risk comes from the uncertainty inherent in using a randomly selected sample rather than conducting a complete census of the population; non-sampling risk results from every other aspect of the survey process.
Several factors can contribute to non-sampling risk, including:
- Data Collection Errors: These can occur if the survey questions are confusing or misleading, if respondents do not answer truthfully, or if there are errors in recording the responses.
- Non-Response Errors: If certain groups of people are less likely to respond to the survey, and their answers would differ from those of the people who do respond, the results will be biased.
- Data Processing Errors: Mistakes can also be made in the coding, data entry, or analysis phases of the project.
- Model Specification Errors: If the statistical model used to interpret the data does not fit the actual underlying relationship between the variables, the results can be misleading.
Non-sampling risk can’t be reduced by increasing the sample size: a larger sample narrows sampling error but leaves these other errors in place. Instead, it’s mitigated through careful survey design, rigorous data collection practices, and accurate processing and analysis of the collected data.
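The point about sample size can be checked with a small simulation. The sketch below is illustrative only: the group shares, mean ratings, and response rates are hypothetical numbers, and the function names (draw_person, survey_estimate, run) are made up for this example. It models a population in which one group both rates the product lower and responds less often, then compares the survey estimate with the true population mean at several sample sizes.

```python
import random

# All numbers below are hypothetical, chosen only for illustration.
TRUE_MEAN = 0.7 * 4.0 + 0.3 * 2.0  # true population mean rating (3.4)


def draw_person(rng):
    """Return (rating, responds) for one randomly selected person."""
    if rng.random() < 0.7:
        # Younger group: 70% of the population, mean rating 4, 90% respond.
        return rng.gauss(4.0, 1.0), rng.random() < 0.9
    # Older group: 30% of the population, mean rating 2, only 30% respond.
    return rng.gauss(2.0, 1.0), rng.random() < 0.3


def survey_estimate(n, rng):
    """Mean rating among the people who respond, out of n people sampled."""
    people = (draw_person(rng) for _ in range(n))
    ratings = [rating for rating, responds in people if responds]
    return sum(ratings) / len(ratings)


def run(sizes=(100, 10_000, 200_000), seed=0):
    """Return {sample size: survey estimate} for each requested size."""
    rng = random.Random(seed)
    return {n: survey_estimate(n, rng) for n in sizes}


if __name__ == "__main__":
    for n, estimate in run().items():
        print(f"n={n:>7}  estimate={estimate:.3f}  error={estimate - TRUE_MEAN:+.3f}")
```

Under these assumed numbers the respondents' expected mean is 2.7 / 0.72 = 3.75, so as the sample grows the estimate settles near 3.75 rather than the true 3.4. The random scatter shrinks with larger samples, but the gap caused by non-response does not.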
Example of Non-Sampling Risk
Let’s consider a hypothetical scenario in which a research company surveys the public’s perception of a newly launched product.
- Data Collection Errors: The survey includes a question that is worded in a confusing manner, causing some respondents to misunderstand what is being asked. As a result, they provide answers that do not accurately reflect their perceptions, introducing non-sampling risk.
- Non-Response Errors: Suppose the survey is distributed online, and older individuals who are less comfortable with technology are underrepresented in the responses. If these individuals perceive the product very differently from those who did respond, the survey results will be biased, which is another instance of non-sampling risk.
- Data Processing Errors: During the data entry process, a researcher accidentally codes some of the “very positive” responses as “very negative.” This error would distort the findings, adding non-sampling risk.
- Model Specification Errors: The research company analyzes the data with a statistical model that assumes a linear relationship between age and perception of the product. In reality, the relationship isn’t linear; it might be U-shaped, for example, with younger and older individuals having more positive perceptions than middle-aged individuals. This incorrect model specification is another source of non-sampling risk.
In all these cases, the errors have nothing to do with the method used to select the sample from the population. Instead, they’re related to other aspects of the survey design and execution process, illustrating the concept of non-sampling risk.
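The model specification case can also be sketched in code. The example below is a minimal illustration under stated assumptions, not a recommended analysis: the data are simulated, the U-shape with its low point at age 45 is hypothetical, and r_squared and simulate are names invented for this sketch. It fits a straight line of perception on age, then fits the same kind of line on the squared distance from age 45 (a one-predictor stand-in for a curved model that assumes the turning point is known), and compares how much variation each explains.

```python
import random


def r_squared(x, y):
    """R^2 of a one-predictor least-squares line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    # For a single predictor, R^2 equals the squared correlation.
    return sxy ** 2 / (sxx * syy)


def simulate(seed=0):
    """Return (R^2 of the linear model, R^2 of the curved model)."""
    rng = random.Random(seed)
    ages = list(range(20, 71))
    # Hypothetical U-shaped truth: perception is lowest at age 45,
    # plus a little random noise.
    scores = [3.0 + 0.002 * (a - 45) ** 2 + rng.gauss(0, 0.1) for a in ages]
    linear = r_squared(ages, scores)
    curved = r_squared([(a - 45) ** 2 for a in ages], scores)
    return linear, curved


if __name__ == "__main__":
    linear, curved = simulate()
    print(f"linear model R^2: {linear:.3f}")
    print(f"curved model R^2: {curved:.3f}")
```

With data like these, the straight-line model explains almost none of the variation, which could wrongly suggest that age is unrelated to perception, while the curved model captures most of it. Collecting more responses would not fix the linear model's conclusion, because the error lies in the model's form rather than in the sample.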