The accuracy of school indicators depends on the reliability of the measure used and the size and representativeness of the sample of children that completes the test or survey.
Reliability refers to the consistency of a measurement process. In the case of a measure of student anxiety, for example, would the results be the same if the survey were administered at different times of the day or on different days of the week? Instruments tend to be more reliable when they include a larger sample of questions or observations and when there are prescribed methods for collecting data, coding responses, and assigning scores. Reliability is usually assessed with a coefficient that ranges from 0 to 1.0, with 1.0 being perfectly reliable. Most of the measures used in OurSCHOOL have reliability coefficients ranging from 0.80 to 0.95, which is suitable for guiding school policy and practice.
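To make the idea of a reliability coefficient concrete, the sketch below computes Cronbach's alpha, one widely used coefficient for multi-item survey scales. The text above does not say which coefficient OurSCHOOL reports, so the choice of alpha is an assumption made for illustration. The function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per student.

    Illustrative only: alpha is one common reliability coefficient, not
    necessarily the one used for any particular OurSCHOOL measure.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    # Variance of each item (column) across students.
    item_variance_sum = sum(variance(column) for column in zip(*responses))
    # Variance of each student's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return (n_items / (n_items - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical data: five students answering four questions on a 0-4 scale.
example_responses = [
    [3, 4, 3, 4],
    [1, 2, 1, 1],
    [4, 4, 3, 4],
    [2, 2, 2, 3],
    [0, 1, 1, 0],
]
print(round(cronbach_alpha(example_responses), 2))
```

Values closer to 1.0 indicate that the questions behave consistently with one another. When every question yields the same score for each student, alpha equals 1.0.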
The accuracy of school indicators also depends on the size of the sample. For most measures used in OurSCHOOL, the results are reasonably accurate once 30 students have completed the survey. With a larger sample, 100 students for example, the results are slightly more accurate. The size of the sampling error is a function of 1/√n, so increasing the sample size yields diminishing returns in reducing the error. The same argument applies to drill-downs: the reported estimates are reasonably accurate for most measures as long as at least 30 student responses are used. In OurSCHOOL the minimum number of responses required for reporting a result is set at 5, so drill-down results for small groups are less accurate.
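The diminishing returns from larger samples can be illustrated with a short calculation. This is a minimal sketch, not the method OurSCHOOL uses: it assumes a simple random sample, a measure reported as a percentage of students, a true value of 50 percent (the worst case for sampling error), and a 95 percent confidence level. The function name `margin_of_error` is hypothetical.

```python
import math


def margin_of_error(n, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from n responses.

    Assumes simple random sampling; z=1.96 corresponds to 95% confidence.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of responses")
    return z * math.sqrt(proportion * (1 - proportion) / n)


# Compare the reporting minimum (5), the suggested minimum (30), and larger samples.
for n in (5, 30, 100, 400):
    print(f"n = {n:>3}: plus or minus {100 * margin_of_error(n):.0f} percentage points")
```

Under these assumptions the margin of error is roughly 44 percentage points with 5 students, 18 with 30, 10 with 100, and 5 with 400. Because the error shrinks with the square root of the sample size, halving it requires four times as many students.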
Results can also be inaccurate if the sample of students that completes the survey is not representative of the school population. This is called bias. For example, if students who are absent on the day the survey is administered tend to be from lower socioeconomic backgrounds, then the sample is not representative of the full school population. If this is the case, the results are likely to be biased. Schools that do ‘classroom sampling’ rely on the representativeness of the classes chosen, but if there is any streaming in the school, then the results for a particular month can be biased.
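The effect of absenteeism on a school result can be sketched with simple arithmetic. The example below is illustrative only: the two groups, their shares of the school population, their average scores, and their attendance rates are all hypothetical, as are the function names.

```python
def population_mean(groups):
    """True school average; groups holds (share, mean_score, response_rate)."""
    return sum(share * mean for share, mean, _ in groups)


def respondent_mean(groups):
    """Average among students who actually completed the survey."""
    # Each group's weight among respondents is its share times its response rate.
    weights = [share * rate for share, _, rate in groups]
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("no students responded")
    weighted_sum = sum(w * mean for w, (_, mean, _) in zip(weights, groups))
    return weighted_sum / total_weight


# Hypothetical school: 30% of students are from lower socioeconomic backgrounds,
# score lower on the measure on average, and are more likely to be absent.
school = [
    (0.30, 6.0, 0.80),  # lower socioeconomic group: share, mean score, response rate
    (0.70, 7.0, 0.95),  # all other students
]
bias = respondent_mean(school) - population_mean(school)
print(f"bias in the reported average: {bias:+.3f}")
```

In this hypothetical school the reported average is higher than the true average because the group with lower scores is under-represented among respondents. If both groups respond at the same rate, the bias disappears regardless of how many students are absent.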
Given these factors, schools should try to mitigate bias as much as possible by following up with students who were absent when the survey was administered. Reported measures are most accurate with a sample of 100 or more students, but a sample of 30 provides a reasonable estimate for most purposes. Results for small samples, especially drill-downs, need to be interpreted with caution.