Research Spotlight: Leveraging survey methodology to find the “missing data” in COVID-19 infection rates

When the COVID-19 pandemic began escalating in March 2020, accurately predicting and tracking infection rates immediately became a serious challenge because of substantial gaps in key information. A major issue is selection bias, or “missing data,” in reported infection rates. Governments and healthcare systems rely on testing data to implement care, but that data is not inherently representative of actual infection rates because of inadequate testing.

To estimate prevalence at scale, researchers usually take a random sample of individuals who were tested for COVID-19 and see how many are positive. The problem with this approach is that individuals get tested because they either experience symptoms or have been a close contact of someone with COVID-19. As a result, the data is potentially skewed and misses asymptomatic patients.
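The gap between test positivity and true prevalence can be illustrated with a small calculation. The sketch below is not the researchers' method. It is a minimal, deterministic example in which every number (group sizes, testing probabilities) and every name (`Group`, `GROUPS`, `true_prevalence`, `expected_test_positivity`) is hypothetical, and in which tests are assumed to be perfectly accurate. It only shows how unequal testing rates across groups can distort the share of positive tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """A slice of a hypothetical population with its own chance of being tested."""
    name: str
    size: int          # number of people in the group
    infected: bool     # whether everyone in the group is currently infected
    test_prob: float   # assumed probability that a member gets tested


# Hypothetical population of 100,000 people; all figures are made up.
GROUPS = [
    Group("infected, symptomatic", 3_000, True, 0.60),
    Group("infected, asymptomatic", 2_000, True, 0.05),
    Group("uninfected, symptoms or close contact", 10_000, False, 0.30),
    Group("uninfected, no symptoms", 85_000, False, 0.02),
]


def true_prevalence(groups):
    """Share of the whole population that is infected."""
    total = sum(g.size for g in groups)
    infected = sum(g.size for g in groups if g.infected)
    return infected / total


def expected_test_positivity(groups):
    """Expected share of positive results among people who get tested.

    Assumes a perfectly accurate test, so every tested infected
    person is counted as a positive and no one else is.
    """
    tested = sum(g.size * g.test_prob for g in groups)
    positive = sum(g.size * g.test_prob for g in groups if g.infected)
    return positive / tested


if __name__ == "__main__":
    print(f"True prevalence:          {true_prevalence(GROUPS):.1%}")
    print(f"Expected test positivity: {expected_test_positivity(GROUPS):.1%}")
```

With these made-up inputs, 5% of the population is infected, yet roughly 29% of tests come back positive, because people with symptoms or a known exposure are far more likely to be tested. Asymptomatic infections make up 40% of all infections in this example but only about 5% of the positive tests.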

“How can we get a prevalence estimate that can be extrapolated to a population, whether a university or an entire country, if we can’t obtain a true, random sample?” asked Dr. Yajuan Si, a research assistant professor at the Institute for Social Research and the School of Public Health, and a faculty affiliate at the Michigan Institute for Data Science (MIDAS).

Si began focusing on this problem early in the pandemic, helping to start a working group with other MIDAS faculty focused on missing data and selection bias in these population studies. From this working group came a collaboration between Si and Dr. Jacob Fisher, who were awarded a MIDAS Propelling Original Data Science (PODS) grant in a special round of funding for COVID-19 research. Based on preliminary results from this PODS grant, they recently received a $2.3 million grant from the National Institutes of Health (NIH) to address selection bias.

The NIH funding will allow Si to further develop her methodology, which uses survey data to build metrics that estimate the true viral incidence and the prevalence of naturally acquired and vaccine-acquired immunity in the community, to examine health disparities and social inequality, and to monitor the epidemic over time as an operational surveillance system.

This grant is part of a larger program funded by the NIH, which brings together teams from 10 universities to pool knowledge and data. This consortium will work in tandem for the next four years, developing resources and interfaces that reflect on the COVID-19 pandemic and better inform future policy on how to respond should a new pandemic take shape.

“As a junior faculty member, having support from MIDAS was so valuable because it helped me find an important research direction,” Si said. “Both the PODS program and the working group directly allowed me the opportunity to continue my research with this significant grant award.”

If you are curious about learning more, Dr. Si recently penned an article about her research for the Washington Post.