One of the challenges facing social scientists is that our understanding of how social and political processes operate, and what their consequences are, has lost some of its predictive power, as illustrated by our failure to predict election outcomes. This raises the question of whether theories and models developed in the past (among a different generation living in a different cultural and technological setting) still apply in the current environment. At the same time, the abundance of online social media data gives social scientists great opportunities to understand today’s social and political phenomena. To take advantage of these opportunities, however, important issues concerning how to process and use social media data must be addressed. These issues include whether social media users are representative of the population at large, whether they are honest and open, and whether the collection and processing of data are sufficiently unbiased and accurate to support inferences about populations.
The research team will carry out several parallel projects with the unifying theme of integrating geospatial and social media data to address research and methodological questions. The first project examines communication patterns and their effects on political choices and behavior in the 2016 presidential election. The second project investigates communication online and on Twitter about parenting information and misinformation. The third project will investigate a variety of methodological issues associated with inferences drawn from probability-based and nonprobability-based social surveys and from social media. The three projects will cross-validate survey data, social media data, and administrative records, and will investigate the social network dynamics of elites and the general public. The research team will develop procedures for extracting meaning from large collections of text and connecting it with public attitudes about important political and policy issues of the day. The team will also develop visualization techniques for dimensionality reduction, while expanding upon existing systems for data mining and statistical inference.
The project is a collaboration among researchers from multiple units at the University of Michigan and at Georgetown University, and the team will also engage researchers at Gallup. This set of projects will become the focus of multidisciplinary efforts among social scientists, computer scientists, and statisticians at both institutions, and each university will become a locus for future extended work of this kind. The data science tools developed through these projects will also be widely applicable to other research questions in social science.