University of Michigan researchers can access a compilation of tweets known as the “U.S. and India Politicians Dataset” without charge. Anmol Panda and Libby Hemphill in the School of Information collect and maintain the data and its associated metadata. Additional information about the collection process, the metadata provided, and how to use the data is available on GitHub. The dataset currently covers over 8,000 elected officials and candidates for public office from the U.S. and over 30,000 politicians and celebrities who talk about politics in India. The dataset is useful for studying political communication, political debate, campaigning, and political speech. MIDAS, CSCAR, and ARC together manage and support the use of this data repository. The dataset is currently available only to UM and sponsored affiliates.

Current Projects:
Twitter Style: What Congress’s Tweets Reveal About Their Communication Personas, Libby Hemphill and Anmol Panda
Candidates Efforts to Reach Latino Voters in the 2020 U.S. Election, Libby Hemphill and Anmol Panda


Coderspaces Office Hours – Free analytical consulting

HPC Training Videos – Training videos on how to use the Great Lakes platform and other resources

Decahose with Great Lakes (GitHub) – Tutorial for using Twitter data with PySpark on the Great Lakes HPC cluster

Decahose Filter (GitHub) – Tutorial for using the command-line interface with batch jobs