University of Michigan researchers can access a compilation of tweets known as the “U.S. and India Politicians Dataset” free of charge. Anmol Panda (firstname.lastname@example.org) and Libby Hemphill (email@example.com) in the School of Information collect and maintain the data and its associated metadata. Additional information about the collection process, the metadata provided, and how to use the data is available on GitHub. The dataset currently includes tweets from over 8,000 elected officials and candidates for public office in the U.S. and from over 30,000 politicians and celebrities in India who talk about politics. It is useful for studying political communication, political debate, campaigning, and political speech. MIDAS, CSCAR, and ARC jointly manage and support the use of this data repository. The dataset is currently available only to UM and sponsored affiliates.
CoderSpaces Office Hours – Free analytical consulting
HPC Training Videos – Training videos on how to use the Great Lakes platform and other resources
Decahose with Great Lakes (GitHub) – Tutorial for using Twitter data with PySpark on the Great Lakes HPC cluster
Decahose Filter (GitHub) – Tutorial for using the command-line interface with batch jobs