Yongsheng Bai

Dr. Bai’s research interests lie in the development and refinement of bioinformatics algorithms, software, and databases for next-generation sequencing (NGS) data; the development of statistical models for solving biological problems; and the bioinformatics analysis of clinical data. Other topics include, but are not limited to, uncovering disease genes and variants using informatics approaches, computational analysis of cis-regulation and comparative motif finding, large-scale genome annotation, comparative “omics”, and evolutionary genomics.

Ginger Shultz

The Shultz group uses data science methods in two primary ways: (1) to investigate student placement in introductory chemistry courses and (2) to analyze student texts to provide instructors with actionable intelligence about student learning. Using regression discontinuity, we investigated the impact of taking general chemistry prior to organic chemistry on student performance and persistence in later chemistry courses. We found that students who took general chemistry first benefitted by 1/4 of a letter grade but were no more likely to persist. A continued investigation using survey and interview methods indicated that this was related to academic skills rather than content preparation. Through the MWrite project, we have collected a large corpus of student texts and are developing automated text analysis methods to glean information about student learning across disciplines, with a specific focus on scientific reasoning.

Network representation of writing moves made by students in argumentative writing, with relevant transition probabilities. The size of each node represents the relative frequency of operation use, and the edge labels represent the transition probabilities, with key transitions highlighted in orange.

James R. Hines Jr.

Professor Hines’ research focuses on the analysis of the donative behavior of Americans, and how it affects the intergenerational and interpersonal transmission of economic well-being. To what extent do parents leave property to their children and others, and how is this behavior affected by legal institutions, taxes, social norms, and other considerations? While there are no comprehensive sources of data on wills, trusts, lifetime gifts, and other forms of property transmission, there is ample information available from legal documents that, with the help of natural language processing, can hopefully be coded and analyzed in a systematic way.

Somangshu Mukherji

Somangshu (Sam) Mukherji, PhD, is Assistant Professor of Music Theory in the School of Music, Theatre & Dance at the University of Michigan, Ann Arbor.

Sam Mukherji’s work lies at the interface of traditional Western tonal theory, the theory and practice of popular and non-Western idioms, and the cognitive science of music. Within this framework, the main focus of his research has been on the prolongational, grammatical aspects of Western tonality, and their connection to the tonal structures of Indian music and the blues-based traditions within rock and metal. This emphasis makes his work similar to that of a linguist who explores relationships between the world’s languages, and Mukherji’s research has therefore been influenced in particular by ideas from linguistic theory, especially the Minimalist Program in contemporary generative linguistics. For this reason, he has investigated connections not only between different musical idioms but also between music and language, and between musical and linguistic theory, more generally. Much of his work explores overlaps between Minimalist linguistics and related generative approaches within music theory (such as those found in the writings of Heinrich Schenker), and he has also written extensively about what such ‘musicolinguistic’ connections imply for the wider study of human musical behavior, cognition, and evolution.

Rocio Titiunik

Prof. Titiunik’s research interests lie primarily in quantitative methodology for the social sciences, with emphasis on quasi-experimental methods for causal inference and political methodology. She is particularly interested in the application and development of non-experimental methods for the study of political institutions, a methodological agenda that is motivated by her substantive interests in democratic accountability and the role of party systems in developing democracies. Some of her current projects include the application of web scraping and text analysis tools to measure political phenomena.

Derek Harmon

My research focuses on the intended and unintended consequences of language in financial markets. I examine this relationship across a number of contexts, such as the Federal Reserve, initial public offerings, and mergers and acquisitions. More broadly, my work aims to develop new theoretical and methodological approaches to understand the role of language in society.

Peter Adriaens

My research focuses on the development and application of machine learning tools to large-scale financial and unstructured (textual) data to extract, quantify, and predict the risk profiles and investment-grade ratings of private and public companies. Example datasets include social media and financial aggregators such as Bloomberg, Pitchbook, and Privco.

V. G. Vinod Vydiswaran

V. G. Vinod Vydiswaran, PhD, is Assistant Professor in the Department of Learning Health Sciences with a secondary appointment in the School of Information at the University of Michigan, Ann Arbor.

Dr. Vydiswaran’s research focuses on developing and applying text mining, natural language processing, and machine learning methodologies for extracting relevant information from health-related text corpora. This includes extracting medically relevant information from clinical notes and biomedical literature, and studying the information quality and credibility of online health communication (via health forums and tweets). His previous work includes developing novel information retrieval models to assist clinical decision making, modeling information trustworthiness, and addressing the vocabulary gap between health professionals and laypersons.

Andrew Grogan-Kaylor

My core intellectual interest is the way in which parenting behaviors, like the use of physical punishment or parental expressions of emotional warmth, affect child outcomes like aggression, antisocial behavior, anxiety, and depression, and how these dynamics play out across contexts, neighborhoods, and cultures. A lot of my work is done with international samples. In my work I use statistical models, like multilevel models and some econometric models, and software like Stata, R, HLM, and ArcGIS, to examine things like growth and change over time, or community, school, or parent effects on children and families. I have emerging interests in text mining and natural language processing.

Visualization of multilevel modeling using High School and Beyond data set.


Steven J. Katz

Dr. Katz’s research addresses cancer treatment communication, decision-making, and quality of care. His work examines how precision medicine presents itself in the exam room via provider and patient communication and shared decision-making. Dr. Katz leads the Cancer Surveillance and Outcomes Research Team (CanSORT), an interdisciplinary research program centered at the University of Michigan and focused on population and intervention studies of the quality of care and outcomes of cancer detection and treatment in diverse populations. Dr. Katz and CanSORT have been collaborating with Surveillance, Epidemiology, and End Results (SEER) cancer registries since 2002 to study breast cancer treatment decision-making at the population level. We obtain patient clinical and demographic information from SEER and combine this with surveys of patients and physicians to create comprehensive data sets that enable us to study testing and treatment trends and the challenges of individualizing treatments for breast cancer patients. In 2015 we added a new dimension to our research by partnering with evaluative testing firms to obtain tumor genomic and germline genetic test results for over 30,000 breast and ovarian cancer patients in the states of California and Georgia. We are also pursuing insurance claims data to assist with our analysis of physician network effects.

Steven Katz, MD discusses BRCA and multigene sequence testing at the labs of Ambry Genetics.
