My research focuses on developing and applying machine learning tools to large-scale financial and unstructured (textual) data to extract, quantify, and predict the risk profiles and investment-grade ratings of private and public companies. Example datasets include social media and financial aggregators such as Bloomberg, Pitchbook, and Privco.
Jon Zelner, PhD, is Assistant Professor in the Department of Epidemiology at the University of Michigan School of Public Health. Dr. Zelner holds a second appointment in the Center for Social Epidemiology and Population Health.
Dr. Zelner’s research is focused on using spatial analysis, social network analysis, and dynamic modeling to prevent infectious diseases, with a focus on tuberculosis and diarrheal disease. He is also interested in understanding how the social and biological causes of illness interact to generate observable patterns of disease in space and in social networks, across outcomes ranging from infection to mental illness.
The goal of my research is to leverage network analysis techniques to uncover how the brain mediates sex hormone influences on gendered behavior across the lifespan. Specifically, my data science research concerns the creation and application of person-specific connectivity analyses, such as unified structural equation models, to time series data; these are intensive longitudinal data, including functional neuroimages, daily diaries, and observations. I then use these data science methods to investigate the links between androgens (e.g., testosterone) and estradiol at key developmental periods, such as puberty, and behaviors that typically show sex differences, including aspects of cognition and psychopathology.
Kai S. Cortina, PhD, is Professor of Psychology in the College of Literature, Science, and the Arts at the University of Michigan, Ann Arbor.
Prof. Cortina’s research centers on understanding children’s and adolescents’ pathways into adulthood and the role of the educational system in this process. Academic and psychosocial development is examined from a life-span perspective, relying exclusively on longitudinal data that span long periods (e.g., from middle school to young adulthood). The hierarchical structure of the school system (student/classroom/school/district/state/nation) requires statistical tools that can handle this kind of nested data.
Mingyan Liu, PhD, is Professor of Electrical Engineering and Computer Science, College of Engineering, at the University of Michigan, Ann Arbor.
Prof. Liu’s research interests lie in optimal resource allocation, sequential decision theory, online and machine learning, and the performance modeling, analysis, and design of large-scale, decentralized, stochastic, and networked systems, using tools including stochastic control, optimization, game theory, and mechanism design. Her most recent research activities involve sequential learning, modeling and mining of large-scale Internet measurement data concerning cyber security, and incentive mechanisms for inter-dependent security games. Within this context, her research group is actively working on the following directions.
1. Cyber security incident forecast. The goal is to predict an organization’s likelihood of having a cyber security incident in the near future using a variety of externally collected Internet measurement data, some of which capture active maliciousness (e.g., spam and phishing/malware activities) while others capture more latent factors (e.g., misconfiguration and mismanagement). While machine learning techniques have been used extensively for detection in the cyber security literature, they have rarely been used for prediction. This is the first study on the prediction of broad categories of security incidents at the organizational level. Our work to date shows that with the right choice of feature set, highly accurate predictions can be achieved with a forecasting window of 6 to 12 months. Given the increasing number of high-profile security incidents (Target, Home Depot, JP Morgan Chase, and Anthem, to name just a few) and the social and economic costs they inflict, this work will have a major impact on cyber security risk management.
2. Detecting propagation in temporal data, with application to identifying phishing activities. Phishing activities propagate from one network to another in a highly regular fashion, a phenomenon known as fast-flux, though how the destination networks are chosen by the malicious campaign remains unknown. An interesting challenge is whether community detection methods can automatically extract the networks involved in a single phishing campaign; the ability to do so would be critical to forensic analysis. While there have been many results on detecting communities defined as subsets of relatively strongly connected entities, phishing activity exhibits a unique propagating property that is better captured by an epidemic model. By combining epidemic modeling and regression, we can identify this type of propagating community with reasonable accuracy; we are working on alternative methods as well.
3. Data-driven modeling of organizational and end-user security posture. We are working to build models that accurately capture the cyber security postures of end users as well as organizations, using large quantities of Internet measurement data. One domain concerns how software vendors disclose security vulnerabilities in their products, how they deploy software upgrades and patches, and, in turn, how end users install these patches; together, these elements lead to a better understanding of the overall state of vulnerability of a given machine and how it relates to user behavior. Another domain concerns the interconnectedness of today’s Internet, which implies that what we see from one network is inevitably related to others. We use this connection to gain better insight into the conditions of not just a single network viewed in isolation, but multiple networks viewed together.
Using GIS, visual analytics, and spatiotemporal modeling, Dr. Rybarczyk examines the utility of Big Data for gaining insight into the causal mechanisms that influence travel patterns and urban dynamics. In particular, his research sets out to provide a fuller understanding of “what” and “where” micro-scale conditions affect human sentiment and hence wayfinding ability, movement patterns, and travel mode-choices.
Recent works: Rybarczyk, G. and S. Banerjee. (2015) Visualizing active travel sentiment in an urban context, Journal of Transport and Health, 2(2): 30
Nils G. Walter, PhD, is the Francis S. Collins Collegiate Professor of Chemistry, Biophysics and Biological Chemistry, College of Literature, Science, and the Arts and Professor of Biological Chemistry, Medical School, at the University of Michigan, Ann Arbor.
Nature and Nanotechnology likewise employ nanoscale machines that self-assemble into structures of complex architecture and functionality. Fluorescence microscopy offers a non-invasive tool to probe and ultimately dissect and control these nanoassemblies in real time. In particular, single-molecule fluorescence resonance energy transfer (smFRET) allows us to measure distances at the 2-8 nm scale, whereas complementary super-resolution localization techniques based on Gaussian fitting of imaged point spread functions (PSFs) measure distances in the 10 nm and longer range. In terms of Big Data Analysis, we have developed a method for the intracellular single molecule, high-resolution localization and counting (iSHiRLoC) of microRNAs (miRNAs), a large group of gene silencers with profound roles in our body, from stem cell development to cancer. Microinjected, singly fluorophore-labeled, functional miRNAs are tracked at super-resolution within individual diffusing particles. Observed changes in mobility and mRNA-dependent assembly suggest the existence of two kinetically distinct assembly processes. We are currently feeding these data into a single-molecule systems biology pipeline to bring into focus the unifying molecular mechanism of such a ubiquitous gene regulatory pathway. In addition, we are using cluster analysis of smFRET time traces to show that large RNA processing machines such as single spliceosomes – responsible for the accurate removal of all intervening sequences (introns) in pre-messenger RNAs – are working as biased Brownian ratchet machines. On the opposite end of the application spectrum, we utilize smFRET and super-resolution fluorescence microscopy to monitor enhanced enzyme cascades and nanorobots engineered to self-assemble and function on DNA origami.
Professor Owen-Smith conducts research on the collective dynamics of large-scale networks and their implications for scientific and technological innovation and surgical care. He is the executive director of the Institute for Research on Innovation and Science (IRIS, http://iris.isr.umich.edu). IRIS is a national consortium of research universities that share data and infrastructure to support research aimed at understanding, explaining, and eventually improving the public value of academic research and research training.