Elle O’Brien

My research focuses on building infrastructure that helps public health and health science research organizations take advantage of cloud computing, strong software engineering practices, and MLOps (machine learning operations). By equipping biomedical research groups with tools that facilitate automation, better documentation, and portable code, we can improve the reproducibility and rigor of science while scaling up the kinds of data collection and analysis that are possible.

Research topics include:
1. Open source software and cloud infrastructure for research,
2. Software development practices and conventions that work for academic units, like labs or research centers, and
3. The organizational factors that encourage best practices in reproducibility, data management, and transparency

The practice of science is a tug of war between competing incentives: the drive to do a lot fast, and the need to generate reproducible work. As data grows in size, code increases in complexity, and the number of collaborators and institutions involved goes up, it becomes harder to preserve all the “artifacts” needed to understand and recreate your own work. Both technical and cultural solutions will be needed to keep data-centric research rigorous, shareable, and transparent to the broader scientific community.
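A minimal sketch of what preserving those research “artifacts” can look like in practice, assuming a simple file-based Python workflow; the file names, parameters, and the record_run_manifest helper are hypothetical illustrations, not tools from this research program:

```python
import hashlib
import json
import subprocess
from pathlib import Path

def file_hash(path: Path) -> str:
    """Content hash of a file, so changed inputs or outputs are detectable later."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def record_run_manifest(inputs, params, outputs, manifest="run_manifest.json"):
    """Write a small provenance record: input hashes, parameters, code version, output hashes."""
    record = {
        "inputs": {p: file_hash(Path(p)) for p in inputs},
        "params": params,
        # Assumes the analysis lives in a git repository.
        "git_commit": subprocess.run(
            ["git", "rev-parse", "HEAD"], capture_output=True, text=True
        ).stdout.strip(),
        "outputs": {p: file_hash(Path(p)) for p in outputs},
    }
    Path(manifest).write_text(json.dumps(record, indent=2))

# Hypothetical usage:
# record_run_manifest(["data/cohort.csv"], {"model": "logistic", "seed": 7}, ["results/fit.pkl"])
```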

View MIDAS Faculty Research Pitch, Fall 2021

 

Lia Corrales

My PhD research focused on identifying the size and mineralogical composition of interstellar dust, from X-ray imaging of dust scattering halos to X-ray spectroscopy of bright objects to study absorption from intervening material. Over the course of my PhD I also developed an open-source, object-oriented approach to computing the extinction properties of particles in Python that allows the user to change the scattering physics models and composition properties of dust grains very easily. In many cases, the signal I look for from interstellar dust requires evaluating the observational data at the 1-5% level. This has required me to develop a deep understanding of both the instrument and the counting statistics (because modern X-ray instruments are photon-counting tools).

My expertise led me to a postdoc at MIT, where I developed techniques to obtain high-resolution X-ray spectra from low surface brightness (high background) sources imaged with the Chandra X-ray Observatory High Energy Transmission Grating Spectrometer. I pioneered these techniques in order to extract and analyze the high-resolution spectrum of Sgr A*, our Galaxy’s central supermassive black hole (SMBH), producing a legacy dataset with a precision that will not be matched for decades. This dataset will be used to understand why Sgr A* is anomalously inactive, giving us clues to the connection between SMBH activity and galactic evolution. To publish the work, I developed an open-source software package, pyXsis (github.com/eblur/pyxsis), to model the low signal-to-noise spectrum of Sgr A* simultaneously with a non-physical parametric model of the background spectrum (Corrales et al., 2020).

As a result of my vocal advocacy for Python-compatible software tools and a modular approach to X-ray data analysis, I became Chair of HEACIT (short for “High Energy Astrophysics Codes, Interfaces, and Tools”), a new self-appointed working group of X-ray software engineers and early career scientists interested in developing tools for future X-ray observatories. We are working to identify science cases that high energy astronomers find difficult to support with current software libraries, provide a central, publicly available online forum for tutorials and discussion of those libraries, and develop a set of best practices for X-ray data analysis.

My research focus is now turning to exoplanet atmospheres, where I hope to measure absorption from molecules and aerosols in the UV. Using UM access to the Neil Gehrels Swift Observatory, I work to observe the dip in a star’s brightness caused by occultation (transit) by a foreground planet. Transit depths are typically <1%, and telescopes like Swift were not originally designed with transit measurements (i.e., this level of precision) in mind. As a result, this research depends strongly on robust methods of scientific inference from noisy datasets.
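To make that last point concrete, here is a minimal sketch (not the pyXsis or Swift analysis pipeline) of recovering a sub-1% transit depth from noisy synthetic photometry; the depth, duration, and noise level are invented values:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
t = np.linspace(-0.1, 0.1, 500)                        # days from mid-transit
true_depth, duration = 0.008, 0.06                     # 0.8% dip, hypothetical values
flux = 1.0 - true_depth * (np.abs(t) < duration / 2)   # box-shaped transit
flux += rng.normal(0.0, 0.003, t.size)                 # photometric noise

def box_transit(t, baseline, depth):
    # Simplest possible transit model: a box of known duration.
    return baseline - depth * (np.abs(t) < duration / 2)

popt, pcov = curve_fit(box_transit, t, flux, p0=[1.0, 0.005])
print(f"fitted depth = {popt[1]:.4f} +/- {np.sqrt(pcov[1, 1]):.4f}")
```

Even in this toy version, the depth uncertainty comes directly from the noise model, which is why careful treatment of instrument systematics and counting statistics matters at the <1% level.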

As a graduate student, I attended some of the early “Python in Astronomy” workshops. While there, I wrote Jupyter Notebook tutorials that helped launch the Astropy Tutorials project (github.com/astropy/astropy-tutorials), which expanded into Learn Astropy (learn.astropy.org), for which I am a lead developer. Since then, I have also become a leader within the larger Astropy collaboration: I have helped develop the Astropy Project governance structure, hired maintainers, organized workshops, and maintained an AAS presence for the Astropy Project and NumFOCUS (the non-profit umbrella organization that works to sustain open source software communities in scientific computing) for the last several years. As a woman of color in a STEM field, I work to clear a path by teaching the skills I have learned along the way to others from underrepresented groups in STEM. This year I piloted WoCCode (Women of Color Code), an online network and webinar series for women from minoritized backgrounds to share expertise and support each other in contributing to open source software communities.

Sardar Ansari

I build data science tools to address challenges in medicine and clinical care. Specifically, I apply signal processing, image processing, and machine learning techniques, including deep convolutional and recurrent neural networks and natural language processing, to aid the diagnosis, prognosis, and treatment of patients with acute and chronic conditions. In addition, I conduct research on novel approaches to representing clinical data and on combining supervised and unsupervised methods to improve model performance and reduce the labeling burden. Another active area of my research is the design, implementation, and utilization of novel wearable devices for non-invasive patient monitoring in the hospital and at home. This includes integrating the information measured by wearables with the data available in electronic health records, including medical codes, waveforms, and images, among others. A further line of my research applies linear, non-linear, and discrete optimization and queuing theory to build new solutions for healthcare logistics planning, including stochastic approximation methods to model complex systems such as dispatch policies for emergency systems with multi-server dispatches, variable server load, multiple priority levels, etc.
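As one illustration of the queuing side of this work (a toy sketch, not a calibrated emergency-response model), the following simulates a multi-server dispatch queue with two priority levels; all rates, the server count, and the simulation horizon are hypothetical:

```python
import heapq
import random

random.seed(1)
NUM_SERVERS, SIM_TIME = 3, 10_000.0
ARRIVAL_RATE = {1: 0.15, 2: 0.25}        # priority 1 = urgent calls, per unit time
SERVICE_RATE = 0.25                      # per server (exponential service times)

events = []                              # min-heap of (time, kind, priority)
for prio, lam in ARRIVAL_RATE.items():
    t = 0.0
    while True:
        t += random.expovariate(lam)
        if t >= SIM_TIME:
            break
        heapq.heappush(events, (t, "arrival", prio))

busy = 0
waiting = {1: [], 2: []}                 # arrival times of queued calls, by priority
waits = {1: [], 2: []}

while events:
    now, kind, prio = heapq.heappop(events)
    if kind == "arrival":
        if busy < NUM_SERVERS:           # a unit is free: dispatch immediately
            busy += 1
            waits[prio].append(0.0)
            heapq.heappush(events, (now + random.expovariate(SERVICE_RATE), "done", prio))
        else:
            waiting[prio].append(now)
    else:                                # a unit frees up: serve the highest-priority queued call
        busy -= 1
        for p in (1, 2):
            if waiting[p]:
                waits[p].append(now - waiting[p].pop(0))
                busy += 1
                heapq.heappush(events, (now + random.expovariate(SERVICE_RATE), "done", p))
                break

for p in (1, 2):
    print(f"priority {p}: mean wait = {sum(waits[p]) / len(waits[p]):.2f}")
```

Varying the arrival rates, server count, and priority discipline in such a simulation is one simple way to compare dispatch policies before committing to an analytical model.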

Xianglei Huang

Prof. Huang specializes in satellite remote sensing, atmospheric radiation, and climate modeling. Optimization, pattern analysis, and dimensionality reduction are used extensively in his research for explaining observed spectrally resolved infrared spectra, estimating geophysical parameters from such hyperspectral observations, and deducing human influence on the climate in the presence of natural variability of the climate system. His group has also developed a data-driven, deep-learning solar forecasting model for use in the renewable energy sector.
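A minimal sketch of the dimensionality-reduction step on synthetic spectra (the channel count, wavenumber range, and spectral “modes” below are invented for illustration, not real satellite radiances):

```python
import numpy as np

rng = np.random.default_rng(0)
n_scenes, n_channels = 500, 2000                 # hypothetical hyperspectral sounder
wavenumber = np.linspace(650, 2750, n_channels)  # cm^-1, illustrative range

# Fake spectra: a few broad "geophysical" modes with random weights, plus noise.
modes = np.stack([np.sin(wavenumber / s) for s in (120.0, 310.0, 77.0)])
coeffs = rng.normal(size=(n_scenes, modes.shape[0]))
spectra = coeffs @ modes + 0.05 * rng.normal(size=(n_scenes, n_channels))

# PCA via SVD of the mean-centered data matrix.
centered = spectra - spectra.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
explained = (S ** 2) / np.sum(S ** 2)
k = int(np.searchsorted(np.cumsum(explained), 0.99) + 1)
scores = centered @ Vt[:k].T                     # compressed representation of each scene
print(f"{k} components capture 99% of the variance; scores shape = {scores.shape}")
```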

Andrew Brouwer

Andrew uses mathematical and statistical modeling to address public health problems. As a mathematical epidemiologist, he works on a wide range of topics (mostly related to infectious diseases and to cancer prevention and survival) using an array of computational and statistical tools, including mechanistic differential equations and multistate stochastic processes. Rigorous consideration of parameter identifiability, parameter estimation, and uncertainty quantification is an underlying theme in Andrew’s work.
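A minimal sketch of that mechanistic-modeling workflow on synthetic data (the SIR model, rates, and noise level here are hypothetical, chosen only to illustrate least-squares estimation of a transmission rate):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def sir(t, y, beta, gamma):
    # Classic SIR model with proportions S, I, R.
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]

def infected_curve(beta, gamma, t_obs, y0):
    sol = solve_ivp(sir, (t_obs[0], t_obs[-1]), y0, args=(beta, gamma), t_eval=t_obs)
    return sol.y[1]

rng = np.random.default_rng(0)
t_obs = np.arange(0, 60, 2.0)
y0 = [0.99, 0.01, 0.0]
true_beta, gamma = 0.35, 0.1                      # gamma treated as known here
data = infected_curve(true_beta, gamma, t_obs, y0) * (1 + 0.1 * rng.normal(size=t_obs.size))

fit = least_squares(lambda b: infected_curve(b[0], gamma, t_obs, y0) - data, x0=[0.2])
print(f"estimated beta = {fit.x[0]:.3f} (true {true_beta})")
```

Fixing gamma and estimating only beta keeps this toy problem identifiable; assessing that kind of identifiability before estimation is part of the theme described above.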

Mithun Chakraborty

My broad research interests are in multi-agent systems, computational economics and finance, and artificial intelligence. I apply techniques from algorithmic game theory, statistical machine learning, decision theory, and related fields to a variety of problems at the intersection of the computational and social sciences. A major focus of my research has been the design and analysis of market-making algorithms for financial markets and, in particular, prediction markets: incentive-based mechanisms for aggregating data, in the form of private beliefs about uncertain events (e.g., the outcome of an election), distributed among strategic agents. I use both analytical and simulation-based methods to investigate the impact of factors such as wealth, risk attitude, and manipulative behavior on information aggregation in market ecosystems. Another line of work I am pursuing involves algorithms for allocating resources based on preference data collected from potential recipients while satisfying efficiency, fairness, and diversity criteria; a notable example is my joint work on ethnicity quotas in Singapore public housing allocation. More recently, I have become involved in research on empirical game-theoretic analysis, a family of methods for building tractable models of complex, procedurally defined games from empirical or simulated payoff data and using them to reason about game outcomes.
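As a concrete illustration of a prediction-market maker, here is a minimal sketch of the standard logarithmic market scoring rule (LMSR), used as a stand-in rather than the specific mechanisms studied in this research; the liquidity parameter and trade sizes are arbitrary:

```python
import numpy as np

class LMSRMarketMaker:
    def __init__(self, n_outcomes, b=100.0):
        self.q = np.zeros(n_outcomes)   # outstanding shares per outcome
        self.b = b                      # liquidity parameter

    def cost(self, q):
        return self.b * np.log(np.sum(np.exp(q / self.b)))

    def prices(self):
        """Current prices; they sum to 1 and can be read as aggregated beliefs."""
        e = np.exp(self.q / self.b)
        return e / e.sum()

    def buy(self, outcome, shares):
        """Charge a trader the cost difference for buying `shares` of one outcome."""
        new_q = self.q.copy()
        new_q[outcome] += shares
        fee = self.cost(new_q) - self.cost(self.q)
        self.q = new_q
        return fee

mm = LMSRMarketMaker(n_outcomes=2)
print("initial prices:", mm.prices())                     # [0.5, 0.5]
print("cost of 50 shares of outcome 0:", round(mm.buy(0, 50), 2))
print("updated prices:", mm.prices())                     # shifted toward outcome 0
```

The liquidity parameter b controls how quickly prices move in response to trades, which is one of the levers such mechanisms give a market designer.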

Catherine Hausman

Catherine H. Hausman is an Associate Professor in the School of Public Policy and a Research Associate at the National Bureau of Economic Research. She uses causal inference, related statistical methods, and microeconomic modeling to answer questions at the intersection of energy markets, environmental quality, climate change, and public policy.

Recent projects have looked at inequality and environmental quality, the natural gas sector’s role in methane leaks, the impact of climate change on the electricity grid, and the effects of nuclear power plant closures. Her research has appeared in the American Economic Journal: Applied Economics, the American Economic Journal: Economic Policy, the Brookings Papers on Economic Activity, and the Proceedings of the National Academy of Sciences.

Annette Ostling

Biodiversity in nature can be puzzlingly high in the light of competition between species, which arguably should eventually result in a single winner. The coexistence mechanisms that allow for this biodiversity shape the dynamics of communities and ecosystems. My research focuses on understanding the mechanisms of competitive coexistence, how competition influences community structure and diversity, and what insights observed patterns of community structure might provide about competitive coexistence.

I am interested in the use and development of data science approaches to draw insights regarding coexistence mechanisms from the structural patterns of ecological communities with respect to species’ functional traits, relative abundance, spatial distribution, and phylogenetic relatedness, and from how those patterns change as community dynamics proceed. I am also interested in the use of maximum likelihood and Bayesian approaches for fitting demographic models to forest census data sets; these models can then be used to quantitatively assess the role of different competitive coexistence mechanisms.
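A minimal sketch of the maximum-likelihood side of that workflow on simulated census data (the size-dependent mortality model, its parameter values, and the sample size are hypothetical, for illustration only):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

rng = np.random.default_rng(0)
dbh = rng.lognormal(mean=3.0, sigma=0.6, size=2000)        # stem diameters (cm)
true_a, true_b = -1.0, -0.8
p_die = expit(true_a + true_b * np.log(dbh))               # smaller stems die more often
died = rng.binomial(1, p_die)                              # survival over a census interval

def neg_log_lik(theta):
    a, b = theta
    p = expit(a + b * np.log(dbh))
    return -np.sum(died * np.log(p) + (1 - died) * np.log(1 - p))

fit = minimize(neg_log_lik, x0=[0.0, 0.0])
print("MLE (a, b):", np.round(fit.x, 2), " true:", (true_a, true_b))
```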

Thomas Schmidt

The current goal of our research is to learn enough about the physiology and ecology of microbes and microbial communities in the gut that we are able to engineer the gut microbiome to improve human health. The first target of our engineering is the production of butyrate – a common fermentation product of some gut microbes that is essential for human health. Butyrate is the preferred energy source for mitochondria in the epithelial cells lining the gut and it also regulates their gene expression.

One of the most effective ways to influence the composition and metabolism of the gut microbiota is through diet. In an interventional study, we have tracked responses in the composition and fermentative metabolism of the gut microbiota in >800 healthy individuals. Emerging patterns suggest several configurations of the microbiome that can result in increased production of butyrate. We have isolated the microbes that form an anaerobic food web converting dietary fiber to butyrate and continue to make discoveries about their physiology and interactions. Based on these results, we have initiated a clinical trial in which we hope to prevent the development of Graft versus Host Disease following bone marrow transplants by managing butyrate production by the gut microbiota.

We are also beginning to track hundreds of other metabolites from the gut microbiome that may influence human health. We use metagenomes and metabolomes to identify patterns that link the microbiota with their metabolites, and then test those models in human organoids and in gnotobiotic mice colonized with synthetic communities of microbes. This blend of wet-lab research in basic microbiology, data science, and ecology is moving us closer to engineering the gut microbiome to improve human health.
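As a small illustration of that pattern-finding step (simulated relative abundances and a simulated metabolite, not our study data), one could screen taxa for association with butyrate like this:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_samples, n_taxa = 200, 50
abundance = rng.dirichlet(np.ones(n_taxa), size=n_samples)   # relative abundances per sample

# Simulate butyrate driven by two "producer" taxa plus noise.
butyrate = 5.0 * abundance[:, 3] + 3.0 * abundance[:, 17] + rng.normal(0, 0.02, n_samples)

results = []
for j in range(n_taxa):
    rho, p = spearmanr(abundance[:, j], butyrate)
    results.append((j, rho, p))

# Report the taxa most strongly associated with the metabolite.
for j, rho, p in sorted(results, key=lambda r: abs(r[1]), reverse=True)[:5]:
    print(f"taxon {j:2d}: Spearman rho = {rho:+.2f}, p = {p:.1e}")
```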

Ranjan Pal

Cyber-security is a complex and multi-dimensional research field. My research style is an interdisciplinary investigation, rooted primarily in economics, econometrics, data science (AI/ML and Bayesian and frequentist statistics), game theory, and network science, of socially pressing issues that affect the quality of cyber-risk management in modern networked and distributed engineering systems such as IoT-driven critical infrastructures, cloud-based service networks, and app-based systems (e.g., mobile commerce, smart homes), to name a few. I take delight in proposing data-driven, rigorous, and interdisciplinary solutions both to existing fundamental challenges that pose a practical bottleneck to (cost-)effective cyber-risk management and to futuristic cyber-security and privacy issues that might plague modern (networked) engineering systems. I strive for originality, practical significance, and mathematical rigor in my solutions. One of my primary end goals is to conceptually get my arms around complex, multi-dimensional information security and privacy problems in a way that helps, informs, and empowers practitioners and policy makers to take the right steps in making cyber-space more secure.