U-M Annual Data Science & AI Summit 2023
Poster Session
#1
Fairness via Robust Machine Learning
Tiffany Parise, EECS, University of Michigan
Vinod Raman, Statistics, University of Michigan
Sindhu Kutty, EECS, University of Michigan
Machine learning models are increasingly deployed to aid decisions with significant societal impact. Defining and assessing the fairness of these models is therefore both important and urgent. One thread of research in machine learning (ML) aims to quantify the fairness of ML models using probabilistic metrics. To ascertain the fairness of a given model, many popular fairness metrics measure the difference in that model’s predictive power across different subgroups of a population, typically where one subgroup has historically been marginalized. A separate thread of research aims to construct robust ML models. Intuitively, robustness may be understood as the ability of a model to perform well even in the presence of noisy data. Typically, robust models are trained by intentionally introducing perturbations into the data. Our work aims to connect these two threads of research. We hypothesize that models trained to be robust are naturally more fair than those trained using standard empirical risk minimization. To what extent are fairness and robustness related? Do some notions of fairness and robustness have a stronger correlation than others? We investigate these questions empirically by setting up experiments to measure the relationship between these concepts. To study trade-offs between robustness, fairness, and nominal accuracy, we use a probabilistically robust learning framework (Robey et al., 2022) to train classifiers with varying levels of robustness on both real-world and synthetic datasets. We then use widely adopted statistical metrics (Barocas et al., 2019) to evaluate the fairness of these models. Preliminary results indicate that probabilistically robust learning reduces nominal accuracy but increases fairness with respect to the evaluated metrics. The significance of such a trade-off would be the conceptualization of fairness in terms of robustness and the ability to increase model fairness without explicitly optimizing for fairness.
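To make the evaluation concrete, here is a minimal sketch of two widely used group-difference fairness metrics of the kind described above (demographic parity and equal opportunity gaps); the function names and toy data are illustrative, not the authors’ code:

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two subgroups."""
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return abs(rates[0] - rates[1])

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rates between two subgroups."""
    tprs = []
    for g in np.unique(group):
        mask = (group == g) & (y_true == 1)
        tprs.append(y_pred[mask].mean())
    return abs(tprs[0] - tprs[1])

# Toy usage: binary predictions from some classifier, binary group membership.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
group = rng.integers(0, 2, 1000)
y_pred = rng.integers(0, 2, 1000)
print(demographic_parity_gap(y_pred, group))
print(equal_opportunity_gap(y_true, y_pred, group))
```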
#3
Towards explainable AI: a novel set of model-agnostic, local and global XAI metrics leveraging feature importance
Bernardo Modenesi, MIDAS, University of Michigan
Kleyton da Costa, Holistic AI
Cristian Muñoz, Holistic AI
Adriano Koshiyama, Holistic AI
This paper presents novel quantitative metrics to explain predictions from machine learning models. The metrics provide insights into model interpretability by leveraging feature importance estimates. In particular, the metrics quantify interpretability based on: 1) the distribution of global and local feature importances, assessing concentration versus spread; 2) the variability of feature impacts on model outputs; and 3) the complexity of feature interactions within model decisions. The metrics are model-agnostic and applicable to any trained classifier or regressor. Experiments on public datasets for credit risk prediction and real estate valuation demonstrate how the metrics facilitate a comprehensive understanding of model behavior. By summarizing feature importance insights into informative metrics, this work enables improved communication about model transparency between stakeholders; we also provide an open-source Python package implementing the metrics. The results highlight the ability of feature importance-based metrics to increase accountability and trust in artificial intelligence systems.
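As an illustration of a concentration-versus-spread metric of the kind proposed, the sketch below computes the Gini coefficient of a feature-importance vector; this is a generic example, not the paper’s metric set or package:

```python
import numpy as np

def importance_gini(importances):
    """Gini coefficient of a feature-importance vector:
    0 = importance spread evenly, values near 1 = concentrated in few features."""
    x = np.sort(np.asarray(importances, dtype=float))  # ascending
    n = len(x)
    cum = np.cumsum(x)
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

# Example: importances from any trained model (e.g., permutation importance).
print(importance_gini([0.25, 0.25, 0.25, 0.25]))  # 0.0: fully spread
print(importance_gini([0.97, 0.01, 0.01, 0.01]))  # ~0.72: concentrated
```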
#5
GraderGPT: Making GenAI Help Humans Grade Better
Kapotaksha Das, ITS Teaching & Learning, University of Michigan
This work focuses on the development and implementation of a generative AI tool, built as part of an internship with ITS Teaching & Learning, that assists graders in constructing improved rubric guidelines and ensures consistency in grading across multiple sections and graders. The proposed system leverages Canvas data to automatically create custom prompts for each submission without any manual modification. OpenAI’s GPT-3.5 Turbo model is then used to score and comment on each submission based on specified criteria. These scores are retrieved and used to normalize the graders’ scores across all submissions. Using z-scores and confidence intervals, the system identifies deviations in grading and criteria whose descriptions need improvement. It also pinpoints specific grading criteria that lead to deviations among graders, and those that require rephrasing for clarity.
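A minimal sketch of the z-score normalization idea described above, with a hypothetical score table and an illustrative 95% confidence interval check (not the production GraderGPT pipeline):

```python
import numpy as np
import pandas as pd

# Hypothetical table: one row per submission, with the human grader's score
# and the model's score for the same rubric.
df = pd.DataFrame({
    "grader":      ["A", "A", "B", "B", "C", "C"],
    "human_score": [88, 92, 70, 75, 85, 90],
    "model_score": [86, 91, 84, 88, 84, 89],
})

# Z-score each grader's scores against their own mean/std, then rescale to
# the model's overall distribution so sections become comparable.
z = df.groupby("grader")["human_score"].transform(lambda s: (s - s.mean()) / s.std())
df["normalized"] = z * df["model_score"].std() + df["model_score"].mean()

# Flag graders whose mean deviation from the model falls outside a 95% CI.
df["deviation"] = df["human_score"] - df["model_score"]
per_grader = df.groupby("grader")["deviation"].agg(["mean", "std", "count"])
per_grader["ci95"] = 1.96 * per_grader["std"] / np.sqrt(per_grader["count"])
print(per_grader)
```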
#7
Typological Analysis of Defensive Publics Posted by Right-wing Twitter Users in India
Anmol Panda, School of Information, University of Michigan
Matthew Bui, School of Information, University of Michigan
Ceren Budak, School of Information, University of Michigan
In this work, we analyze tweets posted with the hashtag #HindusUnderAttack between Jan 2020 and Dec 2022. We study the usage of this hashtag as a defensive public, defined as a public sphere built by a motivated group of people to defend existing power structures while using anti-establishment narratives. Using this framework, we seek to answer three related questions. First, we report the key actors (politicians, celebrities, and influencers) interacting with the hashtag through tweets and retweets. Second, we report a typology of narratives used by authors of these tweets to understand the key arguments they make to justify using the hashtag. Third, we document the real-world events that trigger the hashtag’s usage and the creation of these defensive publics, and situate them in the socio-political context of Indian politics. Our data consist of the corpus of tweets published by the Mapping AAPI Media Space project at UMass Amherst and tweets from the Politweets archive hosted at SOMAR at ICPSR at the University of Michigan. We use a combination of network science methods and statistical analysis in our study.
#9
Private Treatment Assignment for Causal Experiments
Jeremy Seeman, MIDAS, University of Michigan
Open science efforts like experimental preregistration help to improve the reproducibility and external validity of causal inferences. However, research involving human data subjects can raise numerous privacy concerns, especially in healthcare contexts where covariates contain personal health information. Existing methods at the intersection of differential privacy (DP) and causal inference traditionally focus on how to calculate causal estimates while satisfying DP. Such work fails to address how transparent experimental designs may themselves leak information about participants, as the probability of treatment assignment often depends on confidential covariate values and their dependencies among participants. This work provides tools to publish experimental designs for open science efforts while satisfying DP. First, we show how many experimental designs can leak information about participants. We analyze the design trade-off between observed covariate balance and robustness to worst-case (or adversarial) potential outcomes, noting that the latter enables 0-DP algorithms while most designs for the former fail to satisfy DP for any finite privacy loss. We then establish DP alternatives based on discrepancy theory, where covariate dependency is algorithmically captured by experimental design decisions and privacy loss bounds. Doing so ensures that privacy loss is spent efficiently relative to practitioner needs for covariate balance. Finally, we discuss how to use these experimental designs in downstream DP inferences with valid, finite-sample interval estimators. Such work will allow scientists to integrate privacy protections into end-to-end open science experiments.
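The following toy sketch illustrates the trade-off the abstract describes, not the authors’ discrepancy-theoretic construction: an assignment drawn independently of covariates can be published with zero privacy loss, while a covariate-balancing assignment depends on confidential data and can leak it:

```python
import numpy as np

rng = np.random.default_rng(0)

def complete_randomization(n):
    """Assignment independent of covariates: publishing it reveals nothing
    about any participant's covariates (privacy loss 0)."""
    z = np.zeros(n, dtype=int)
    z[rng.choice(n, n // 2, replace=False)] = 1
    return z

def balanced_by_sorting(X):
    """Covariate-balancing assignment: sort on a covariate index and alternate
    arms. The published design now depends on confidential covariates."""
    order = np.argsort(X.sum(axis=1))
    z = np.zeros(len(X), dtype=int)
    z[order[::2]] = 1
    return z

X = rng.normal(size=(100, 3))          # confidential covariates
z0 = complete_randomization(len(X))    # 0-DP but possibly poorly balanced
z1 = balanced_by_sorting(X)            # well balanced but not DP
for name, z in [("random", z0), ("balanced", z1)]:
    gap = abs(X[z == 1].sum(axis=1).mean() - X[z == 0].sum(axis=1).mean())
    print(name, "covariate-index imbalance:", round(gap, 4))
```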
#10
Interpretable prediction of 6-month risk of first acute myocardial infarction without pre-existing cardiovascular condition using outpatient data from Michigan Medicine
Matthew Hodgman, Computational Medicine and Bioinformatics, University of Michigan
Michael Mathis, Computational Medicine and Bioinformatics, University of Michigan
Emily Wittrup, Computational Medicine and Bioinformatics, University of Michigan
Kayvan Najarian, Computational Medicine and Bioinformatics, University of Michigan
Cristian Minoccheri, Computational Medicine and Bioinformatics, University of Michigan
Acute myocardial infarctions are deadly to patients and burdensome to healthcare systems. Most recorded infarctions are a patient’s first, occur outside the hospital, and are often not accompanied by cardiac comorbidities. The clinical manifestations of the underlying pathophysiology leading to these events are not well understood, and little work has used explainable machine learning to learn the clinical factors predictive of a patient’s first event from outpatient data. We use electronic health record data from 2148 case and 5268 control patients without cardiac conditions from the Michigan Medicine Health System and an interpretable fuzzy neural network to predict the onset of first acute myocardial infarction within six months. We test various methods for interpretable encoding of patient history, including summary statistics and tensor factorization. We present linguistic rules for predicting six-month risk of first acute myocardial infarction in patients without a pre-existing cardiac condition (AUC = 0.695). We suggest that hard-to-predict events, such as acute myocardial infarction, can be better understood through the development and use of interpretable models like the employed fuzzy neural network and tensor factorization.
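As an illustration of the summary-statistic encoding mentioned above, a small pandas sketch with a hypothetical longitudinal lab table (the columns and statistics are invented for the example):

```python
import pandas as pd

# Hypothetical longitudinal EHR extract: one row per lab measurement.
labs = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2],
    "lab":        ["ldl", "ldl", "sbp", "ldl", "sbp"],
    "value":      [130.0, 142.0, 125.0, 98.0, 118.0],
})

# Summary-statistic encoding: collapse each patient's history into a fixed-
# width feature vector (mean / min / max / last value per lab).
features = (
    labs.groupby(["patient_id", "lab"])["value"]
        .agg(["mean", "min", "max", "last"])
        .unstack("lab")
)
features.columns = ["_".join(c) for c in features.columns]
print(features)
```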
#13
Differentiable Modeling and Optimization of Battery Electrolyte Mixtures Using Geometric Deep Learning
Shang Zhu, Mechanical Engineering, University of Michigan (Carnegie Mellon University)
Bharath Ramsundar, Deep Forest Sciences
Emil Annevelink, Carnegie Mellon University
Hongyi Lin, Carnegie Mellon University
Adarsh Dave, Carnegie Mellon University
Pin-Wen Guan, Carnegie Mellon University
Kevin Gering, Idaho National Lab
Venkatasubramanian Viswanathan, Carnegie Mellon University
Electrolytes play a critical role in designing next-generation battery systems by allowing efficient ion transfer, preventing charge transfer, and stabilizing electrode-electrolyte interfaces. In this work, we develop a differentiable geometric deep learning (GDL) model for chemical mixtures, DiffMix, which we apply to guide robotic experimentation and optimization towards fast-charging battery electrolytes. In particular, we extend mixture thermodynamic and transport laws by creating GDL-learnable physical coefficients. We evaluate our model on mixture thermodynamics and ion transport properties, showing that DiffMix achieves better prediction accuracy and robustness than its purely data-driven variants. Furthermore, with a robotic experimentation setup, Clio, we improve the ionic conductivity of electrolytes by over 18.8% within 10 experimental steps via differentiable optimization built on DiffMix gradients. By combining GDL, mixture physics laws, and robotic experimentation, DiffMix expands the predictive modeling methods available for chemical mixtures and enables efficient optimization in large chemical spaces.
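A toy sketch of the differentiable-optimization idea, with an invented mixing law standing in for DiffMix’s learned physics: mixture fractions are optimized by gradient ascent through the model, with a softmax keeping the fractions on the simplex:

```python
import torch

# Stand-in physical coefficients (in DiffMix these would be GDL-learnable).
coeffs = torch.tensor([1.0, 2.5, 0.8])

def conductivity(x):
    # Toy mixing law: linear blend minus a pairwise interaction penalty.
    return (coeffs * x).sum() - 0.5 * (x[0] * x[1] + x[1] * x[2])

# Optimize mixture fractions by gradient ascent through the model.
logits = torch.zeros(3, requires_grad=True)
opt = torch.optim.Adam([logits], lr=0.1)
for step in range(200):
    x = torch.softmax(logits, dim=0)   # valid mixture fractions, sum to 1
    loss = -conductivity(x)            # ascend conductivity
    opt.zero_grad()
    loss.backward()
    opt.step()

print(torch.softmax(logits, dim=0).detach())  # optimized mixture fractions
```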
#15
Deep Learning-Based Segmentation of Complex Microparticles in Scanning Electron Microscopy Images
Anastasia Visheratina, MIDAS, University of Michigan
Alexander Visheratin, Beehive AI
Prashant Kumar, Biointerfaces Institute, University of Michigan
Michael Veksler, Biointerfaces Institute, University of Michigan
Nicholas Kotov, Biointerfaces Institute, University of Michigan
Nanoscale chirality is an actively growing research field spurred by the giant chiroptical activity, enantioselective biological activity, and asymmetric catalytic activity of chiral nanostructures. Compared to chiral molecules, the handedness of chiral nano- and microstructures can be directly established via electron microscopy, which can be utilized for the automatic analysis of chiral nanostructures and the prediction of their properties. However, chirality in complex materials may take multiple geometric forms, and its identification from electron microscopy images rather than optical measurements is fundamentally challenging because (1) image features differentiating left- and right-handed particles can be ambiguous, and (2) the three-dimensional structure essential for chirality is ‘flattened’ into the two-dimensional projections typical of electron microscopy images. Here, we show that deep learning algorithms can identify twisted bowtie-shaped microparticles with nearly 100% accuracy and classify them as left- or right-handed with as high as 99% accuracy. Importantly, such accuracy was achieved with as few as 30 original electron microscopy images of bowties. Furthermore, after training on bowtie particles with complex nanostructured features, the model can recognize other chiral shapes with different geometries at 93% accuracy without re-training for their specific chiral geometry, indicating the generalization ability of the employed neural networks. These findings indicate that our deep learning algorithm, trained on a practically feasible set of experimental data, enables automated analysis of microscopy data for the accelerated discovery of chiral particles and their complex systems for multiple applications.
#17
Super-Resolution of Turbulent Flows with Machine Learning
Andreas H. Rauch, MIDAS and Aerospace Engineering, University of Michigan
Anthony Carreon, Aerospace Engineering, University of Michigan
Venkat Raman, Aerospace Engineering, University of Michigan
Turbulent flows are critical to engineering applications, but numerical simulations of practical engineering devices cannot afford to resolve all turbulent features throughout the domain: the separation of scales between the device size, O(1) m, and the turbulence scales, O(1e-6) m, is vast. This motivates multiscale modeling approaches using Adaptive Mesh Refinement (AMR). AMR enables the efficient simulation of a turbulent flow’s vast range of scales through targeted refinement of local features, resulting in coarse and fine grid levels. Using different solvers best suited to the local flow features provides high fidelity and reduces computational cost; for example, a Direct Numerical Simulation (DNS) can be employed on the finest grid levels and a turbulence modeling approach on the coarser grid levels. However, AMR techniques initialize finer grid level data by interpolating from the coarse level data, which cannot generate the missing fine-scale turbulence required in the DNS subdomain. This work develops a novel deep learning approach to generate fine-scale turbulence from coarse-scale simulations for super-resolution boundary conditions. Super-resolution techniques were first developed for computer vision applications, and popular approaches include Convolutional Neural Networks and Generative Adversarial Networks. Here, a U-Net model with skip connections is trained to generate fine-scale homogeneous isotropic turbulence (HIT) from an input coarse turbulent field. The training and testing data are provided by the publicly available Johns Hopkins Turbulence Database for HIT at Re = 433, sampled at spatially and temporally uncorrelated planes. Turbulent kinetic energy spectra of the super-resolved data clearly show that the U-Net model restores high-wavenumber energy content that the standard interpolation procedure misses. By supplementing existing physics-based solvers, this deep learning super-resolution approach can lead to increased simulation fidelity and reduced computational cost, helping advance the development of the next generation of engineering combustion devices with reduced emissions.
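For concreteness, a minimal two-level U-Net with skip connections of the kind described; channel counts and resolution are illustrative, not the trained model:

```python
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Minimal 2-level U-Net with a skip connection for field-to-field mapping
    (coarse turbulence plane in, fine-scale turbulence plane out)."""
    def __init__(self, ch=16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, ch, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, 2 * ch, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(2 * ch, ch, 2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, x):
        s1 = self.enc1(x)                            # full-resolution features
        s2 = self.enc2(self.pool(s1))                # downsampled features
        u = self.up(s2)                              # back to full resolution
        return self.dec(torch.cat([u, s1], dim=1))   # skip connection

model = TinyUNet()
coarse = torch.randn(4, 1, 64, 64)   # batch of coarse velocity planes
fine = model(coarse)                  # super-resolved output
print(fine.shape)                     # torch.Size([4, 1, 64, 64])
```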
#20
Accelerating Deep Learning in Reconstructive Spectroscopy with Device-Informed Data Simulation
Jiyi Chen, EECS, University of Michigan
Pengyu Li, EECS, University of Michigan
Yutong Wang, EECS, University of Michigan
Pei-Cheng Ku, EECS, University of Michigan
Qing Qu, EECS, University of Michigan
In this work, we develop a deep learning (DL)-based approach for sample-efficient, rapid-inference reconstructive spectroscopy. More precisely, we study a new problem setting for DL-based spectroscopy in which only device-informed simulated data are available for training. Device-informed simulated data are cheap to collect but exhibit a large distributional shift from non-simulated experimental data. To leverage such data, we develop a novel neural network architecture and data augmentation strategies to mitigate the adverse effects of this distributional shift. Compared to state-of-the-art optimization-based methods, our model exhibits comparable performance and achieves a 30× speed-up during inference. We validate our approach on both simulated and non-simulated experimental datasets.
#22
Toward an Experimental Framework to Study Unsupervised Learning as an Analog for Evolvability in Genotype-Phenotype Maps
Matthew Andres Moreno, EEB, Complex Systems, and MIDAS, University of Michigan
Luis Zaman, EEB and Complex Systems, University of Michigan
The capability of an evolutionary substrate to generate novel phenotypic variation that is viable under mutation, referred to as evolvability, underpins the process of adaptive evolution. However, evolutionary simulations using models with high-dimensional phenotypes often exhibit stunted evolvability in the absence of an indirect genotype-phenotype mapping that facilitates coordinated changes over many phenotypic traits. If these genotype-phenotype maps bias toward phenotypic viability and maintain phenotypic diversity, the resulting genetic search space will be lower-dimensional and less rugged, making it more conducive to adaptive evolution. Such evolvability in genotype-phenotype maps shares significant conceptual overlap with unsupervised learning, which extracts regularities and structure from unlabeled data and can enable lower-dimensional, compact representations of complex data like images and text. Here, we report a suite of benchmark fitness landscapes designed to facilitate head-to-head comparison of unsupervised learning techniques and evolved genotype-phenotype maps. This framework will contribute critical experimental rigor to ongoing efforts to harness unsupervised learning as a theoretical framework for understanding evolvability. Exploring unsupervised learning methods for engineering evolvable genotype-phenotype maps also holds great promise for application-oriented evolutionary computation.
#24
Floral Vision: Monitoring Plant-Pollinator Networks by Integrating Museum Data, Citizen Science and Computer Vision
Yutong Wang, Schmidt AI in Science Fellows, UMOR, and MIDAS, University of Michigan
James Boyko, Schmidt AI in Science Fellows, UMOR, and MIDAS, University of Michigan
Nathan Fox, Schmidt AI in Science Fellows, UMOR, and MIDAS, University of Michigan
Yiluan Song, Schmidt AI in Science Fellows, UMOR, and MIDAS, University of Michigan
Yu Zhou, Schmidt AI in Science Fellow, Cornell University
The interaction between pollinators and flowers is fundamental to ecosystem stability and has profound implications for global food production. Recent research underscores the disruptive effects of climate change on the synchrony between pollinator and flower phenologies (Memmott, 2007; Balfour et al., 2018). Yet many of these studies are geographically constrained. Citizen science presents a potential avenue to broaden data collection across larger areas (Blasi et al., 2023). Consequently, there is an urgent need for sustainable, large-scale monitoring of plant-pollinator interactions. Towards this, we conduct a preliminary integrative analysis of plant-pollinator networks, incorporating three types of data: 1) geographic data on bee-flower range overlap, 2) high-quality, low-throughput museum phenological data, and 3) low-quality, high-throughput iNaturalist plant-pollinator interaction image data. Additionally, we introduce a method that employs foundation models for text-promptable image segmentation to increase the available plant-pollinator interaction image data. To make the deployment of such large models on an “academic budget” feasible, we utilize knowledge distillation to reduce the computational cost of the method. Our work illustrates the potential of leveraging emerging AI technologies alongside well-established methodologies to fingerprint the biological consequences of climate change.
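A minimal sketch of the knowledge-distillation objective mentioned above: a student’s per-pixel mask distribution is matched to a frozen teacher’s via a temperature-scaled KL divergence. Shapes and temperature are illustrative, not the project’s actual models:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Soften both mask logits with temperature T and match the student's
    per-pixel class distribution to the frozen teacher's (standard KD loss)."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

# Toy shapes: (batch, classes, H, W) segmentation logits.
teacher_logits = torch.randn(2, 2, 64, 64)   # e.g., from a large promptable segmenter
student_logits = torch.randn(2, 2, 64, 64, requires_grad=True)
loss = distillation_loss(student_logits, teacher_logits)
loss.backward()
print(loss.item())
```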
#26
Convolutional Neural Networks and Polyhedral Theory
Amirhossein Moosavi, MIDAS, University of Michigan
Onur Ozturk, Telfer School of Management, University of Ottawa
Rafid Mahmood, Telfer School of Management, University of Ottawa
Jonathan Patrick, Telfer School of Management, University of Ottawa
Background: Over the past decade, deep learning has emerged as the dominant approach for predictive modeling, primarily due to the exceptional accuracy achieved by deep neural networks. With the introduction of piecewise linear activation functions in deep learning, such as the Rectified Linear Unit, mixed-integer programming formulations of neural networks became possible and have become an area of interest. Neural network formulations are employed in decision-making problems for two main tasks: (i) optimizing an unknown objective function for which historical data are available, or (ii) simplifying computationally expensive constraint sets. Contribution: Convolutional neural networks excel at capturing spatial hierarchies within data. They are built from complex mechanisms (e.g., convolution, pooling, flattening, and dense layers), each with various nuances such as filter size, padding, stride size, and the number of channels. We propose the first explicit mixed-integer programming formulation of convolutional neural networks that captures their full generality for use in scientific research. Interdisciplinary Applications: Our work facilitates the application of convolutional neural networks via off-the-shelf commercial solvers, particularly by researchers from disciplines beyond machine learning. While conventional decision-making approaches often suffer from limited computational resources, we advance the learning-based optimization literature and offer the capacity to expedite solution processes for a wide range of applications. For example, we investigate a complex healthcare scheduling problem and demonstrate the usefulness of our formulation for a real-world decision-making problem. Our research can benefit from interdisciplinary collaborations in three primary ways: (i) demonstrating its potential for various applications, (ii) raising awareness among scholars from different disciplines about how one can leverage convolutional neural networks for decision-making problems, and (iii) proposing stronger formulations of such predictive models.
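For background, here is the standard big-M mixed-integer encoding of a single ReLU unit, the building block that such formulations extend to convolutional and pooling layers; the paper’s full formulation is not reproduced here:

```latex
% Big-M encoding of a single ReLU unit y = max(0, w^T x + b),
% given pre-activation bounds L <= w^T x + b <= U and a binary
% indicator z (z = 1 iff the unit is active):
y \ge w^{\top} x + b, \qquad
y \le w^{\top} x + b - L\,(1 - z), \qquad
y \le U z, \qquad
y \ge 0, \qquad
z \in \{0, 1\}.
```

When z = 1 the first two constraints pin y to the pre-activation value; when z = 0 the last two force y = 0, exactly reproducing the ReLU.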
#29
FBOVI: An Efficient, Scalable and Flexible Variational Approach for Online State-Parameter Estimation of Partially Observable Systems
Liliang Wang, Aerospace Engineering, University of Michigan
Alex Gorodetsky, Aerospace Engineering, University of Michigan
Online inference, which learns the parameters and states of a system as the system evolves, is a crucial part of real-time prediction and decision-making tasks. An online inference method well suited for real-time tasks should be efficient, robust, and scalable in order to provide rapid responses, guarantee safety, and handle problems of different scales. Several online inference methods have been developed in recent years, but many of them are either unreliable or scale poorly to high-dimensional problems, and the few robust and scalable methods that exist often have high computational complexity. We propose a factorization-based online variational inference approach (FBOVI) that assimilates data incrementally and provides an approximate joint posterior distribution over system parameters and states. At each time step, only the newly received data are assimilated to update the joint distribution, which keeps the computational cost of each step constant. Our method is based on a particular factorization of the joint posterior that requires minimal assumptions about its form. The marginal distribution over parameters can be learned using an arbitrary representation family. Moreover, the representation of the moments of the variational distribution of the current state given the system parameters, which are functions of the parameters, can be chosen to meet requirements on speed or performance. This allows our method to provide an accurate approximation to the joint posterior distribution in a flexible way. The efficacy of the proposed method is demonstrated through applications in different engineering fields, including low- and high-dimensional partially observable dynamical systems.
#31
Don’t bother to make it, just fine-tune it!
O Hwang Kwon, Nuclear Engineering and Radiological Sciences, University of Michigan
Majdi Radaideh, Nuclear Engineering and Radiological Sciences, University of Michigan
Today, large language models are being released at an unprecedented rate, and their users have diverse purposes. Instead of creating a different model for each use case, we can fine-tune existing large language models to serve specific needs. Our research team selected three of the most iconic large language models available today (BERT, GPT, LLaMA 2) and trained them on tweets related to nuclear power. Through this process, we are constructing an open-source fine-tuning package so that people can utilize LLMs for their own purposes.
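A minimal sketch of BERT-style fine-tuning with the Hugging Face Trainer, of the kind such a package might wrap; the CSV file, column names, and label count are hypothetical:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical CSV of tweets with "text" and integer "label" columns.
dataset = load_dataset("csv", data_files="nuclear_tweets.csv")["train"]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # e.g., negative / neutral / positive

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length",
                     max_length=128)

dataset = dataset.map(tokenize, batched=True).rename_column("label", "labels")

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset,
)
trainer.train()
```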
#2
Towards fairer AI: an inference-based, model-agnostic and threshold-free bias detection pipeline inspired by individual fairness
Bernardo Modenesi, MIDAS, University of Michigan
Lucia Wang, Rocket Companies
Ameya Diwan, Rocket Companies
Individual fairness methods depart from the idea that similar observations should be treated similarly by a machine learning model, circumventing some of the shortcomings of group fairness tools. Nevertheless, many existing individual fairness approaches are either tailored to specific models or rely on a series of ad hoc decisions to determine model bias. In this paper, we propose an individual fairness-inspired, inference-based bias detection pipeline. Our method is model-agnostic, suited to all data types, avoids commonly used ad hoc thresholds and decisions, and provides an intuitive scale indicating how biased the assessed model is. We propose a model ensemble approach for our bias detection tool, consisting of: (i) building a proximity matrix with random forests based on features and output; (ii) inputting it into a Bayesian network method to cluster similar observations; (iii) performing within-cluster inference to test the hypothesis that the model is treating similar observations similarly; and (iv) aggregating the cluster-level tests with multiple hypothesis testing correction. In addition to providing a single statistical p-value for the null hypothesis that the model is unbiased in the individual fairness sense, we create a scale that measures the amount of bias against minorities carried by the model of interest, making the overall p-value more interpretable to decision-makers. We apply our methodology to assess bias in the mortgage industry, and we provide an open-source Python package for our methods.
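A simplified sketch of steps (i)-(iv): the proximity matrix comes from shared random-forest leaves, but hierarchical clustering and a Kruskal-Wallis test stand in here for the paper’s Bayesian network clustering and inference procedure:

```python
import numpy as np
from scipy.stats import kruskal
from sklearn.cluster import AgglomerativeClustering
from sklearn.ensemble import RandomForestRegressor
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y_model = X @ rng.normal(size=5) + rng.normal(size=300)  # model outputs to audit

# Step (i): random-forest proximity -- fraction of trees in which two
# observations land in the same leaf (features + model output as inputs).
rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(np.column_stack([X, y_model]), y_model)
leaves = rf.apply(np.column_stack([X, y_model]))         # (n, n_trees) leaf ids
proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

# Steps (ii)-(iv), simplified: cluster on proximity, test within each cluster
# whether similar observations are treated similarly across a protected group,
# then correct across clusters.
clusters = AgglomerativeClustering(n_clusters=10, metric="precomputed",
                                   linkage="average").fit_predict(1 - proximity)
group = rng.integers(0, 2, 300)                          # protected attribute
pvals = []
for c in np.unique(clusters):
    m = clusters == c
    if len(np.unique(group[m])) == 2:
        pvals.append(kruskal(y_model[m & (group == 0)],
                             y_model[m & (group == 1)]).pvalue)
reject, p_adj, *_ = multipletests(pvals, method="fdr_bh")
print("clusters flagged as biased:", reject.sum())
```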
#4
Algorithmic audit of targeted advertising: a sociotechnical perspective
Lu Xian, School of Information, University of Michigan
Matt Bui, School of Information, University of Michigan
Abigail Jacobs, School of Information, University of Michigan
Targeted ads are important sources of information that shape how individuals gain access to various opportunities and resources. Targeted advertising can also introduce bias and discrimination against groups of users along the lines of race, gender, and other attributes. Audits of targeted advertising algorithms in domains like employment have provided evidence of gender- and race-based discrimination in ad delivery by focusing on the association between user profile and ad type and content. Building on prior work, our work provides an account of how biases reflected in targeted advertising reinforce historical inequalities in the housing and mortgage sectors in metropolitan cities. To demonstrate, we collected web traffic data from Google’s search engine results pages at the zip code level in New York City in 2020 from a third-party vendor. We combined these data with recently released appraisal records at the census tract level, which serve as a proxy for home values. By triangulating those datasets with census data, we analyze how communities are differentially targeted using spatial analysis, clustering, and regression models. This work reveals how the biased distribution of information and resources impacts individuals’ access to economic opportunities.
#6
Flowing With The Tide: How Online Partisans Respond to Realignment of Political Coalitions at the Provincial Level in India
Anmol Panda, School of Information, University of Michigan
Nate TeBlunthuis, School of Information, University of Michigan
Libby Hemphill, School of Information, University of Michigan
Ceren Budak, School of Information, University of Michigan
Social media have been widely used for election campaigning over the past two decades. Prior work on political communication in India has focused mainly on social media activity and interactions between political elites at the national level. Yet social media are also significant in state-level Indian politics in ways that have been largely overlooked in scholarship. For instance, coalitions among parties are an important factor in Indian electoral politics, where inter-party relations at the state level may differ from those at the national level. Our study fills this void with a descriptive and causal analysis of how an electoral alliance’s break-up shaped its supporters’ behavior on social media and how they engage with parties and politicians. We use data from the four major parties in the Indian state of Maharashtra (the BJP, the Shiv Sena, the INC, and the NCP) during the 2019-2020 period to analyze how online partisans responded to the formation of an alliance between the latter three parties. We test our hypotheses using two methods: (1) a partisanship model based on a vectorized representation of users’ liking behavior, and (2) proximity to parties in the users’ retweet network. Based on how users’ partisan lean and their retweet proximity to parties changed in response to the coalition realignment, we find that supporters of the Shiv Sena became more aligned with the INC/NCP and moved away from their traditional ally, the BJP. In contrast with European settings, our findings suggest that partisan loyalties remained largely intact when ideologically divergent coalitions were formed in India. Finally, we find that among non-elites in India, party loyalty was a stronger predictor of social media liking behavior than ideological preferences.
#8
Peer-on-peer surveillance: ethical tensions of anonymous reporting systems in K-12 schools
Elyse Thulin, MIDAS and Michigan Medicine Psychiatry, University of Michigan
Justin Heinze, Health Behavior Health Education, University of Michigan
Advances in data science, computation, and AI are expanding the presence and reach of surveillance systems in society. While there is existing literature concerning school-based surveillance of students by educators and school systems, limited attention has been given to the ethical implications of anonymous reporting systems that leverage students reporting on themselves and their peers. This is concerning given that 50% of K-12 schools employ one or more anonymous reporting systems (ARS) to enhance safety and monitor potential risks for students, educators, and the broader school community. These ARS platforms operate on a “see-something-say-something” principle, encouraging students to share their knowledge of concerning behaviors or events, which is then forwarded to the relevant school, emergency, and law enforcement authorities. Information submitted through these systems may encompass identifying details (such as names), socio-demographic data (including race and gender), health-related information (such as suicidality), and behavioral observations (such as carrying a weapon). While ARS submissions have demonstrated efficacy in preventing school shootings and adolescent suicides, there are anecdotal instances where an ARS unintentionally exposed information about a student to the wider community. Beyond unintentional exposure, data collected through state-owned ARS could be exploited by school systems in ways not originally proposed, as has been evident in other school surveillance programs such as AI-assisted social media monitoring. Further, because ARS use peer reporting and leverage the in-depth knowledge that peers have of one another, unintended uses of ARS could further elevate the risk of victimization for BIPOC students, LGBTQIA+ students, and those seeking reproductive healthcare. This presentation aims to elucidate the dichotomy between the benefits offered by ARS and peer-reported systems and the genuine risks faced by youth who submit reports and are subject to them.
#11
Metabolism-inspired Mechanistic Deep Learning for Treating Drug-resistant Infections
Harkirat Singh Arora, Biomedical Engineering, University of Michigan
Sriram Chandrasekaran, Biomedical Engineering, University of Michigan
Summary: Antibiotic resistance (AR) is a pressing global health concern, and new treatments are urgently needed. Drug combinations are a promising solution to this problem, but they are designed empirically, driven by clinical intuition, leading to suboptimal results and increased AR. The combinatorial explosion further aggravates the problem. Therefore, there is a need for an efficient data-driven approach to facilitate the development of these treatments. We have developed a mechanistic approach that combines multi-omics data with artificial neural networks (ANN) to (a) accurately predict multi-way drug interactions, (b) accommodate drug-resistant bacterial strains, (c) gain insights into a diverse set of pathways predictive of drug combinations at the molecular level, and (d) account for drug toxicity profiles to design safer treatments. Methods: The approach involves a three-step process: (1) generating feature profiles for individual drug treatments using multi-omics data such as chemogenomics and transcriptomics, (2) preprocessing to compute joint profiles for a combination, accounting for similarity and uniqueness among the drugs in the treatment, and (3) feeding this information to the metabolism-inspired neural network algorithm for model development and evaluating performance. Main results: The mechanistic approach accurately and significantly predicted multi-drug interactions in E. coli (R = 0.58, p ≈ 10^-14) and M. tuberculosis (R = 0.42, p ≈ 10^-8). It was extended to predict drug combination outcomes in the critical pathogens S. typhimurium and P. aeruginosa. It identified alternate carbon metabolism in E. coli and fatty acid metabolism in M. tuberculosis, along with transport mechanisms in both, as the most important pathways governing drug resistance, in agreement with literature evidence. Impact: As novel treatments are not readily discovered, it is crucial to design treatments using approved FDA drugs. Our algorithm aims to provide an innovative and unique perspective on utilizing machine learning to guide the development of multi-drug treatments using FDA-approved drugs.
#12
Hippocampal calcium activity patterns underlying exploratory behavior in mice: Neural decoding and correlated neural network analysis
Swapnil Gavade, Michigan Neuroscience Institute, University of Michigan
Shany Yang, Michigan Neuroscience Institute, University of Michigan
Joanna Spencer-Segal, Michigan Neuroscience Institute and Internal Medicine, University of Michigan
Exploratory behaviors in mice are used to obtain information about the emotional or cognitive state of the animal, and the dorsal hippocampus is one brain region that controls exploratory behavior. Here we developed methods based on neural decoding and correlated neural networks to identify the population dynamics of dorsal CA1 neurons underlying specific exploratory behaviors. Calcium activity of dorsal CA1 neurons was recorded using a miniature microscope in freely behaving mice during different exploratory tasks: bright open field exploration, and familiar or novel object exploration. We identified populations of neurons with activity linked to risk-taking behavior in the open field (“center” cells) and neurons linked to object exploration, by taking the ratio of the average calcium amplitude during “center” versus “not center” periods, or “object exploration” versus “no object exploration” periods. Neurons with a ratio greater than the 99th percentile or less than the 1st percentile of the shuffled data distribution were considered behavior-sensitive. Correlation analysis revealed that behavior-sensitive neurons showed highly correlated activity during their respective behaviors; for example, center-sensitive neurons showed highly correlated activity during center exploration but not at other times. We also implemented a neural decoding method based on a support vector machine, a supervised learning model trained using the behavior-sensitive neurons. The proposed method is able to classify neural activity across behavioral states, and prior knowledge about behavior-sensitive neurons can improve this classification. Together, these methods reveal meaningful neural activity patterns during relevant hippocampal-dependent behaviors.
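A toy sketch of the shuffle-based percentile threshold described above, using circular shifts of the behavior mask to build the null distribution; the data and parameters are simulated for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_frames = 50, 5000
activity = rng.poisson(0.1, size=(n_neurons, n_frames)).astype(float)
in_center = rng.random(n_frames) < 0.3    # behavior mask (e.g., mouse in center)

def center_ratio(act, mask):
    """Ratio of mean calcium amplitude inside vs. outside the behavior epoch."""
    return act[:, mask].mean(axis=1) / act[:, ~mask].mean(axis=1)

observed = center_ratio(activity, in_center)

# Null distribution: circularly shift the behavior mask to break its alignment
# with the activity while preserving the mask's temporal structure.
null = np.array([
    center_ratio(activity, np.roll(in_center, rng.integers(1, n_frames)))
    for _ in range(1000)
])
lo, hi = np.percentile(null, [1, 99], axis=0)   # per-neuron 1st/99th percentiles
sensitive = (observed > hi) | (observed < lo)   # behavior-sensitive neurons
print(sensitive.sum(), "behavior-sensitive neurons")
```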
#14
Integrating Machine Learning and Molecular Modeling with Directed Evolution
Azam Hussain, Macromolecular Science and Engineering, University of Michigan
Charles L. Brooks III, Chemistry and Biophysics, University of Michigan
Directed evolution is a widespread technique that enables the acquisition of novel functionalities and highly efficient variants of natural protein sequences. The process involves rounds of large-scale sequence expression and specifically designed activity assays. While effective, the approach requires significant time and resources for extensive exploration of sequence-function spaces and must be optimized for each new target protein. Recent work has shown that integrating machine learning (ML) with directed evolution delivers a rapid, targeted exploration of sequence-function spaces. In this study, we incorporate a molecular modeling (MM) pipeline into machine-learning directed evolution (MLDE), aiming to minimize the necessary mutations and amplify the hit rate of observed mutations. Our MM+MLDE protocol facilitated the identification of vital second-sphere residues that modulate enantioselectivity and reactivity in fungal flavin-dependent monooxygenases, using a compact set of targeted site-directed mutants. We further discuss how to integrate active learning into successive rounds of expression results by training convolutional neural networks on predicted protein-ligand structures against the experimental data, allowing for more accurate predictions for subsequent variant sets. We hope these protocols will encourage the broader incorporation of MM and ML in directed evolution campaigns.
#16
Fast and Compressed Deep Linear Networks for Learning Low-Dimensional Models
Soo Min Kwon, EECS, University of Michigan
Zekai Zhang, EE, Tsinghua University
Dogyoon Song, EECS, University of Michigan
Laura Balzano, EECS, University of Michigan
Qing Qu, EECS, University of Michigan
Deep linear networks have proven to be powerful models for solving low-rank matrix recovery problems. Their effectiveness is partially attributed to the implicit bias in the learning dynamics of over-parameterized models, which favors certain low-rank solutions that generalize well. However, such over-parameterization often leads to a substantial increase in computational complexity, limiting their applicability to real-world problems at scale. In this paper, we propose a simple yet effective technique to compress deep linear networks, which significantly reduces computational overhead without compromising model quality. Our approach involves projecting the deep linear network onto carefully constructed low-dimensional subspaces, drawing inspiration from its learning dynamics. Remarkably, our compressed network converges faster than the original network, consistently yielding smaller recovery errors throughout all iterations of gradient descent. We substantiate this observation by developing theory for the deep matrix factorization problem and by conducting empirical evaluations on two canonical matrix recovery problems: matrix sensing and matrix completion. We further demonstrate how the compressed network can improve the generalization of deep nonlinear networks as well. Overall, we observe that our compression technique accelerates training by more than 2× across a broad range of problems.
#19
An accurate and efficient deep learning model for traffic light handling in V2X-compromised situations
Daphne Tsai, Computer Science and Engineering, University of Michigan
An accurate and robust traffic light handling model is crucial for the safe and efficient operation of autonomous vehicles (AVs). Current L4 AVs on the market leverage Vehicle-to-Everything (V2X) infrastructure in the form of Signal Phase and Timing (SPaT) messages to receive information crucial for path planning, such as the states of traffic lights and how long they will stay in those states. However, current state-of-the-art perception systems, particularly those with a heavy reliance on V2X infrastructure, have limitations such as missing or incorrect SPaT messages. In these situations, deep learning models can be used to make predictions more accurately and efficiently. To create a more robust traffic light handling system, a deep learning model was developed that leverages semantic segmentation algorithms to detect and classify traffic lights. A variety of architectures were tested to achieve a model that performs with high accuracy on proprietary testing data. The deep learning model was ultimately integrated with the existing V2X infrastructure, and a control flow logic was created to evaluate the perception model under both ideal and realistic conditions for accuracy and efficiency. By developing an accurate and efficient deep learning model to handle situations that lack accurate SPaT messages, a more robust perception system was achieved that further improves the safety of autonomous vehicles.
#21
A Geometric Analysis of Multi-label Learning under Pick-all-label Loss via Neural Collapse
Pengyu Li, EECS, University of Michigan
Yutong Wang, EECS, University of Michigan
Xiao Li, EECS, University of Michigan
Qing Qu, EECS, University of Michigan
In this study, we explore multi-label learning, an important subfield of supervised learning that aims to predict multiple labels from a single input data point. This research investigates the training of deep neural networks for multi-label learning through the lens of neural collapse (NC), an intriguing phenomenon that occurs during the terminal phase of training. Previously, NC has been investigated both theoretically and empirically in the context of multi-class classification. For last-layer features, it has been demonstrated that (i) the variability of features within classes collapses to zero, and (ii) the feature means between classes become maximally and equally separated. In this work, we demonstrate that the NC phenomenon extends to multi-label learning: the “pick-all-label” training formulation exhibits NC in a more general form. Specifically, under the natural analog of the unconstrained feature model, we establish that the only global minimizers of the pick-all-label loss display the same equiangular tight frame (ETF) geometry, and that scaled averages of the ETF vectors represent the features of samples with multiple labels. We also provide empirical evidence supporting our analysis of training deep neural networks on multi-label datasets, resulting in improved training efficiency.
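For reference, the simplex equiangular tight frame geometry referred to above, in the standard neural collapse notation:

```latex
% K class-mean directions forming a simplex equiangular tight frame:
M = \sqrt{\tfrac{K}{K-1}} \, U \Bigl( I_K - \tfrac{1}{K} \mathbf{1}_K \mathbf{1}_K^{\top} \Bigr),
\qquad U^{\top} U = I_K,
% so that the columns m_i are unit-norm and pairwise maximally separated:
\langle m_i, m_j \rangle = \tfrac{K}{K-1}\,\delta_{ij} - \tfrac{1}{K-1}.
```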
#23
Deep neural networks as a tool for advancing evolutionary analysis
James Boyko, MIDAS and Ecology & Evolutionary Biology, University of Michigan
Dan Rabosky, Ecology & Evolutionary Biology, University of Michigan
Advances in 3D imaging technologies have given scientists access to terabyte-scale data on the anatomical structure (morphology) of organisms. In many ways, this exponential rise in data on organismal form parallels the orders-of-magnitude rise in genomic data availability. However, existing analytical and computational frameworks cannot cope with data at this scale; in fact, it is not even clear how to represent these data so that they can be analyzed at all. Nonetheless, identifying key evolutionary features of animal form is an essential question for biologists, and these high-dimensional data are highly precise representations of morphological diversity. There is therefore a pressing need to address two primary questions: (1) how do we represent such complex, multidimensional data, and (2) how can we learn about the process that produced the observed form? By addressing these questions we would gain novel insights into the evolutionary processes that have given rise to the tremendous diversity of living organisms we see today. My work as a Schmidt Futures AI in Science postdoctoral fellow aims to use deep neural networks as a framework for understanding major evolutionary transitions in animal body forms. These neural networks are still in their nascent stages but show tremendous promise for outperforming traditional methods in a variety of downstream tasks of interest to evolutionary biologists and ecologists. For example, preliminary results suggest that the non-linear representation of complex form provided by deep learning methods can more accurately estimate extinct forms than traditional methods, even when fossils are not included in the training dataset. For the first time, these methods will enable us to understand how evolution generated major innovations (e.g., novel morphologies and specializations) in a “data space” closely aligned with the complex geometry of real animals.
#25
Safeguarding Biodiversity: Harnessing Social Media and AI for Conservation
Nathan Fox, Research Fellow, MIDAS, University of Michigan
Enrico Di Minin, Geosciences and Geography, University of Helsinki
Neil Carter, School for Environment and Sustainability, University of Michigan
Sabina Tomkins, School of Information, University of Michigan
Derek Van Berkel, School for Environment and Sustainability, University of Michigan
Anthropogenic pressures are causing unparalleled declines in biodiversity. Environmental resilience is closely intertwined with the protection and advancement of biodiversity across scales, ranging from local ecosystems to the global biosphere, and biodiversity plays a pivotal role in enhancing the ability of ecosystems to endure and rebound from a variety of disturbances. Addressing this worldwide biodiversity crisis necessitates concentrated conservation initiatives, which, unfortunately, are constrained by substantial costs and limited resources. The absence of comprehensive biodiversity data can obscure declines in populations and potential extinctions. Consequently, there is an immediate need for cost-effective and scalable methods for monitoring biodiversity. One promising avenue in this endeavour is the role of social media platforms. Social media enables the integration of diverse data types, including images, text, audio, and video, through multimodal approaches, thereby reshaping the landscape of conservation research. By harnessing AI technologies such as computer vision, natural language processing, and spatio-temporal analysis, we can extract valuable insights from social media posts. This multimodal approach to biodiversity monitoring introduces innovative possibilities, such as tracking changes in the timing and distribution of biodiversity events and monitoring areas impacted by invasive species. These insights can, in turn, facilitate the development of efficient and large-scale conservation strategies, contributing significantly to environmental resilience.
#28
An Incremental Tensor Train Decomposition for High-Dimensional Data Streams
Doruk Aksoy, Aerospace Engineering and Scientific Computing, University of Michigan
Alex A. Gorodetsky, Aerospace Engineering, University of Michigan
We present a new algorithm for incrementally updating the tensor train decomposition of a stream of tensor data. This new algorithm, called tensor train incremental core expansion (TT-ICE), improves upon the current state-of-the-art algorithms for compression in the tensor train format through a new adaptive approach that incurs significantly slower rank growth and guarantees compression accuracy. This capability is achieved by limiting the number of new vectors appended to the TT-cores of an existing accumulation tensor after each data increment. These vectors represent directions orthogonal to the span of the existing cores and are limited to those needed to represent a newly arrived tensor to a target accuracy. We provide two versions of the algorithm: TT-ICE and TT-ICE accelerated with heuristics (TT-ICE∗). We empirically demonstrate the performance of the algorithms in compressing large-scale video and scientific simulation datasets. Compared to existing approaches that also use rank adaptation, TT-ICE∗ achieves 57× higher compression and up to 95% reduction in computational time.
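A single-core numpy sketch of the core-expansion idea: append only those directions of the new data that are orthogonal to the current span and needed for the target accuracy. This illustrates the mechanism only; the full TT-ICE algorithm updates every TT-core of the accumulation tensor:

```python
import numpy as np

def expand_core(U, new_cols, tol=1e-2):
    """Append to orthonormal basis U only those directions of new_cols that lie
    outside span(U) and are needed to reach the target accuracy, so the rank
    grows by the effective novelty of the increment, not its column count."""
    residual = new_cols - U @ (U.T @ new_cols)     # part U cannot represent
    q, s, _ = np.linalg.svd(residual, full_matrices=False)
    keep = s > tol * np.linalg.norm(new_cols)      # limit rank growth
    return np.hstack([U, q[:, keep]])

rng = np.random.default_rng(0)
U = np.linalg.qr(rng.normal(size=(100, 5)))[0]     # existing core basis, rank 5
stream = rng.normal(size=(100, 3)) @ rng.normal(size=(3, 20))  # rank-3 increment
U = expand_core(U, stream)
print(U.shape)  # rank grows by ~3, not by the 20 incoming columns
```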
#30
heliostack: A Novel Approach to the Minor Planet Detection Problem
Kevin J Napier, MIDAS and Physics, University of Michigan
Hsing-Wen Lin, Physics, University of Michigan
David W Gerdes, Physics and Astronomy, University of Michigan
Recent advances in computing and camera technology have pushed astronomical surveys into the realm of big data. Beginning in 2023, the Legacy Survey of Space and Time (LSST) will image half of the sky every three nights, for 300 nights per year. The resulting dataset will have immense potential for discovering solar system bodies, especially if we can discover objects fainter than the single-image detection threshold. The traditional method for finding sub-threshold solar system bodies is a technique called shift-and-stack, in which one stacks images along the orbit of a moving object; when the signal from the object is added coherently across enough images, it becomes detectable. Shift-and-stack searches have only ever been accomplished on sets of images taken in a single night, where the apparent motion of solar system bodies remains linear. When the apparent motion becomes nonlinear, the number of potential trajectories grows exponentially, and if we remain constrained to time scales of a single night, our ability to find faint objects is limited by the size of our telescopes. To make the most effective use of our data, we are developing a technique (heliostack) that uses software to combine images taken days, weeks, or even months apart. Our technique combines expert domain knowledge with recent advancements in computing and AI in order to overcome the computational and logistical difficulties of the search; it will be critical to develop highly efficient convolutional neural nets for rejecting false positive detections. If this project succeeds, the potential payoff is an increase in LSST’s yield of solar system objects by as much as an order of magnitude. Such a catalog would provide invaluable insight into the formation of solar systems, the possible presence of yet-undetected planets in our own solar system, and much more.
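A toy numpy sketch of classical single-night shift-and-stack, the baseline that heliostack generalizes: a source at 0.8× the per-image noise becomes a roughly 4-sigma detection after 30 coherently shifted exposures (the data and drift rate are simulated for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_imgs, size = 30, 64
vx, vy = 0.7, -0.4                 # object's apparent drift, pixels/frame
x0, y0, flux = 20, 40, 0.8         # flux below the single-image noise (std 1)

# Simulate exposures: a faint moving point source buried in noise.
images = rng.normal(0.0, 1.0, size=(n_imgs, size, size))
for t in range(n_imgs):
    images[t, y0 + int(round(vy * t)), x0 + int(round(vx * t))] += flux

def shift_and_stack(images, vx, vy):
    """Shift each exposure back along a trial trajectory and average:
    the signal adds coherently while the noise averages down as 1/sqrt(N)."""
    stack = np.zeros_like(images[0])
    for t, img in enumerate(images):
        stack += np.roll(img, (-int(round(vy * t)), -int(round(vx * t))),
                         axis=(0, 1))
    return stack / len(images)

stack = shift_and_stack(images, vx, vy)
print("single-image SNR:", flux / 1.0)               # 0.8 -- undetectable
print("stacked SNR:", stack[y0, x0] / stack.std())   # ~4 -- detectable
```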