My lab has two main areas of focus: molecular characteristics of head and neck cancer, and the intersection of regulatory genomics and pathway analysis. With head and neck cancer, we study tumor subtypes and biomarkers of prognosis, treatment response, and recurrence. We apply integrative omics analyses, dimension reduction methods, and prediction techniques, with the ultimate goal of identifying patient subsets who would benefit from either an additional targeted treatment or de-escalated treatment to improve quality of life. For regulatory genomics and pathway analysis, we develop statistical tests that take into account important covariates and other variables for weighting observations.
The Aguilar group is focused on understanding transcriptional and epigenetic mechanisms of skeletal muscle stem cells in diverse contexts such as regeneration after injury and aging. We focus on this area because few to no therapies exist for skeletal muscle after injury or during aging. We use various in vivo and in vitro models in combination with genomic assays and high-throughput sequencing to study these molecular mechanisms.
My methodological research focuses on developing statistical methods for routinely collected healthcare databases such as electronic health records (EHR) and claims data. I aim to tackle the unique challenges that arise from the secondary use of real-world data for research purposes. Specifically, I develop novel causal inference methods and semiparametric efficiency theory that harness the full potential of EHR data to address comparative effectiveness and safety questions. I also develop scalable, automated pipelines for the curation and harmonization of EHR data across healthcare systems and coding systems.
My research interests are in natural language semantics and psycholinguistics, focusing on verbs. I conduct behavioral psycholinguistic experiments with methodologies such as self-paced reading and maze tasks, as well as surveys of linguistic and semantic judgments. I also study semantic variation using corpora and datasets such as the Twitter Decahose, to better understand how words have developed diverging meanings in different communities, age groups, or regions. I use primarily R and Python to collect, manage, and analyze data. I direct the UM WordLab in the linguistics department, working with students (especially undergraduates) on experimental and computational research focusing on lexical representations.
Anthony Vanky develops and applies data science and computational methods to design, plan, and evaluate cities, with an emphasis on applications to urban planning and design. Broadly, his work focuses on the domains of transportation and human mobility; social behaviors and urban space; policy evaluation; quantitative social sciences; and the evaluation of urban form. Through this work, he has collaborated extensively with public and private partners. In addition, he pursues creative approaches to data visualization, public engagement and advocacy, and research methods.
Anthony Vanky’s Cityways project analyzed 2.2 million trips taken by 135,000 people over one year to understand the factors that influence outdoor pedestrian path choice. Factors considered included weather, urban morphology, businesses, topography, traffic, and the presence of green spaces, among others.
We are interested in resolving outstanding fundamental scientific problems that impede the computational materials design process. Our group uses high-throughput density functional theory, applied thermodynamics, and materials informatics to deepen our fundamental understanding of synthesis-structure-property relationships while exploring new chemical spaces for functional technological materials. These research interests are driven by the practical goal of the U.S. Materials Genome Initiative to accelerate materials discovery, a goal whose realization requires basic research in synthesis science, inorganic chemistry, and materials thermodynamics.
Professor Manduca’s research focuses on urban and regional economic development, asking why some cities and regions prosper while others decline, how federal policy influences urban fortunes, and how neighborhood social and economic conditions shape life outcomes. He studies these topics using computer simulations, spatial clustering methods, network analysis, and data visualization.
In other work he explores the consequences of rising income inequality for various aspects of life in the United States, using descriptive methods and simulations applied to Census microdata. This research has shown how rising inequality has led directly to lower rates of upward mobility and increases in the racial income gap.
Screenshot from “Where Are The Jobs?”, a visualization mapping every job in the United States using unemployment insurance records from the Census LODES data. http://robertmanduca.com/projects/jobs.html
Study of Pandemic Publishing: How Scholarly Literature Is Affected by the COVID-19 Pandemic
This project addresses the quality of recently published COVID-19 publications. During the COVID-19 pandemic, researchers have published much of their work as preprints. While preprints are an important development in scholarly publishing, they are works in progress that need further refinement to become more rigorous final products. Scholarly publishers are also taking initiatives to accelerate the publication process, for example by asking reviewers to curtail requests for additional experiments upon revision. Sacrificing rigor for haste inevitably increases the likelihood of article correction and retraction, leading to the spread of false information within supposedly trustworthy sources that have a peer-review process in place to ensure proper verification. I study the quality of COVID-19-related scholarly works by using CADRE’s datasets to identify signs of incoherence, irreproducibility, and haste.
We have developed and tested machine learning approaches to integrate quantitative markers for the diagnosis and assessment of progression of temporomandibular joint osteoarthritis (TMJ OA), as well as extended the capabilities of 3D Slicer into web-based tools and disseminated open-source image analysis tools. Our aims use data processing and in-depth analytics combined with learning using privileged information, integrated feature selection, and testing of the performance of longitudinal risk predictors. Our long-term goal is to improve the diagnosis and risk prediction of TMJ osteoarthritis in future multicenter studies.
The Spectrum of Data Science for Diagnosis of Osteoarthritis of the Temporomandibular Joint
As a board-certified ophthalmologist and glaucoma specialist, I have more than 15 years of clinical experience caring for patients with different types and complexities of glaucoma. In addition to my clinical experience, as a health services researcher I have developed expertise in several disciplines, including analyses of large health care claims databases to study the utilization and outcomes of patients with ocular diseases, racial and other disparities in eye care, and associations between systemic conditions or medication use and ocular diseases. I have learned the nuances of various data sources and ways to maximize their use to answer important and timely questions. Leveraging my background in health services research with new skills in bioinformatics and precision medicine, over the past two to three years I have been developing and growing the Sight Outcomes Research Collaborative (SOURCE) repository, a powerful tool that researchers can tap into to study patients with ocular diseases. My team and I have spent countless hours devising ways of extracting electronic health record data from Clarity, cleaning and de-identifying the data, and making it linkable to ocular diagnostic test data (OCT, HVF, biometry) and non-clinical data. Now that we have successfully developed such a resource here at Kellogg, I am collaborating with colleagues at more than two dozen academic ophthalmology departments across the country to help them extract their data in the same format and send it to Kellogg so that we can pool the data and make it accessible to researchers at all of the participating centers for research and quality improvement studies.
I am also actively exploring ways to integrate data from SOURCE into deep learning and artificial intelligence algorithms; to use SOURCE data for genotype-phenotype association studies and the development of polygenic risk scores for common ocular diseases; to capture patient-reported outcome data for the majority of eye care recipients; to enhance visualization of the data on easy-to-access dashboards to aid quality improvement initiatives; and to use the data to enhance the quality, safety, and efficiency of care delivery and to improve clinical operations.