His research interests lie at the intersection of signal processing, data science, machine learning, and numerical optimization. He is particularly interested in computational methods for learning low-complexity models from high-dimensional data, leveraging tools from machine learning, numerical optimization, and high-dimensional geometry, with applications in imaging sciences, scientific discovery, and healthcare. Recently, he has also become interested in understanding deep networks through the lens of low-dimensional modeling.
My research focuses on building infrastructure for public health and health science research organizations to take advantage of cloud computing, strong software engineering practices, and MLOps (machine learning operations). By equipping biomedical research groups with tools that facilitate automation, better documentation, and portable code, we can improve the reproducibility and rigor of science while scaling up the kind of data collection and analysis possible.
Research topics include:
1. Open source software and cloud infrastructure for research,
2. Software development practices and conventions that work for academic units, like labs or research centers, and
3. The organizational factors that encourage best practices in reproducibility, data management, and transparency.
The practice of science is a tug of war between competing incentives: the drive to do a lot fast, and the need to generate reproducible work. As data grows in size, code increases in complexity and the number of collaborators and institutions involved goes up, it becomes harder to preserve all the “artifacts” needed to understand and recreate your own work. Technical AND cultural solutions will be needed to keep data-centric research rigorous, shareable, and transparent to the broader scientific community.
The Ahmed lab studies behavioral neural circuits and attempts to repair them when they go awry in neurological disorders. Working with patients and with transgenic rodent models, we focus on how space, time and speed are encoded by the spatial navigation and memory circuits of the brain. We also focus on how these same circuits go wrong in Alzheimer’s disease, Parkinson’s disease and epilepsy. Our research involves the collection of massive volumes of neural data. Within these terabytes of data, we work to identify and understand irregular activity patterns at the sub-millisecond level. This requires us to leverage high-performance computing environments, and to design custom algorithmic and analytical signal processing solutions. As part of our research, we also discover new ways for the brain to encode information (how neurons encode sequences of space and time, for example) – and the algorithms utilized by these natural neural networks can have important implications for the design of more effective artificial neural networks.
My research interests are in natural language semantics and psycholinguistics, focusing on verbs. I conduct behavioral psycholinguistic experiments with methodologies such as self-paced reading and maze tasks, as well as surveys of linguistic and semantic judgments. I also study semantic variation using corpora and datasets such as the Twitter Decahose, to better understand how words have developed diverging meanings in different communities, age groups, or regions. I use primarily R and Python to collect, manage, and analyze data. I direct the UM WordLab in the linguistics department, working with students (especially undergraduates) on experimental and computational research focusing on lexical representations.
Study of Pandemic Publishing: How Scholarly Literature is Affected by COVID-19 Pandemic
This project addresses the quality of recently published COVID-19 publications. During the COVID-19 pandemic, researchers have been publishing much of their research as preprints. While preprints are an important development in scholarly publishing, they are works in progress that need further refinement to become a more rigorous final product. Scholarly publishers are also taking initiatives to accelerate the publication process, for example, by asking reviewers to curtail requests for additional experiments upon revision. Sacrificing rigor for haste inevitably increases the likelihood of article correction and retraction, leading to the spread of false information within supposedly trustworthy sources that have a peer-review process in place to ensure proper verification. I study the quality of COVID-19 related scholarly works by using CADRE’s datasets to identify signs of incoherence, irreproducibility, and haste.
We have developed and tested machine learning approaches to integrate quantitative markers for diagnosis and assessment of progression of temporomandibular joint osteoarthritis (TMJ OA), as well as extended the capabilities of 3D Slicer4 into web-based tools and disseminated open source image analysis tools. Our aims use data processing and in-depth analytics combined with learning using privileged information, integrated feature selection, and testing the performance of longitudinal risk predictors. Our long-term goals are to improve diagnosis and risk prediction of TMJ OA in future multicenter studies.
The Spectrum of Data Science for Diagnosis of Osteoarthritis of the Temporomandibular Joint
Greg’s research primarily investigates information flow in financial markets and the actions of agents in those markets – both consumers and producers of that information. His approach draws on theory from the social sciences (economics, psychology and sociology) combined with large data sets from diverse sources and a variety of data science approaches. Most projects combine data from multiple sources, including commercial databases, experimentally created data, and data extracted from sources designed for other uses (commercial media, web scraping, cellphone data, etc.). In addition to a wide range of econometric and statistical methods, his work has included applying machine learning, textual analysis, social media mining, processes for missing data, and combining mixed media.
The long temporal and large spatial scales of ecological systems make controlled experimentation difficult and the amassing of informative data challenging and expensive. The resulting sparsity and noise are major impediments to scientific progress in ecology, which therefore depends on efficient use of data. In this context, it has in recent years been recognized that the onetime playthings of theoretical ecologists, mathematical models of ecological processes, are no longer exclusively the stuff of thought experiments, but have great utility in the context of causal inference. Specifically, because they embody scientific questions about ecological processes in sharpest form—making precise, quantitative, testable predictions—the rigorous confrontation of process-based models with data accelerates the development of ecological understanding. This is the central premise of my research program and the common thread of the work that goes on in my laboratory.