Omid Dehzangi


Wearable health technology is drawing significant attention, and for good reason. The pervasive nature of such systems, providing ubiquitous access to continuous, personalized data, will transform the way people interact with each other and with their environment. The information extracted from these systems will enable emerging applications in healthcare, wellness, emergency response, fitness monitoring, elderly care support, long-term preventive chronic care, assistive care, smart environments, sports, gaming, and entertainment. These applications create many new research opportunities and draw researchers from a variety of disciplines into data science, the methodological framework for data collection, data management, data analysis, and data visualization. Despite this ground-breaking potential, a number of interesting challenges must be addressed in order to design and develop wearable medical embedded systems. Because wearable processing architectures offer limited resources, power efficiency is required to allow unobtrusive, long-term operation of the hardware. In addition, the data-intensive nature of continuous health monitoring demands efficient signal processing and data analytic algorithms for real-time, scalable, reliable, accurate, and secure extraction of relevant information from an overwhelmingly large amount of data. Extensive research in their design, development, and assessment is therefore necessary.

Embedded Processing Platform Design

The majority of my work concentrates on designing wearable embedded processing platforms that shift the conventional paradigm from hospital-centric healthcare, with its episodic and reactive focus on disease, to patient-centric, home-based healthcare. This alternative demands outstanding specialized design in terms of hardware, software development, signal processing and uncertainty reduction, data analysis, predictive modeling, and information extraction. The objective is to reduce costs and improve the effectiveness of healthcare through proactive early monitoring, diagnosis, and treatment of disease (i.e., prevention), as shown in Figure 1.


Figure 1: Embedded processing platform in healthcare
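By way of illustration, the sketch below shows the kind of lightweight, windowed feature extraction an on-body processor might run to reduce a raw sensor stream to a compact feature sequence. It is a minimal, generic sketch: the single-channel stream, window size, and hop length are hypothetical placeholders, not the design of the platform described above.

```python
from collections import deque
import math

def windowed_features(samples, window_size=128, step=64):
    """Slide a fixed-size window over a 1-D sensor stream and emit
    compact per-window features instead of raw samples.

    samples      -- iterable of floats (e.g., one accelerometer axis)
    window_size  -- samples per analysis window (hypothetical value)
    step         -- hop between consecutive windows (hypothetical value)
    """
    window = deque(maxlen=window_size)
    for i, x in enumerate(samples):
        window.append(x)
        # Emit features once the window is full, then every `step` samples.
        if len(window) == window_size and (i + 1 - window_size) % step == 0:
            n = len(window)
            mean = sum(window) / n
            var = sum((v - mean) ** 2 for v in window) / n
            rms = math.sqrt(sum(v * v for v in window) / n)
            yield {"end_index": i, "mean": mean, "variance": var, "rms": rms}

# Example: reduce a simulated 1,024-sample stream to a handful of feature vectors.
if __name__ == "__main__":
    import random
    stream = (math.sin(0.05 * t) + 0.1 * random.random() for t in range(1024))
    for feats in windowed_features(stream):
        print(feats)
```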


Walter Mebane


My primary project, election forensics, concerns using statistical analysis to try to determine whether election results are accurate. Election forensics methods use data about voters and votes that are as highly disaggregated as possible. Typically this means polling station (precinct) data, sometimes ballot box data. Data can comprise hundreds of thousands or millions of observations. Geographic information is used, with geographic structure being relevant. Estimation involves complex statistical models. Frontiers include: distinguishing frauds from the effects of strategic behavior; estimating fraud probabilities for individual observations (e.g., polling stations); and adjoining nonvoting data, such as data from in-person election observations.

Hotspot Analysis, Extreme Fraud Probabilities, South Africa, 2014
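One widely used screening statistic in this field (a generic illustration, not necessarily one of the models described above) is the second-digit Benford's-law (2BL) test, which compares the second digits of disaggregated vote counts to the Benford reference distribution; large deviations flag results that merit closer inspection. The sketch below assumes a plain list of polling-station vote totals and reports a chi-squared statistic against that reference.

```python
import math
from collections import Counter

def second_digit_benford_expected():
    """Benford's-law probabilities for the second digit d = 0..9:
    P(d) = sum_{k=1..9} log10(1 + 1 / (10*k + d))."""
    return [sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
            for d in range(10)]

def second_digit_test(vote_counts):
    """Chi-squared statistic comparing the observed second digits of
    polling-station vote counts against the Benford reference.
    Counts below 10 have no second digit and are skipped."""
    digits = [int(str(v)[1]) for v in vote_counts if v >= 10]
    n = len(digits)
    observed = Counter(digits)
    expected = second_digit_benford_expected()
    chi2 = sum((observed.get(d, 0) - n * expected[d]) ** 2 / (n * expected[d])
               for d in range(10))
    return chi2, n  # compare chi2 to a chi-squared reference with 9 df

# Example with made-up counts; real analyses use full precinct returns.
if __name__ == "__main__":
    counts = [132, 208, 87, 455, 91, 310, 276, 154, 98, 203, 187, 512]
    chi2, n = second_digit_test(counts)
    print(f"2BL chi-squared = {chi2:.2f} over {n} stations (df = 9)")
```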


Matthew Johnson-Roberson


The increasing economic and environmental pressures facing the planet require cost-effective technological solutions to monitor and predict the health of the earth. Increasing volumes of data and the geographic dispersion of researchers and data-gathering sites have created new challenges for computer science. Remote collaboration and data abstraction offer the promise of aiding science for great social benefit. My research in this field has focused on developing novel methods for the visualization and interpretation of massive environments from multiple sensing modalities, and on creating abstractions and reconstructions that allow natural scientists to predict and monitor the earth through remote collaboration. By promoting these economically efficient solutions, my work aims to increase access to hundreds of scientific sites instantly, without travel. In undertaking this challenge, I aim constantly to engage in research that will benefit society.

Traditional marine science surveys capture large amounts of data regardless of the contents or potential value of the data. In an exploratory context, scientists are typically interested in reviewing and mining data for unique geological or benthic features. This can be a difficult and time-consuming task when confronted with thousands or tens of thousands of images. The technique shown here uses information-theoretic methods to identify unusual images within large data sets.
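As a rough sketch of one information-theoretic screen (a simplified illustration, not the specific method used in these surveys), the code below scores each image by the Kullback-Leibler divergence between its grayscale intensity histogram and the mean histogram of the collection, and flags the highest-scoring images for review. The image format, bin count, and synthetic example are assumptions.

```python
import numpy as np

def intensity_histogram(image, bins=64):
    """Normalized grayscale intensity histogram of a 2-D image array in [0, 1]."""
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    hist = hist.astype(float) + 1e-9          # avoid zero bins
    return hist / hist.sum()

def unusual_images(images, top_k=5, bins=64):
    """Rank images by KL divergence from the collection's mean histogram.

    images -- iterable of 2-D float arrays with values in [0, 1]
    Returns indices of the top_k most 'surprising' images and all scores."""
    hists = np.array([intensity_histogram(img, bins) for img in images])
    mean_hist = hists.mean(axis=0)
    # KL(p_i || mean): how poorly the collection-wide model explains image i.
    kl = np.sum(hists * np.log(hists / mean_hist), axis=1)
    return np.argsort(kl)[::-1][:top_k], kl

# Example on synthetic data: mostly mid-gray images plus one bright outlier.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    imgs = [np.clip(rng.normal(0.5, 0.05, (64, 64)), 0, 1) for _ in range(20)]
    imgs.append(np.clip(rng.normal(0.9, 0.05, (64, 64)), 0, 1))  # outlier
    top, scores = unusual_images(imgs, top_k=3)
    print("Most unusual image indices:", top)
```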


Alfred Hero


Alfred O. Hero, PhD, is the R. Jamison and Betty Williams Professor of Engineering at the University of Michigan and co-Director of the Michigan Institute for Data Science.

The Hero group focuses on building foundational theory and methodology for data science and engineering. Data science is the methodological underpinning for data collection, data management, data analysis, and data visualization. Lying at the intersection of mathematics, statistics, computer science, information science, and engineering, data science has a wide range of applications in areas including public health and personalized medicine, brain sciences, environmental and earth sciences, astronomy, materials science, genomics and proteomics, computational social science, business analytics, computational finance, information forensics, and national defense. The Hero group is developing theory and algorithms for data collection, analysis, and visualization that use statistical machine learning and distributed optimization. These are being applied to network data analysis, personalized health, multi-modality information fusion, data-driven physical simulation, materials science, dynamic social media, and database indexing and retrieval. Several thrusts are being pursued:

  1. Development of tools to extract useful information from high-dimensional datasets with many variables and few samples (large p, small n). A major focus here is the mathematics of “big data” that can establish fundamental limits, helping data analysts “right size” their sample for reliable extraction of information. Areas of interest include correlation mining in high dimension, i.e., inference of correlations between the behaviors of multiple agents from limited statistical samples (a minimal sketch of correlation thresholding appears after this list), and dimensionality reduction, i.e., finding low-dimensional projections of the data that preserve the information relevant to the analyst.
  2. Data representation, analysis, and fusion on non-linear, non-Euclidean structures. Examples of such data include: data that come in the form of a probability distribution or histogram (which lies on a hypersphere under the Hellinger metric); data defined on graphs or networks (combinatorial, non-commutative structures); and data on spheres with point-symmetry group structure, e.g., quaternion representations of orientation or pose.
  3. Resource-constrained, information-driven adaptive data collection. We are interested in sequential data collection strategies that use feedback to successively select among a number of available data sources so as to minimize energy, maximize information gain, or minimize delay to decision. A principal objective has been to develop good proxies for the reward or risk associated with collecting data for a particular task (detection, estimation, classification, tracking). We are developing strategies for model-free empirical estimation of surrogate measures including Fisher information, Rényi entropy, mutual information, and Kullback-Leibler divergence (a simple plug-in version of such estimators is sketched after this list). In addition, we are quantifying the loss of plan-ahead sensing performance due to the use of such proxies.
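To make the correlation mining in item 1 concrete, here is a minimal sketch under simplifying assumptions: estimate the sample correlation matrix from a small number of samples of many variables and retain only the entries whose magnitude exceeds a threshold, producing a sparse correlation graph. The threshold and the synthetic data are hypothetical placeholders; in the large-p, small-n regime the threshold must be set high enough to screen out the spurious correlations that proliferate when p greatly exceeds n.

```python
import numpy as np

def correlation_graph(X, threshold=0.8):
    """Threshold the sample correlation matrix into a sparse graph.

    X         -- (n, p) data matrix: n samples of p variables (n << p)
    threshold -- keep edges with |sample correlation| above this value
    Returns a list of (i, j, r_ij) edges between variable indices."""
    R = np.corrcoef(X, rowvar=False)          # p x p sample correlation matrix
    iu = np.triu_indices_from(R, k=1)         # upper triangle, no self-edges
    return [(i, j, R[i, j])
            for i, j in zip(*iu) if abs(R[i, j]) > threshold]

# Example: 20 samples of 500 variables with one planted correlated pair.
if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, p = 20, 500
    X = rng.normal(size=(n, p))
    X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)   # variables 0 and 1 correlate
    for i, j, r in correlation_graph(X, threshold=0.9):
        print(f"edge ({i}, {j}) with sample correlation {r:+.2f}")
```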
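For the model-free estimation mentioned in item 3, one elementary approach (a basic plug-in sketch, not the group's own estimators) is to bin the samples into a histogram and evaluate the information measure on the empirical frequencies. The bin counts and sample data below are assumptions chosen for illustration.

```python
import numpy as np

def renyi_entropy(samples, alpha=2.0, bins=32):
    """Plug-in estimate of the Rényi entropy of order alpha (alpha != 1)
    of the binned (discretized) distribution of 1-D samples."""
    counts, _ = np.histogram(samples, bins=bins)
    p = counts[counts > 0] / len(samples)
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

def kl_divergence(samples_p, samples_q, bins=32):
    """Plug-in estimate of KL(P || Q) from two 1-D numpy sample arrays,
    using a shared set of histogram bins."""
    lo = min(samples_p.min(), samples_q.min())
    hi = max(samples_p.max(), samples_q.max())
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(samples_p, bins=edges)
    q, _ = np.histogram(samples_q, bins=edges)
    p = p / p.sum()
    q = (q + 1e-9) / (q.sum() + 1e-9 * bins)   # smooth to avoid division by zero
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    x = rng.normal(0.0, 1.0, 5000)
    y = rng.normal(0.5, 1.0, 5000)
    print("Renyi entropy (alpha=2) of x:", round(renyi_entropy(x), 3))
    print("KL(x || y) estimate:", round(kl_divergence(x, y), 3))
```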
Correlation mining pipeline transforms raw high dimensional data (bottom) to information that can be rendered in interpretable sparse graphs and networks, simple screeplots, and denoised images (top). The pipeline controls data collection, feature extraction and correlation mining by integrating domain information and its assessed value relative to the desired task (on left) and accounting for constraints on data collection budget and uncertainty bounds (on right).
