2017 MIDAS Symposium

Please join us for the 2017 Michigan Institute for Data Science Symposium.

The keynote speaker will be Cathy O’Neil, mathematician and best-selling author of “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.”

Other speakers include:

  • Nadya Bliss, Director of the Global Security Initiative, Arizona State University
  • Francesca Dominici, Co-Director of the Data Science Initiative and Professor of Biostatistics, Harvard T.H. Chan School of Public Health
  • Daniela Witten, Associate Professor of Statistics and Biostatistics, University of Washington
  • James Pennebaker, Professor of Psychology, University of Texas

More details are available at: http://midas.umich.edu/2017-symposium/

Call for Proposals: Amazon Research Awards, deadline 9/15/17

By | Data, Educational, Funding Opportunities, News, Research | No Comments

The Amazon Research Awards (ARA) program offers awards of up to $80,000 in cash and $20,000 in AWS promotional credits to faculty members at academic institutions in North America and Europe for research in these areas:

  • Computer vision
  • General AI
  • Knowledge management and data quality
  • Machine learning
  • Machine translation
  • Natural language understanding
  • Personalization
  • Robotics
  • Search and information retrieval
  • Security, privacy and abuse prevention
  • Speech

The ARA program funds projects conducted primarily by PhD students or postdocs, under the supervision of the faculty member awarded the funds. To encourage collaboration and the sharing of insights, each funded proposal team is assigned an appropriate Amazon research contact. Amazon invites ARA recipients to speak about their work at Amazon offices worldwide and to meet face-to-face with Amazon research groups, and it encourages them to publish their research outcomes and commit related code to open-source code repositories.

Submissions are to be made online; details, including rules and eligibility, are located here.

SAVE THE DATE: MIDAS Annual Symposium, Oct. 11

By | Events, General Interest, News | No Comments

Please join us for the 2017 Michigan Institute for Data Science Symposium.

The keynote speaker will be Cathy O’Neil, mathematician and best-selling author of “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.”

Other speakers include:

  • Nadya Bliss, Director of the Global Security Initiative, Arizona State University
  • Francesca Dominici, Co-Director of the Data Science Initiative and Professor of Biostatistics, Harvard T.H. Chan School of Public Health
  • Daniela Witten, Associate Professor of Statistics and Biostatistics, University of Washington
  • James Pennebaker, Professor of Psychology, University of Texas

More details, including how to register, will be available soon.

MIDAS working group on mobile sensor analytics

The Michigan Institute for Data Science (MIDAS) is convening a research working group on mobile sensor analytics. Mobile sensors are taking on an increasing presence in our lives. Wearable devices allow for physiological and cognitive monitoring, and behavior modeling for health maintenance, exercise, sports, and entertainment. Sensors in vehicles measure vehicle kinematics, record driver behavior, and increase perimeter awareness. Mobile sensors are becoming essential in areas such as environmental monitoring and epidemiological tracking.

There are significant data science opportunities for theory and application in mobile sensor analytics, including real-time data collection, streaming data analysis, active online learning, mobile sensor networks, and energy-efficient mobile computing.

Our working group welcomes researchers with interest in mobile sensor analytics in any scientific domain, including but not limited to health, transportation, smart cities, ecology and the environment.

Agenda:

  • Brief presentations about challenges and opportunities in mobile sensor analytics (theory and application);

  • A brief presentation of a list of funding opportunities;

  • Discussion of research ideas and collaboration in the context of grant application and industry partnership.

Please RSVP. For questions, please contact Jing Liu, Ph.D., MIDAS research specialist (ljing@umich.edu; 734-764-2750).

MIDAS starting research group on mobile sensor analytics

By | Educational, Events, General Interest, Happenings, News | No Comments

The Michigan Institute for Data Science (MIDAS) is convening a research working group on mobile sensor analytics. Mobile sensors are taking on an increasing presence in our lives. Wearable devices allow for physiological and cognitive monitoring, and behavior modeling for health maintenance, exercise, sports, and entertainment. Sensors in vehicles measure vehicle kinematics, record driver behavior, and increase perimeter awareness. Mobile sensors are becoming essential in areas such as environmental monitoring and epidemiological tracking.

There are significant data science opportunities for theory and application in mobile sensor analytics, including real-time data collection, streaming data analysis, active online learning, mobile sensor networks, and energy-efficient mobile computing.

Our working group welcomes researchers with interest in mobile sensor analytics in any scientific domain, including but not limited to health, transportation, smart cities, ecology and the environment.

Where and When:

Noon to 2 pm, April 13, 2017

School of Public Health I, Room 7625

Lunch provided

Agenda:

  • Brief presentations about challenges and opportunities in mobile sensor analytics (theory and application);

  • A brief presentation of a list of funding opportunities;

  • Discussion of research ideas and collaboration in the context of grant application and industry partnership.

Future Plans: Based on the interest of participants, MIDAS will alert researchers to relevant funding opportunities, hold follow-up meetings for continued discussion and team formation as ideas crystallize for grant applications, and work with the UM Business Engagement Center to bring in industry partnership.

Please RSVP. For questions, please contact Jing Liu, Ph.D., MIDAS research specialist (ljing@umich.edu; 734-764-2750).

Bing Liu, University of Illinois at Chicago – MIDAS Seminar Series

Bing Liu, PhD

University of Illinois, Chicago

Recorded Seminar

SLIDES

“Lifelong Machine Learning”

Abstract: Lifelong Machine Learning (or Lifelong Learning) is an advanced machine learning paradigm that learns continuously, accumulates the knowledge learned in the past, and uses it to help future learning. In the process, the learner becomes more and more knowledgeable and effective at learning. This learning ability is one of the hallmarks of human intelligence. However, the current dominant machine learning paradigm learns in isolation: given a training dataset, it runs a machine learning algorithm on the dataset to produce a model. It makes no attempt to retain the learned knowledge and use it in future learning. Although this isolated learning paradigm has been very successful, it requires a large number of training examples, and is only suitable for well-defined and narrow tasks. In comparison, we humans can learn effectively with a few examples because we have accumulated so much knowledge in the past, which enables us to learn with little data or effort. Lifelong learning aims to achieve this capability. As statistical machine learning matures, it is time to break the isolated learning tradition to study lifelong learning. Applications such as intelligent assistants, chatbots, and physical robots that interact with humans and systems in real-life environments are also calling for such lifelong learning capabilities. Without the ability to accumulate the learned knowledge and use it to learn more knowledge incrementally, a system will probably never be truly intelligent. In this talk, I will introduce lifelong learning, discuss related learning paradigms, and present some of our recent work on the topic.

 

Bio: Bing Liu is a professor of Computer Science at the University of Illinois at Chicago. He received his Ph.D. in Artificial Intelligence from the University of Edinburgh. His research interests include lifelong machine learning, sentiment analysis, data mining, machine learning, and natural language processing. He has published extensively in top conferences and journals in these areas. Two of his papers have received 10-year Test-of-Time awards from KDD, the premier conference of data mining and data science. He also authored four books: one on lifelong machine learning (coming later this month), one on Web data mining, and two on sentiment analysis. Some of his work has also been widely reported in the press, including a front-page article in the New York Times. In professional service, he serves as the current Chair of ACM SIGKDD. He has served as program chair of many leading data mining conferences, including KDD, ICDM, CIKM, WSDM, SDM, and PAKDD, as associate editor of leading journals such as TKDE, TWEB, and DMKD, and as area chair or senior PC member of numerous natural language processing, AI, Web research, and data mining conferences. He is a Fellow of ACM, AAAI and IEEE.

For more information on MIDAS or the Seminar Series, please contact midas-contact@umich.edu.

MIDAS gratefully acknowledges Northrop Grumman Corporation for its generous support of the MIDAS Seminar Series.

Tamara Kolda, PhD, Sandia National Labs – MIDAS Seminar Series

Tamara Kolda, PhD

Sandia National Labs

SLIDES

An Overview of Tensor Decompositions for Data Analysis, with Emphasis on Computation and Scalability

 

Abstract: Tensors are multiway arrays, and tensor decompositions are powerful tools for data analysis and compression. In this talk, we demonstrate the wide-ranging utility of both the canonical polyadic (CP) and Tucker tensor decompositions with examples in neuroscience, social networks, and combustion science. We explain the model-fitting challenges for CP, including nonconvexity and NP-hardness, as well as the benefits, including uniqueness of the decomposition and the interpretability of the results. We discuss the different types of tensor decompositions. For instance, a different choice of the fit metric in CP leads to Poisson Tensor Factorization for count data. Tucker has several advantages compared to CP such as the ability to easily compute the rank and even the rank required for a specific level of approximation. We present new results in scalability for both methods. For CP, we present a novel randomization method that not only improves the speed of the computation but also its robustness. For Tucker, we present results on compressing massive data sets by orders of magnitude by discovery of latent low-dimensional manifolds.
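For readers new to the CP model mentioned in the abstract, the following is a minimal sketch of how a three-way tensor is rebuilt from CP factors: each entry is a sum, over rank-one components, of products of one entry from each factor matrix. The function name cp_reconstruct and the small factor matrices A, B, and C are hypothetical and chosen only for illustration. This is not code from the talk, and it shows only the reconstruction step, not the model fitting the abstract discusses.

```python
def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factor matrices.

    A is I x R, B is J x R, C is K x R (lists of rows).
    Entry X[i][j][k] is the sum over r of A[i][r] * B[j][r] * C[k][r].
    """
    rank = len(A[0])
    return [[[sum(A[i][r] * B[j][r] * C[k][r] for r in range(rank))
              for k in range(len(C))]
             for j in range(len(B))]
            for i in range(len(A))]


# Hypothetical rank-2 factors for a 2 x 3 x 2 tensor (illustrative values only).
A = [[1.0, 0.0], [0.0, 2.0]]
B = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]
C = [[1.0, 1.0], [0.5, 2.0]]

X = cp_reconstruct(A, B, C)
print(X[1][2][1])  # only the second component contributes: 2.0 * 3.0 * 2.0
```

Storing the factors takes R(I + J + K) numbers rather than the I·J·K entries of the full tensor, which is where the compression benefit comes from when the tensor is large and the rank is small.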

Bio: Tamara G. Kolda is a Distinguished Member of the Technical Staff at Sandia National Laboratories in Livermore, CA. She holds a Ph.D. in applied mathematics from the University of Maryland at College Park and is a past Householder Postdoctoral Fellow in Scientific Computing at Oak Ridge National Laboratory. She has received several awards for her work including a 2003 Presidential Early Career Award for Scientists and Engineers (PECASE), an R&D 100 Award, and three best paper prizes. She is a Distinguished Scientist of the Association for Computing Machinery (ACM) and a Fellow of the Society for Industrial and Applied Mathematics (SIAM). She is currently a member of the SIAM Board of Trustees, Section Editor for the Software and High Performance Computing Section for the SIAM Journal on Scientific Computing, and Associate Editor for the SIAM Journal on Matrix Analysis and Applications.

For more information on MIDAS or the Seminar Series, please contact midas-contact@umich.edu. MIDAS gratefully acknowledges Northrop Grumman Corporation for its generous support of the MIDAS Seminar Series.

Jacob Abernethy, PhD, University of Michigan – MIDAS Seminar Series

Jacob Abernethy, PhD

Electrical Engineering and Computer Science

‘Statistical and Algorithmic Tools to Aid Recovery in Flint’

 Recording

Abstract: Recovery from the Flint Water Crisis has been hindered by uncertainty in both the water testing process and the causes of contamination. On the other hand, city, state, and federal officials have been collecting and organizing a significant amount of data, including many thousands of water samples, information on pipe materials, and city records. Combining all of this information, and utilizing state-of-the-art algorithmic and statistical tools, we have been able to develop a clearer picture as to the source of the problems, to accurately estimate the greatest risks, and to more efficiently direct resources towards recovery.

Bio: Jacob Abernethy is an Assistant Professor in the EECS Department at the University of Michigan, Ann Arbor. He finished his PhD in Computer Science at UC Berkeley, and was a Simons postdoctoral fellow at the University of Pennsylvania. Jake’s primary interest is in Machine Learning, and he likes discovering connections between Optimization, Statistics, and Economics.

For more information on MIDAS or the Seminar Series, please contact midas-contact@umich.edu. MIDAS gratefully acknowledges Northrop Grumman Corporation for its generous support of the MIDAS Seminar Series.

Rebecca Willett, PhD, University of Wisconsin – Shannon Centennial Lecture Series

Rebecca Willett, PhD

‘Estimating High-Dimensional Autoregressive Point Processes’

[Recording]

Abstract: Vector autoregressive models characterize a variety of time series in which linear combinations of current and past observations can be used to accurately predict future observations. For instance, each element of an observation vector could correspond to a different node in a network, and the parameters of an autoregressive model would correspond to the impact of the network structure on the time series of observations at each network node. Of particular interest are autoregressive point processes, in which observations consist of the times at which each node participates in some event or activity. Such data is common in spike train observations of biological neural networks, interactions within a social network, and pricing changes within financial networks. However, very little is known about how many events must be recorded before we may accurately infer the underlying autoregressive models. I will describe sparsity-regularized methods and associated performance bounds which provide new insight into the sample complexity of these problems in high dimensions. While sparsity regularization is well-studied in the statistics and machine learning communities, common assumptions from that literature (such as the restricted eigenvalue condition) are difficult to verify in this setting because of the correlations and heteroscedasticity of the observations. A novel analysis method leveraging a combination of martingale concentration inequalities and high-dimensional linear regression characterizes how much data must be collected to ensure reliable inference depending on the size and sparsity of the autoregressive parameters, and these bounds are supported by several empirical studies.
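As a rough illustration of the first idea in the abstract, the sketch below makes a one-step forecast with a plain linear VAR(1) model, where the next observation vector is a matrix of parameters applied to the current one. The function var1_predict, the 3-node network, and the parameter values are all hypothetical. This is the simplest linear case only, not the point-process model or the sparsity-regularized estimator described in the talk.

```python
def var1_predict(A, x):
    """One-step forecast for a linear VAR(1) model: x_next[i] = sum_j A[i][j] * x[j]."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical 3-node network: A[i][j] is the influence of node j on node i.
# Most entries are zero, i.e. the parameter matrix is sparse.
A = [[0.5, 0.0, 0.0],
     [0.25, 0.5, 0.0],
     [0.0, 0.0, 0.5]]

x_t = [2.0, 4.0, 0.0]           # current observation at each node
x_next = var1_predict(A, x_t)   # forecast for the next time step
print(x_next)
```

In the inference problem the abstract describes, the matrix A is unknown and must be estimated from recorded events; the zero pattern of A is what a sparsity-regularized method would try to recover.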

Bio: Rebecca Willett is an Associate Professor of Electrical and Computer Engineering, Harvey D. Spangler Faculty Scholar, and Fellow of the Wisconsin Institutes for Discovery at the University of Wisconsin-Madison. She completed her PhD in Electrical and Computer Engineering at Rice University in 2005 and was an Assistant and then tenured Associate Professor of Electrical and Computer Engineering at Duke University from 2005 to 2013. Willett received the National Science Foundation CAREER Award in 2007, is a member of the DARPA Computer Science Study Group, and received an Air Force Office of Scientific Research Young Investigator Program award in 2010. Willett has also held visiting researcher or faculty positions at the University of Nice in 2015, the Institute for Pure and Applied Mathematics at UCLA in 2004, the University of Wisconsin-Madison in 2003-2005, the French National Institute for Research in Computer Science and Control (INRIA) in 2003, and the Applied Science Research and Development Laboratory at GE Healthcare in 2002.

For more information on MIDAS or the Seminar Series, please contact midas-contact@umich.edu. MIDAS gratefully acknowledges Northrop Grumman Corporation for its generous support of the MIDAS Seminar Series.

Graduate Studies in Computational & Data Sciences Info Session – Central Campus

Learn about graduate programs that will prepare you for success in computationally intensive fields. Pizza and pop provided.

  • The Ph.D. in Scientific Computing is open to all Ph.D. students who will make extensive use of large-scale computation, computational methods, or algorithms for advanced computer architectures in their studies. It is a joint degree program, with students earning a Ph.D. from their current departments, “… and Scientific Computing” — for example, “Ph.D. in Aerospace Engineering and Scientific Computing.”
  • The Graduate Certificate in Computational Discovery and Engineering trains graduate students in computationally intensive research so they can excel in interdisciplinary HPC-focused research and product development environments. The certificate is open to all students currently pursuing Master’s or Ph.D. degrees at the University of Michigan. This year we will offer a new practicum option through the Multidisciplinary Design Program.
  • The Graduate Certificate in Data Science is focused on developing core proficiencies in data analytics:
    1) Modeling — Understanding of core data science principles, assumptions and applications;
    2) Technology — Knowledge of basic protocols for data management, processing, computation, information extraction, and visualization;
    3) Practice — Hands-on experience with real data, modeling tools, and technology resources.