Taking place on April 12-14, 2023 in Ann Arbor, Michigan, this event offers outstanding graduate students and postdocs from around the US the opportunity to engage in research discussions with peers and with research leaders, and receive career mentoring, as they grow to become future research leaders in data science and Artificial Intelligence (AI).
2023 Future Leaders Summit: “Responsible data science and AI”
Data science and AI are having a significant impact on society in countless ways, leading to huge benefits in many cases. Yet increasingly complex analytical pipelines working with poorly understood heterogeneous data sets can give rise to harms in many ways. Furthermore, there could be deleterious systemic effects such as the magnification of disinformation or surveillance capitalism. There has been tremendous recent interest in understanding and managing these concerns. Together with thirty young scholars, the Summit will explore in depth topics in this broad area, including, but not limited to:
- Equity and fairness, particularly in automated decision making
- Explainability of analytical results
- Reproducibility and replication of scientific results
- Systemic issues, particularly those impacting marginalized populations
- New in 2023: Responsible AI in science and engineering. MIDAS has recently established a large postdoctoral training program (the Eric and Wendy Schmidt AI in Science Postdoctoral Fellowship, a Schmidt Futures Program). With this, we are dedicated to promoting responsible data science and AI for natural sciences and engineering.
- Research talks by the attendees and by leading researchers.
- Career mentoring sessions by University of Michigan faculty members, department chairs, and industry representatives.
- Networking sessions with other attendees, University of Michigan faculty and trainees, and industry representatives.
Responsible Data Science and AI mini-symposium
Thursday, April 13, 2023, 1:30 – 5:15 PM
Rackham Amphitheatre, 4th Floor, 915 E. Washington St., Ann Arbor
“Equity in Data Science”
Data Science continues to have a transformative impact on society, by enabling evidence-based decision making, reducing costs and errors, and improving objectivity. The techniques and technologies of data science also have enormous potential for harm if they reinforce inequity or leak private information. As a result, sensitive datasets in the public and private sector are restricted from research use, slowing progress in those areas that have the most to gain: human services in the public sector. Furthermore, the misuse of data science techniques and technologies will disproportionately harm underrepresented groups across race, gender, physical ability, sexual orientation, education, and more. These data equity issues are pervasive, and represent an existential risk for the use of data-driven methods in science and engineering. In this talk, I will describe a framework to think about these issues, and some initial directions where we have made progress.
“Building a culture of Responsible AI (and what it means for researchers)”
At Microsoft, responsible AI governance is crucial to guiding our AI innovation and creating AI that warrants people’s trust. We are putting our Responsible AI principles into practice by taking a people-centered approach to the research, development, and deployment of AI. In this talk, I will discuss Microsoft’s AI principles, the building blocks of operationalizing these principles across the company, and our journey to creating a culture of Responsible AI.
“Human-machine partnership for conservation: AI and humans combatting extinction together”
We are losing the planet’s biodiversity at an unprecedented rate and in many cases, we do not even have the basic numbers. Photographs, taken by field scientists, tourists, citizen scientists, automated cameras, incidental photographers, and collected from social media are the most abundant source of data on wildlife today. Data science and machine learning can turn massive collections of images into high-resolution information databases about wildlife, enabling scientific inquiry, conservation, and policy decisions. Machine learning and artificial intelligence have advanced significantly over the past decade. Nonetheless, to successfully address the biodiversity crisis and other societal challenges, we need the complementary capabilities of both humans and machines, in partnership.
I will show an example of how a data-driven, AI-enabled decision process becomes trustworthy by opening a wide diversity of opportunities for participation, supporting community-building, addressing the inherent data and computational biases, and providing transparent measures of performance. The community becomes the decision-maker, and AI scales the community, as well as the puzzle of data and solutions, to the planetary scale.
“From interstellar rocks to dark energy: building data science across research communities”
Over the last 10 years, data science has changed the way that we approach science and research. In astronomy alone, it has impacted how we study scales as small as asteroids within our Solar System all the way to measures of the properties of the dark energy that drives the expansion of the universe. As data science becomes integral to the scientific process, there are many opportunities and challenges that we face, from the development of ethical approaches to AI to how we educate a new generation of researchers to create robust and reproducible science. In this talk, Professor Connolly will discuss the evolution of data science at the University of Washington, including teaching data science across the campus, incubator programs to jumpstart data science research, and the impact of Data Science for Social Good programs on the local community. Taking the lens of astronomy’s approach to data science, he will focus on examples of machine learning and AI that are being used to optimize the largest astronomical survey of the night sky: the Rubin Observatory’s Legacy Survey of Space and Time, including searches for planet IX within the outer Solar System. Rubin will produce the largest multicolor movie of our night sky ever undertaken. Combining its petabytes of data with new approaches to data science will profoundly transform our understanding of the universe.
About the Faculty Mentors
Prominent researchers from academia and industry will serve as mentors for the attendees, offering career guidance and research insight. The 2023 mentors include:
Director, Translational Data Analytics Institute; Professor, Computer Science and Engineering, Evolution, Ecology, and Organismal Biology; and Electrical and Computer Engineering
The Ohio State University
Dr. Tanya Berger-Wolf is a Professor of Computer Science and Engineering, Electrical and Computer Engineering, and Evolution, Ecology, and Organismal Biology at the Ohio State University, where she is also the Director of the Translational Data Analytics Institute. She was recently awarded a $15M US National Science Foundation grant to establish a new Harnessing Data Revolution Institute, founding a new field of study: Imageomics. As a computational ecologist, her research is at the unique intersection of computer science, wildlife biology, and social sciences. Berger-Wolf is also a director and co-founder of the conservation software non-profit Wild Me, home of the Wildbook project, which brings together computer vision, crowdsourcing, and conservation. Wildbook was recently chosen by UNESCO as one of the top 100 AI projects worldwide supporting the UN Sustainable Development Goals. It has been featured in media including Forbes, The New York Times, CNN, National Geographic, and The Economist. Berger-Wolf has given hundreds of talks about her work, including at TED/TEDx and UN/UNESCO AI for the Planet.
Prior to coming to OSU in January 2020, Berger-Wolf was at the University of Illinois at Chicago. She has received numerous awards for her research and mentoring, including University of Illinois Scholar, UIC Distinguished Researcher of the Year, US National Science Foundation CAREER, Association for Women in Science Chicago Innovator, and the UIC Mentor of the Year. She is the subject matter editor for the Ecosphere emergent technologies. Berger-Wolf is a member of the CNRS International Scientific Advisory Board, Artificial Intelligence for Science, Science for Artificial Intelligence (AISSA) Centre. She served on the Global Partnership on AI (GPAI) AI on Biodiversity working group, WWF working group on AI Collaboration to End Wildlife Trafficking, AAAS-FBI Big Data in the Life Sciences and National Security Working Group, and the organizing committee of the National Academies First U.S.-Africa Frontiers of Science, Engineering, and Medicine Symposium, among many others.
Director, eScience Institute; Associate Vice Provost for Data Science; and the William P. and Ruth Gerberding University Professor
University of Washington
Andrew Connolly is a professor of astronomy at the University of Washington and the William P. and Ruth Gerberding University Professor. He is the Director of the eScience Institute at the University of Washington (UW), an institute that supports data science research and education across the UW campus. His work focuses on the development and application of statistics and machine learning for large astronomical survey data sets. For the last decade, he has worked on various aspects of the design and construction of the Vera C Rubin Observatory that will operate the Legacy Survey of Space and Time (LSST). He was awarded a National Science Foundation CAREER award in 2000 to develop visualization techniques for complex data sets, which led to the creation of Google Sky while on sabbatical at Google in 2007. In 2017 he founded the DIRAC Institute at the University of Washington, a new center that focuses on data intensive astrophysics and cosmology. From this emerged the LINCC Frameworks initiative to develop robust and scalable software to support science from the LSST. He co-authored the book “Statistics, Data Mining and Machine Learning in Astronomy”, which was awarded the International Astrostatistics Association’s Outstanding Publication Award for 2016.
Director, Michigan Institute for Data Science; Edgar F. Codd Distinguished University Professor; and Bernard A. Galler Professor of Computer Science and Engineering
University of Michigan
H. V. Jagadish is the Director of the Michigan Institute for Data Science, Edgar F. Codd Distinguished University Professor, and Bernard A. Galler Collegiate Professor of Electrical Engineering and Computer Science at the University of Michigan in Ann Arbor. Before his professorship, he was Head of the Database Research Department at AT&T Labs. Dr. Jagadish’s research focuses on two themes: the usability of database systems, query models and analytics processes to inform decision-makers, especially with big and heterogeneous data that go through many transformations; and data equity systems that center around issues of representation, diversity, fairness, transparency, and validity. Dr. Jagadish is an elected ACM Fellow and AAAS Fellow. His many academic scholarship roles include establishing the ACM SIGMOD Digital Review, founding the Proceedings of the Very Large Database Endowment (PVLDB), and serving on the boards of the Computing Research Association (CRA) and the Very Large Database Endowment.
Arthur F. Thurnau Professor; Professor of Climate and Space Sciences and Engineering; and Director, Academic Program, Undergraduate Education, College of Engineering
University of Michigan
Mark Moldwin conducts space physics research attempting to understand the flow of energy, mass, and momentum from the Sun through the Earth’s space environment. Specifically, he is interested in magnetic structures and ULF waves in the heliosphere and magnetosphere, the distribution of plasma in the inner magnetosphere, the coupling of energy between the ionosphere and magnetosphere, and space weather. Moldwin develops and uses magnetometers that fly on spacecraft and are deployed around the world, remote sensing data (EUV and radio waves), and modeling (including machine learning and sensor fusion algorithm development). He also conducts faculty (K-12 and university) professional development around inclusive mentoring and teaching and formal and informal audience educational outreach.
Associate Director, Michigan Institute for Data Science and Associate Professor, Communication and Media
University of Michigan
Josh Pasek is Associate Professor of Communication Studies and Faculty Associate in the Center for Political Studies at the University of Michigan, and a MIDAS Associate Director. His substantive research explores how new media and psychological processes each shape political attitudes, public opinion, and political behaviors. Josh also examines issues in the measurement of public opinion including techniques for incorporating social trace data as a means of tracking attitudes and behaviors. Current research evaluates whether the use of online social networking sites such as Facebook and Twitter might be changing the political information environment, and assesses the conditions under which nonprobability samples, such as those obtained from big data methods or samples of Internet volunteers can lead to conclusions similar to those of traditional probability samples. His work has been published in Public Opinion Quarterly, Political Communication, Communication Research, and the Journal of Communication among other outlets. He also maintains two R packages for producing survey weights (anesrake) and analyzing weighted survey data (weights).
Senior Program Manager, Office of Responsible AI
Dr. Ellie Sakhaee is a Senior Program Manager at Microsoft’s Office of Responsible AI. She collaborates with teams across the company as well as external partners to operationalize Microsoft AI principles within products. She is also a Fellow at EqualAI, where she writes about trustworthy AI. Prior to Microsoft, Ellie worked as a Technology Policy Fellow, advancing technology public policy, and as a Lead Machine Learning Scientist, working on self-driving vehicles. Ellie earned her Ph.D. from the University of Florida in the field of Mathematical Signal Processing (compressed sensing). She also holds an MBA and master’s degrees in Computer Science and Computer Engineering. She is a co-author of the AI Index Report 2022, an annual report from the Stanford HAI group on recent trends in AI technology. Some of her thoughts on the future of Responsible AI are reflected in her interview with All Tech Is Human.
Our 2023 cohort comprises 30 outstanding graduate students and postdoctoral fellows from 19 U.S. universities.
- Fatima Ahsan, Rice University
- Abdullah Aman Tutul, Texas A&M University
- Saskia Comess, Stanford University
- Niloufar Dousti Mousavi, University of Illinois at Chicago
- Dara Farrell, University of Washington
- Ritika Giri, Northwestern University
- Victoria Granja, Rice University
- Rafal Kocielnik, California Institute of Technology
- Connor Lawless, Cornell University
- Yuanyuan Lei, Texas A&M University
- Joel E. Martinez, Harvard University
- Joseph McBride, Jackson State University
- Swapneel Mehta, New York University
- Vishwali Mhasawade, New York University
- Bernardo Modenesi, University of Michigan
- Dane Morey, The Ohio State University
- Chinasa T. Okolo, Cornell University
- Izunna Okpala, University of Cincinnati
- Evan Reynolds, University of Michigan
- Guangchun (Grant) Ruan, Massachusetts Institute of Technology
- Roshni Sahoo, Stanford University
- Saurav Sengupta, University of Virginia
- Arun Sharma, University of Minnesota
- Somya Sharma Chatterjee, University of Minnesota
- Mengyi Sun, Northwestern University
- Kshitij Tayal, University of Minnesota
- Zhanyu Wang, Purdue University
- Nan Wu, New York University
- Kelly Zhang, Harvard University
- Xingmeng Zhao, University of Texas at San Antonio
Follow-Up Sessions & News
Future Leaders C.V. and Web Presence Workshop
May 23, 2023: Offered to the 2023 cohort and all Future Leaders Summit alumni, this MIDAS workshop, led by Dr. Ken Reid (Data Scientist, MIDAS) and Dr. Joshua Pasek (Associate Professor of Communication and Media, Associate Professor of Political Science, and Associate Director, MIDAS), offered practical insights and strategies to help early-career data science trainees enhance their visibility and marketability in academia. The session commenced with an overview of the numerous sources that facilitate academic works, including but not limited to GitHub, ResearchGate, Scopus, and Google Scholar. The event featured a session dedicated to crafting effective CVs and cover letters, as well as navigating the Ph.D. to postdoc career transition. Participants were introduced to specific tools, such as Overleaf, ChatGPT, and Hemingway, that are conducive to creating high-quality CVs and cover letters. Attendees had the opportunity to interact with and gain advice from experienced academics, who shared their own professional journeys and answered questions. Attendees gained improved skill sets and were provided with relevant resources to augment their professional portfolios and effectively communicate their research success.
Future Leaders Three Minute Thesis (3MT)
What you will get: Three Minute Thesis (3MT) is an academic competition that challenges research candidates to condense their thesis into a concise and engaging presentation that can be delivered in three minutes or less, and understood by a non-specialist audience. Through this competition, participants can enhance their academic, presentation, and research communication skills by learning to effectively communicate complex research in a clear and engaging manner. Additionally, the competition provides an excellent opportunity for future leaders to exchange ideas and discuss their research with each other, fostering collaboration and introducing others to their work. The judges evaluate the presentations based on the clarity, engagement, and communication of the research topic, with the top three winners receiving the prestigious MIDAS 2023 3MT gold, silver, and bronze awards. Winning the competition can further enhance the professional development and networking prospects of the research candidates.
Evan Reynolds, Ph.D. writes on his experience as an attendee of the MIDAS Annual Future Leaders Summit, sharing his perspective on the theme of ‘Responsible Data Science and AI’ and how it affects his study of diabetes complications.