The Corso group’s main research thrust is high-level computer vision and its relationship to human language, robotics and data science. They primarily focus on problems in video understanding such as video segmentation, activity recognition, and video-to-text; methodology, models leveraging cross-model cues to learn structured embeddings from large-scale data sources as well as graphical models emphasizing structured prediction over large-scale data sources are their emphasis. From biomedicine to recreational video, imaging data is ubiquitous. Yet, imaging scientists and intelligence analysts are without an adequate language and set of tools to fully tap the information-rich image and video. His group works to provide such a language. His long-term goal is a comprehensive and robust methodology of automatically mining, quantifying, and generalizing information in large sets of projective and volumetric images and video to facilitate intelligent computational and robotic agents that can natural interact with humans and within the natural world.