With the advent of high-throughput technology in medical data acquisition, there has been an explosion in the quantity of multimodal health data. We focus on imaging data and develop data-driven algorithms and methodologies to extract clinically relevant information from images. Using ideas from machine learning, computer vision, and statistics, we aim to discover imaging biomarkers that can predict clinical outcomes of interest in a wide range of diseases, such as cancer, diabetes, and psychiatric disorders.