Summary
Current statistical models for drug response prediction and biomarker identification fall short in leveraging the shared and unique information from various cancer tissues and multi-omics profiles. We developed the mix-lasso model, which introduces an additional sample group penalty term to capture tissue-specific effects of features on pan-cancer response prediction. The mix-lasso model takes into account both the similarity between drug responses (i.e., multi-task learning) and the heterogeneity between multi-omics data (i.e., multi-modal learning). When applied to a large-scale pharmacogenomics dataset from the Cancer Therapeutics Response Portal, mix-lasso enabled accurate drug response predictions and identification of tissue-specific predictive features in the presence of various degrees of missing data, drug-drug correlations, and high-dimensional and correlated genomic and molecular features that often hinder the use of statistical approaches in drug response modeling. Compared to the tree lasso model, mix-lasso identified a smaller number of tissue-specific features, hence making the model more interpretable and stable for drug discovery applications.

Highlights
• Pan-cancer cell lines provide a test bench for exploring gene-drug relationships
• Multi-omics data were integrated with pharmacological profiles for joint modeling
• Mix-lasso identifies tissue-specific biomarkers predictive of multi-drug responses
• Mix-lasso provides a small number of stable features for drug discovery applications

Drugs; Bioinformatics; Omics