The heterogeneity gap between modalities is the key problem in cross-modal retrieval. Overcoming it requires mining the latent correlations between modalities. At the same time, the semantic information in class labels can be used to reduce the semantic gap between data of different modalities and to make heterogeneous data interdependent and interoperable. To fully exploit the latent correlations between modalities, we propose a cross-modal retrieval framework based on graph regularization and modality dependence (GRMD). First, taking both feature correlation and semantic correlation into account, we learn different projection matrices for different retrieval tasks, such as image-query-text (I2T) and text-query-image (T2I). Second, we use the internal structure of the original feature space to construct an adjacency graph with semantic information constraints, which brings heterogeneous data with different labels closer to their corresponding semantic information. Experimental results on three widely used datasets demonstrate the effectiveness of our method.
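To make the graph regularization idea concrete, the following is a minimal sketch and not the GRMD objective itself. It assumes one common instantiation: a k-nearest-neighbour graph built in the original feature space, with an edge kept only when the two samples share a class label, and a Laplacian smoothness penalty on the projected features. The function names (`knn_label_graph`, `graph_penalty`), the 0/1 edge weights, the same-label rule, and the toy data are all illustrative assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_label_graph(features, labels, k):
    """Build a symmetric 0/1 adjacency matrix (assumed construction).

    Samples i and j are linked if j is among the k nearest neighbours
    of i in the original feature space AND both carry the same label.
    """
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Rank every other sample by its distance to sample i.
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(features[i], features[j]),
        )
        for j in order[:k]:
            if labels[i] == labels[j]:  # semantic (label) constraint
                W[i][j] = W[j][i] = 1.0
    return W


def graph_penalty(W, projected):
    """Laplacian smoothness term: 0.5 * sum_ij W_ij * ||p_i - p_j||^2.

    This equals tr(P^T L P) with L = D - W, so it is small when linked
    samples stay close after projection.
    """
    n = len(W)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += W[i][j] * sq_dist(projected[i], projected[j])
    return 0.5 * total


# Hypothetical toy data: four samples, two classes.
features = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels = [0, 0, 1, 1]
W = knn_label_graph(features, labels, k=1)

# A projection that keeps same-class neighbours together ...
aligned = [(1, 0), (1, 0), (0, 1), (0, 1)]
# ... and one that separates them.
scattered = [(1, 0), (0, 1), (0, 1), (1, 0)]

print(graph_penalty(W, aligned), graph_penalty(W, scattered))
```

In this sketch the penalty is lower for the projection that maps linked, same-class samples to the same point. That is the behaviour a graph regularizer is meant to reward when it is added to the loss used to learn a projection matrix.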