Audio-visual integration interacts with attentional mechanisms. Salient auditory stimuli automatically draw attention to an audio-visual event, while spatial attention can modulate audio-visual integration. Attention induced by auditory inputs (sound-driven attention) facilitates visual perception. Similarly, visual attention improves performance on a visual task. However, the difference between attention driven by auditory cues and attention driven by visual cues is not clear. When visual attention facilitates visual perception, there is a trade-off between spatial and temporal resolution. Audition, by contrast, has superior temporal resolution to vision. In the present study, we investigated the difference between auditory and visual cue-driven attention with respect to this trade-off. The results indicated that visual cueing increased spatial resolution but decreased temporal resolution. Auditory cueing, on the other hand, affected the efficiency of visual processing (i.e., response time) for temporal gap detection. These findings suggest that auditory cueing capitalizes on resources available for visual processing. In contrast, visual cueing may increase activation of the spatial channel instead of inhibiting the temporal channel, as proposed in a previous study. Overall, there appear to be clear differences between the mechanisms involved in auditory and visual cue-driven attention.