Publisher: Academy & Industry Research Collaboration Center (AIRCC)
Abstract: This paper proposes TriT (Triplet Network Based Tracker), a target tracking
algorithm built on a triplet network, to address visual target tracking in complex scenes.
Whereas the Siamese-fc algorithm uses a two-branch feature extraction network, TriT uses
three parallel convolutional neural networks to extract features from the target in the first
frame, the target in the previous frame, and the search region of the current frame, obtaining
high-level semantic information for all three areas. The features of the first-frame target
and of the previous-frame target are then each convolved with the features of the current
search region. This gives the similarity between every position in the search region and each
of the two targets, and so yields two similarity score maps. The two low-resolution score maps
are enlarged by interpolation and then fused on the basis of their APCE values, and the
position of the tracked target in the current frame is located from the fused map.
The experiments confirm that, compared with other real-time target tracking algorithms such
as Siamese-fc, TriT has clear advantages in tracking robustness and can track effectively in
complex scenes involving illumination change, occlusion, and interference from similar
targets. The experimental results also show that the proposed algorithm has good real-time
performance.
Keywords: Target Tracking; High Robustness; Triplet Network; Score Maps Fusion
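
The pipeline described in the abstract (two cross-correlations, enlargement, APCE-based fusion, peak search) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not give the fusion rule, so the sketch assumes a weighted average with weights proportional to each map's APCE, taking APCE as the commonly used average peak-to-correlation energy, |F_max - F_min|^2 / mean((F - F_min)^2). Nearest-neighbour enlargement stands in for the unspecified interpolation, the feature maps are assumed to be already extracted, and all function names are hypothetical.

```python
import numpy as np


def cross_correlate(template, search):
    """Slide a (C, h, w) template over a (C, H, W) search feature map (valid mode)."""
    _, h, w = template.shape
    _, H, W = search.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(template * search[:, i:i + h, j:j + w])
    return out


def apce(score):
    """Average peak-to-correlation energy of a score map."""
    f_min, f_max = score.min(), score.max()
    energy = np.mean((score - f_min) ** 2)
    return (f_max - f_min) ** 2 / (energy + 1e-12)


def upsample(score, factor):
    """Nearest-neighbour enlargement (assumption: stands in for the interpolation step)."""
    return np.kron(score, np.ones((factor, factor)))


def fuse(score_first, score_prev):
    """APCE-weighted average of the two score maps (assumed fusion rule)."""
    a1, a2 = apce(score_first), apce(score_prev)
    w1 = a1 / (a1 + a2 + 1e-12)
    return w1 * score_first + (1.0 - w1) * score_prev


def locate(feat_first, feat_prev, feat_search, factor=4):
    """Return the (row, col) of the peak in the fused, enlarged score map."""
    s1 = upsample(cross_correlate(feat_first, feat_search), factor)
    s2 = upsample(cross_correlate(feat_prev, feat_search), factor)
    fused = fuse(s1, s2)
    return np.unravel_index(np.argmax(fused), fused.shape)
```

Under this assumed rule, a sharply peaked score map (high APCE) dominates a flat or multi-peaked one, so whichever template is more reliable in the current frame, first-frame or previous-frame, drives the localisation.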