Summary
Many behaviors that are critical for survival and reproduction are expressed over extended time periods. The ability to inexpensively record and store large volumes of video data creates new opportunities to understand the biological basis of these behaviors, and simultaneously creates a need for tools that can automatically quantify behaviors from large video datasets. Here, we demonstrate that 3D Residual Networks (3D ResNets) can be used to classify an array of complex behaviors in Lake Malawi cichlid fishes. We first apply pixel-based hidden Markov modeling combined with density-based spatiotemporal clustering to identify sand disturbance events. A 3D ResNet, trained on 11,000 manually annotated video clips, then classifies the sand disturbance events into 10 fish behavior categories with >76% accuracy, distinguishing between spitting, scooping, fin swipes, and spawning. Furthermore, animal intent can be determined from these clips, as spits and scoops performed during bower construction are classified independently from those performed during feeding.

Highlights
• A dataset of more than 14,000 annotated animal behavior videos was created
• 3D residual networks can be used to classify animal behavior
• Different intents of similar behavioral actions can be distinguished
• A working solution to study long-term behaviors was established

Piscine Behavior; Zoology; Computer Science; Artificial Intelligence
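
To make the event-detection step in the Summary more concrete, the following is a minimal sketch of density-based clustering over (x, y, t) points. It is not the authors' implementation: it assumes that an upstream step (such as the pixel-based hidden Markov model) has already produced a list of pixel-change points, and it uses a textbook DBSCAN procedure as a stand-in for whichever density-based method was actually used. The function names, the `t_scale` parameter, and the sample coordinates are all hypothetical.

```python
from collections import deque
from math import sqrt

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps, t_scale):
    """Return indices of all points within eps of point i (including i).

    Distance is Euclidean over (x, y, t_scale * t); t_scale is a
    hypothetical knob that trades off spatial against temporal closeness.
    """
    xi, yi, ti = points[i]
    neighbors = []
    for j, (x, y, t) in enumerate(points):
        d = sqrt((x - xi) ** 2 + (y - yi) ** 2 + (t_scale * (t - ti)) ** 2)
        if d <= eps:
            neighbors.append(j)
    return neighbors


def dbscan_xyt(points, eps=5.0, min_pts=4, t_scale=1.0):
    """Textbook DBSCAN over (x, y, t) tuples; returns one label per point."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps, t_scale)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps, t_scale)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # core point: keep growing
        cluster += 1
    return labels


# Hypothetical pixel-change points: two compact events plus one stray point.
event_a = [(10, 10, 0), (11, 10, 1), (10, 11, 2), (11, 11, 3), (10, 10, 4)]
event_b = [(50, 50, 100), (51, 50, 101), (50, 51, 102), (51, 51, 103), (50, 50, 104)]
stray = [(30, 30, 50)]

labels = dbscan_xyt(event_a + event_b + stray, eps=5.0, min_pts=4)
print(labels)
```

In a pipeline of the kind the Summary describes, each resulting cluster would correspond to one candidate sand disturbance event, from which a short video clip could be cut and passed to the classifier. The brute-force neighbor search here is quadratic in the number of points, so a spatial index would likely be needed at real video scale.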