Application of LSTM-CNN in skiing action recognition under artificial intelligence technology

Sci Rep. 2026 Mar 2;16(1):11547. doi: 10.1038/s41598-026-42324-2.

ABSTRACT

This study proposes a deep learning model integrated with visual perception to improve the accuracy of automatic skiing action recognition in complex scenarios. These scenarios include background interference from trees and snow mounds, lighting variations between cloudy and sunny days, and the body self-occlusions common in skiers' movements. The model adopts a two-stream architecture that combines a three-dimensional convolutional network with a bidirectional long short-term memory network (C3D-BiLSTM). The red-green-blue (RGB) stream extracts movement features with the three-dimensional convolutional network (C3D), while the saliency perception stream highlights motion regions using optical flow fields. A learnable weighted fusion method integrates the features of the two streams. The BiLSTM then performs sequence modeling on the fused spatiotemporal features to extract complete movement dynamics. Because the BiLSTM processes the sequence in both temporal directions, it captures action context from the past and the future simultaneously. This gives a more complete picture of the start and end states of a movement and improves recognition stability over full action cycles. Experiments on the SkiTB dataset yield three findings. (1) The proposed model outperforms the baseline models on four metrics: precision (92.8%), recall (91.9%), F1-score (0.923), and average precision (93.0%). (2) Ablation experiments verify the contribution of the BiLSTM, the saliency perception stream, and the weighted fusion method. (3) The model maintains an average recognition accuracy above 85% in cross-scenario tests (cloudy days, light snow) and cross-athlete tests, and remains stable under input noise. These results indicate that, by fully exploiting appearance and motion information, the model can effectively recognize complex skiing movements, providing new ideas and technical methods for intelligent sports analysis.
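
The learnable weighted fusion of the two streams can be illustrated with a minimal sketch. The abstract does not give the exact form of the fusion, so everything here is an assumption made for illustration: the function names `softmax` and `weighted_fusion` are hypothetical, the weights are taken to be two scalar logits normalized with a softmax, and the feature vectors are plain Python lists rather than C3D outputs. In a trained model the logits would be updated by gradient descent; here they are fixed by hand.

```python
import math


def softmax(logits):
    """Normalize a list of logits into positive weights that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_fusion(rgb_feat, sal_feat, logits):
    """Fuse two equal-length feature vectors as a convex combination.

    `logits` holds two scalars (assumed learnable): one for the RGB
    stream and one for the saliency stream.
    """
    if len(rgb_feat) != len(sal_feat):
        raise ValueError("both streams must produce features of equal length")
    w_rgb, w_sal = softmax(logits)
    return [w_rgb * r + w_sal * s for r, s in zip(rgb_feat, sal_feat)]


# Equal logits weight both streams equally, so the result is the mean.
print(weighted_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0]))  # [2.0, 3.0]
```

Normalizing the weights keeps the fused features on the same scale as the inputs, which is one plausible reason to prefer a convex combination over unconstrained weights; the paper may use a different parameterization.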

PMID:41772118 | PMC:PMC13057364 | DOI:10.1038/s41598-026-42324-2
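
As a quick consistency check on the reported metrics, the F1-score can be recomputed from the precision and recall given in the abstract, using the standard definition of F1 as the harmonic mean of the two. This is only a sketch: the helper name `f1_score` is hypothetical, and it assumes the reported F1 was derived from the same aggregate precision and recall.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # conventional value when both are zero
    return 2 * precision * recall / (precision + recall)


# Precision 92.8% and recall 91.9%, as reported in the abstract.
print(round(f1_score(0.928, 0.919), 3))  # 0.923
```

The result matches the reported F1-score of 0.923, which is expressed as a fraction while the other three metrics are given as percentages.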