Behav Res Methods. 2026 May 8;58(6):160. doi: 10.3758/s13428-026-03040-x.
ABSTRACT
We evaluated the performance and environmental robustness of three state-of-the-art gaze classification algorithms designed for infant looking-time research: iCatcher+, OWLET, and an Amazon Rekognition-based model. Gaze classifications from each algorithm were compared to human-coded data using a novel dataset (N = 47), and iCatcher+ demonstrated the highest agreement (78.4-85.4%). We then investigated the effects of environmental factors commonly encountered in webcam-based home experiments with infants. We quantified six factors: distance to the camera; infants' left-right offset; facial rotation; facial movement; facial brightness; and spatial variability in facial brightness. Suboptimal recording conditions degraded the performance of all algorithms. Even iCatcher+, while the most accurate overall, was susceptible, particularly when facial illumination was uneven (i.e., strong brightness variability) or when the infant's head moved substantially. These findings provide practical insights into the selection and deployment of gaze classification tools for infant research, and can be used to optimize instructions for participants in home webcam experiments. This study contributes to improving methodological transparency and reliability in remote infant eye-tracking research.
PMID: 42104150 | PMCID: PMC13156148 | DOI: 10.3758/s13428-026-03040-x
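The abstract quantifies "facial brightness" and "spatial variability in facial brightness" as environmental factors. One plausible operationalization (a minimal sketch, not necessarily the metric used in the paper) is the mean and standard deviation of pixel intensity over a grayscale face crop, where uneven side-lighting yields a high standard deviation:

```python
import numpy as np

def brightness_stats(face_crop):
    """Return (mean brightness, spatial brightness variability) for a
    grayscale face crop given as a 2-D array of pixel intensities.
    High variability suggests uneven illumination across the face."""
    gray = np.asarray(face_crop, dtype=float)
    mean_brightness = gray.mean()
    # Spatial variability: std of per-pixel intensity over the crop.
    brightness_variability = gray.std()
    return mean_brightness, brightness_variability

# Illustrative example: a synthetic 64x64 crop lit strongly from one side,
# producing a left-to-right illumination gradient (hypothetical data).
rng = np.random.default_rng(0)
gradient = np.tile(np.linspace(60.0, 200.0, 64), (64, 1))
crop = np.clip(gradient + rng.normal(0.0, 5.0, (64, 64)), 0, 255)

mean_b, var_b = brightness_stats(crop)
print(f"mean brightness: {mean_b:.1f}, variability: {var_b:.1f}")
```

A real pipeline would first detect and crop the face (e.g., with a face detector) and convert the frame to grayscale before applying such a measure; the threshold at which variability becomes "strong" would need to be calibrated against coded data.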