Single-agent vs. Multi-agents for Automated Video Analysis of On-Screen Collaborative Learning Behaviors

📰 ArXiv cs.AI

arXiv:2604.03631v1 (Announce Type: new)

Abstract: On-screen learning behavior provides valuable insights into how students seek, use, and create information during learning. Analyzing on-screen behavioral engagement is essential for capturing students' cognitive and collaborative processes. The recent development of Vision Language Models (VLMs) offers new opportunities to automate the labor-intensive manual coding often required for multimodal video data analysis. In this study, we compared the p

Published 7 Apr 2026
