ENTER: Event Based Interpretable Reasoning for VideoQA
📰 ArXiv cs.AI
arXiv:2501.14194v2 Announce Type: replace-cross Abstract: In this paper, we present ENTER, an interpretable Video Question Answering (VideoQA) system based on event graphs. Event graphs convert videos into graphical representations, where video events form the nodes and event-event relationships (temporal/causal/hierarchical) form the edges. This structured representation offers many benefits: 1) Interpretable VideoQA via generated code that parses event-graph; 2) Incorporation of contextual vis
DeepCamp AI