{"id":18852,"date":"2026-05-06T06:54:07","date_gmt":"2026-05-06T06:54:07","guid":{"rendered":"https:\/\/cp.snarskis.lt\/index.php\/2026\/05\/06\/study-finds-ai-still-struggles-to-read-social-cues-in-video-a-hurdle-for-self-driving-cars-and-robots\/"},"modified":"2026-05-06T06:54:07","modified_gmt":"2026-05-06T06:54:07","slug":"study-finds-ai-still-struggles-to-read-social-cues-in-video-a-hurdle-for-self-driving-cars-and-robots","status":"publish","type":"post","link":"https:\/\/cp.snarskis.lt\/index.php\/2026\/05\/06\/study-finds-ai-still-struggles-to-read-social-cues-in-video-a-hurdle-for-self-driving-cars-and-robots\/","title":{"rendered":"Study finds AI still struggles to read social cues in video, a hurdle for self-driving cars and robots"},"content":{"rendered":"<p>Humans still outperform today\u2019s artificial intelligence at interpreting social interactions in moving scenes, a skill that underpins safer self-driving cars and more helpful assistive robots. New research from Johns Hopkins University suggests many leading models miss context that people grasp quickly.<\/p>\n<p>The team examined how well AI systems can infer intentions, relationships, and ongoing actions when people share a scene. These judgments help determine whether two pedestrians are chatting, about to cross the street, or reacting to one another.<\/p>\n<h2>Testing AI against human perception<\/h2>\n<p>In the study, participants watched three-second video clips and rated social features on a one-to-five scale. The clips showed people interacting, doing side-by-side activities, or acting independently.<\/p>\n<p>Researchers then asked more than 350 AI language, video, and image models to predict human ratings and expected brain responses. 
Large language models were instead given short, human-written captions describing the videos to evaluate.<\/p>\n<h2>Where models fell behind<\/h2>\n<p>People largely agreed with one another across questions, but the AI models did not show the same consistency, regardless of size or training data. Video models often struggled to describe what people were doing, and image models given still frames could not reliably tell whether people were communicating.<\/p>\n<p>Language models were comparatively better at predicting how humans would judge behavior, while video models were better at predicting neural activity in the brain. Even so, none of the model types matched human responses across the board.<\/p>\n<h2>Why reading the room is hard<\/h2>\n<p>The researchers argue the gap highlights a difference between recognizing objects in static images and understanding the unfolding story in real life. One possible cause, they suggest, is that many AI architectures are modeled on brain systems that process static images rather than on those that process dynamic social scenes.<\/p>\n<p>Lead author Leyla Isik said an autonomous vehicle needs to read people's intentions and goals, not just identify people and objects. Co-first author Kathy Garcia added that social relationships, context, and dynamics appear to be a persistent blind spot in current model development.<\/p>\n<p>The findings are being presented at the International Conference on Learning Representations, where researchers will discuss implications for AI that must interact safely with humans. 
The work adds to a growing body of evidence that high scores on benchmarks do not always translate to robust real-world understanding.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Johns Hopkins researchers found many AI models still miss social context in short videos, a key limitation for safer self-driving cars and assistive robots.<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[37],"tags":[102,10087,5430,10089,10088,10090],"miestas":[],"class_list":["post-18852","post","type-post","status-publish","format-standard","hentry","category-relationships","tag-dirbtinis-intelektas","tag-johns-hopkins-university","tag-kompiuterine-rega","tag-robotai","tag-savavaldziai-automobiliai","tag-socialiniai-signalai"],"acf":[],"_links":{"self":[{"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/posts\/18852","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/comments?post=18852"}],"version-history":[{"count":0,"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/posts\/18852\/revisions"}],"wp:attachment":[{"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/media?parent=18852"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/categories?post=18852"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/tags?post=18852"},{"taxonomy":"miestas","embeddable":true,"href":"https:\/\/cp.snarskis.lt\/index.php\/wp-json\/wp\/v2\/miestas?post=18852"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}