AD Research Hub — Anomaly Detection in Computer Vision
Kristen Grauman

The University of Texas at Austin

Top CV Researchers | Frontier Research Map | Score: 9 | h-index: 99 | 36,396 citations
Homepage | UT Austin homepage | Gen4AVC workshop page | Slides PDF | Semantic Scholar
Top CV Researcher — Rank #6 (top 10)

Professor of Computer Science

Contributions

visual recognition, egocentric vision, embodied AI, video understanding

Why Selected

A leading figure in visual recognition and egocentric/embodied vision, with strong recent visibility in video-centered vision research.

Score Breakdown

historical impact: 2
recent visibility: 3
current influence: 2
asset availability: 2
total: 9

Frontier Research Map

Featured Work

Discovering and Generating Action Sounds from Video

Source: official workshop page, 2025-10-19

Why Now

This work is useful because it shows that the frontier extends beyond image-text pairing to temporally grounded multimodal perception rooted in action.

Key Ideas

  • Egocentric video is a privileged route to affordances, intent, and physical interaction.
  • Audio is a strong but underused signal for grounding actions and state changes in video.
  • Current large multimodal models still miss temporal structure and evidence grounding in long video.

Open Questions

  • What is the right unit of understanding in video: frame, clip, state change, or action program?
  • How should multimodal systems represent causally meaningful sound rather than mere correlation?
  • Can egocentric video become for embodiment what web text was for language models?
Canonical CV Leaders (high confidence)
Cross-References

Themes

egocentric vision, video, audio-visual learning