Related Papers
Yanning Hou, Peiyuan Li, Zirui Liu, et al.
Zero-shot anomaly detection (ZSAD) requires detecting and localizing anomalies without access to target-class anomaly samples. Mainstream methods rely on vision-language models (VLMs) such as CLIP: they build hand-crafted or learned prompt sets for normal and abnormal semantics, then compute image-text similarities for open-set discrimination. While effective, this paradigm depends on a text encoder and cross-modal alignment, which can lead to training instability and parameter redundancy. This work revisits the necessity of the text branch in ZSAD and presents VisualAD, a purely visual framework.
Yuzhi Huang, Chenxin Li, Haitao Zhang, et al.
Fei Li, Wenxuan Liu, Jingjing Chen, et al.
Snehashis Majhi, Giacomo D'Amicantonio, Antitza Dantcheva, et al.
Han Hu, Wenli Du, Peng Liao, et al.
Wenbing Zhu, Lidong Wang, Ziqing Zhou, et al.