VUOD: A Versatile Unsupervised Outlier Detection Framework for Images

Accepted by ECCV 2026

¹SmartMore Corporation, ²The Hong Kong University of Science and Technology, ³University College London, ⁴Rice University

Indicates corresponding author.

Teaser

Left plot: VUOD significantly outperforms existing state-of-the-art (SOTA) methods on unsupervised outlier detection (UOD) benchmark datasets and in real-world applications. More importantly, VUOD can be seamlessly integrated into various downstream visual tasks, such as image classification (middle plot) and 3D reconstruction with VGGT (right plot).

Abstract

Unsupervised outlier detection, which automatically identifies anomalous images in the data handled by visual systems, is a significant research topic. However, most approaches are constrained to natural images because they rely on frozen, pre-trained feature extractors, so their performance cannot be maintained in practical scenarios, especially industrial inspection and medical imaging. In this work, we first revisit this task and then introduce a versatile unsupervised outlier detection framework that broadens the application domains. The core idea of this framework is to improve feature discriminativeness by exploring intrinsic distribution priors. Evaluated on 14 benchmark datasets, our proposed solution achieves state-of-the-art performance and significantly outperforms existing methods. More importantly, we show its plug-and-play property: it can be integrated into diverse visual applications, such as image classification and 3D reconstruction, to improve their robustness.

Local Separability

We adopt Affinity Propagation (AP), which splits the target dataset's features into a series of fine-grained clusters. These clusters fall into three categories: inlier clusters, each of which contains mostly inliers; outlier clusters, each of which contains mostly outliers; and mixed clusters, which contain both inliers and outliers.
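The three cluster categories can be illustrated with a small sketch. It assumes the cluster assignments are already available (for example from scikit-learn's `AffinityPropagation`, which is an assumption here and not a confirmed detail of the implementation) and uses ground-truth outlier flags purely to illustrate the taxonomy. The function name `categorize_clusters` and the `purity` threshold of 0.9 are hypothetical; the text only says "mostly".

```python
from collections import defaultdict


def categorize_clusters(cluster_ids, is_outlier, purity=0.9):
    """Label each cluster 'inlier', 'outlier', or 'mixed'.

    cluster_ids: cluster index of each sample (assumed to come from AP).
    is_outlier:  ground-truth flag per sample, used only for illustration.
    purity:      hypothetical fraction that counts as "mostly".
    """
    members = defaultdict(list)
    for cid, flag in zip(cluster_ids, is_outlier):
        members[cid].append(bool(flag))

    categories = {}
    for cid, flags in members.items():
        n_total = len(flags)
        n_outliers = sum(flags)
        if n_outliers >= purity * n_total:
            categories[cid] = "outlier"
        elif (n_total - n_outliers) >= purity * n_total:
            categories[cid] = "inlier"
        else:
            categories[cid] = "mixed"
    return categories


# Toy example: three clusters over seven samples.
example = categorize_clusters(
    cluster_ids=[0, 0, 0, 1, 1, 2, 2],
    is_outlier=[0, 0, 0, 1, 1, 0, 1],
)
print(example)
```

In a truly unsupervised setting the ground-truth flags are unavailable, so this sketch only describes what the categories mean, not how they are predicted.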

Framework

An unlabeled target dataset is first processed by feature extractors pre-trained on natural images. We then investigate the intrinsic priors of the feature space (local and global separability) and generate pseudo labels. Based on the predicted inliers and outliers, we apply contrastive learning to enhance the discriminativeness of the feature representations.
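The contrastive step can be sketched as follows. The exact loss is not specified here, so this example substitutes a generic supervised-contrastive-style objective over pseudo labels (0 for predicted inliers, 1 for predicted outliers). The function name `pseudo_label_contrastive_loss` and the `temperature` value are assumptions for illustration only.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def pseudo_label_contrastive_loss(features, pseudo_labels, temperature=0.1):
    """Average SupCon-style loss: samples sharing a pseudo label are positives."""
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, n_anchors = 0.0, 0
    for i in range(n):
        positives = [
            j for j in range(n)
            if j != i and pseudo_labels[j] == pseudo_labels[i]
        ]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n) if j != i
        }
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        n_anchors += 1
    return total / n_anchors if n_anchors else 0.0


# Toy features: two tight groups, labelled consistently with the geometry.
feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(pseudo_label_contrastive_loss(feats, [0, 0, 1, 1]))
```

The loss is low when samples with the same pseudo label are already close in feature space and high when the pseudo labels cut across the feature geometry, which is the signal a feature extractor would be trained to reduce.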

Experiments

Table 1: Average AUC results on a series of natural image datasets. The best results are highlighted in bold.

Method        Venue        STL-10  Internet  Caltech-101  CIFAR-10  CIFAR-100  MIT-Places
Deep SVDD     ICML-2018    0.593   0.674     0.745        0.533     0.575      0.539
RSRAE         ICLR-2019    0.903   0.916     0.986        0.816     0.889      0.778
GOAD          ICLR-2020    0.946   0.945     0.981        0.862     0.886      0.871
Shell-Re      TPAMI-2021   0.866   0.896     0.905        0.867     0.834      0.822
NeuTraL       ICML-2021    0.828   0.877     0.633        0.748     0.815      0.711
ICL           ICLR-2022    0.929   0.937     0.964        0.859     0.911      0.832
LUNAR         AAAI-2022    0.781   0.776     0.923        0.766     0.859      0.698
ECOD          TKDE-2022    0.932   0.925     0.976        0.888     0.905      0.846
LVAD          ECCV-2022    0.948   0.923     0.977        0.864     0.907      0.860
SLAD          ICML-2023    0.923   0.919     0.888        0.861     0.899      0.846
DeepIF        TKDE-2023    0.858   0.866     0.874        0.801     0.808      0.765
Multi-T       ECCV-2024    0.957   0.956     0.985        0.888     0.917      0.859
FlexUOD       CVPR-2025    0.980   0.981     0.983        0.942     0.947      0.928
VUOD (Ours)   -            0.994   0.991     0.994        0.972     0.981      0.971

Table 2: AUC results on industrial inspection and medical imaging (disease detection).

Type        Dataset     | ResNet-18                                    | Wide ResNet-101
                        | RSRAE    LVAD     Multi-T  FlexUOD  VUOD     | RSRAE    LVAD     Multi-T  FlexUOD  VUOD
Industrial  MVTec-AD    | 0.709    0.779    0.757    0.752    0.911    | 0.770    0.799    0.780    0.778    0.931
            BTAD        | 0.873    0.894    0.885    0.882    0.902    | 0.846    0.884    0.886    0.883    0.928
            MPDD        | 0.618    0.573    0.563    0.564    0.906    | 0.592    0.602    0.603    0.595    0.926
            MVTec-LOCO  | 0.655    0.693    0.661    0.663    0.812    | 0.629    0.695    0.657    0.660    0.832
Medical     LiverCT     | 0.666    0.663    0.696    0.708    0.813    | 0.703    0.571    0.652    0.567    0.838
            BrainMRI    | 0.541    0.623    0.583    0.584    0.804    | 0.676    0.647    0.652    0.578    0.790
            OCT         | 0.524    0.778    0.730    0.612    0.880    | 0.713    0.811    0.725    0.588    0.877
            RESC        | 0.712    0.810    0.823    0.948    0.901    | 0.662    0.746    0.644    0.589    0.931
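
For readers unfamiliar with the metric reported in Tables 1 and 2, the following is a minimal sketch of how an AUC score can be computed from per-image outlier scores. It uses the pairwise (Mann-Whitney) formulation and assumes that higher scores mean "more likely an outlier". The function name `roc_auc` is hypothetical, and this is not the evaluation code behind the tables.

```python
def roc_auc(scores, is_outlier):
    """Probability that a random outlier scores higher than a random inlier.

    Ties between an outlier and an inlier count as half a win.
    """
    outlier_scores = [s for s, y in zip(scores, is_outlier) if y]
    inlier_scores = [s for s, y in zip(scores, is_outlier) if not y]
    if not outlier_scores or not inlier_scores:
        raise ValueError("AUC needs at least one outlier and one inlier")

    wins = 0.0
    for o in outlier_scores:
        for i in inlier_scores:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(outlier_scores) * len(inlier_scores))


# Outliers (flag 1) receive the two highest scores, so the ranking is perfect.
print(roc_auc([0.9, 0.8, 0.1, 0.2], [1, 1, 0, 0]))
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; a rank-based implementation would be preferable for large datasets.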

3D Dense Point Map Reconstruction Results of VGGT

BibTeX

@article{liu2026vuod,
  author    = {Zhonghang Liu and Siyuan Chen and Jingwen Yu and Changshuo Wang and Kunyang Li and Jiangbo Lu},
  title     = {VUOD: A Versatile Unsupervised Outlier Detection Framework for Images},
  journal   = {European Conference on Computer Vision},
  year      = {2026},
}