iSafetyBench: A video-language benchmark for safety in industrial environment

University of Central Florida
VISION'25 Workshop - ICCVW 25
Figure: example clips of normal scenarios and dangerous/hazard scenarios.

Figure: performance comparison of models.

Abstract

Recent advances in vision-language models (VLMs) have enabled impressive generalization across diverse video understanding tasks under zero-shot settings. However, their capabilities in high-stakes industrial domains, where recognizing both routine operations and safety-critical anomalies is essential, remain largely underexplored. To address this gap, we introduce iSafetyBench, a new video-language benchmark specifically designed to evaluate model performance in industrial environments across both normal and hazardous scenarios. iSafetyBench comprises 1,100 video clips sourced from real-world industrial settings, annotated with open-vocabulary, multi-label action tags spanning 98 routine and 67 hazardous action categories. Each clip is paired with multiple-choice questions for both single-label and multi-label evaluation, enabling fine-grained assessment of VLMs in both standard and safety-critical contexts. We evaluate eight state-of-the-art video-language models under zero-shot conditions. Despite their strong performance on existing video benchmarks, these models struggle with iSafetyBench, particularly in recognizing hazardous activities and in multi-label scenarios. Our results reveal significant performance gaps, underscoring the need for more robust, safety-aware multimodal models for industrial applications. iSafetyBench provides a first-of-its-kind testbed to drive progress in this direction.
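To make the single-label versus multi-label distinction concrete, the following is a minimal sketch of how multiple-choice answers of this kind could be scored. It is an illustration under stated assumptions, not the benchmark's evaluation code: the `MCQItem` schema, the function names, and the choice of exact-match and Jaccard scoring are all hypothetical, and the metrics actually used for iSafetyBench may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MCQItem:
    """One multiple-choice question for a clip (hypothetical schema)."""
    clip_id: str
    options: tuple[str, ...]   # candidate action tags shown to the model
    correct: frozenset[str]    # one tag for single-label, several for multi-label


def exact_match(predicted: set[str], item: MCQItem) -> float:
    """Return 1.0 only if the predicted options equal the ground-truth set."""
    return 1.0 if frozenset(predicted) == item.correct else 0.0


def jaccard(predicted: set[str], item: MCQItem) -> float:
    """Partial credit: |intersection| / |union| of predicted and true options."""
    union = predicted | item.correct
    if not union:
        return 1.0  # both empty: treat as a perfect match
    return len(predicted & item.correct) / len(union)


def mean_score(items, predictions, metric) -> float:
    """Average a per-question metric over all items."""
    scores = [metric(predictions[item.clip_id], item) for item in items]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up labels for illustration only; not taken from the dataset.
    options = ("welding", "operating forklift", "worker falling", "sparks near fuel")
    items = [
        MCQItem("clip_a", options, frozenset({"welding"})),                      # single-label
        MCQItem("clip_b", options, frozenset({"welding", "sparks near fuel"})),  # multi-label
    ]
    predictions = {
        "clip_a": {"welding"},
        "clip_b": {"welding", "worker falling"},
    }
    print("exact match:", mean_score(items, predictions, exact_match))
    print("jaccard:", mean_score(items, predictions, jaccard))
```

Under exact match, a multi-label question counts as correct only when every true tag is selected and no extra tag is added, which is one reason multi-label accuracy can sit well below single-label accuracy. A set-overlap score such as Jaccard is one possible way to give partial credit.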

Comparison with existing datasets

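| Dataset | Normal scenarios | Dangerous scenarios | Multi-label | Textual data | Environment type(s) | Set type | # Normal actions | # Non-critical anomaly actions | # Danger/hazard actions | # High-level categories |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UCF-Crime | ✔ | ✔ | ✖ | ✖ | Multiple | Closed | 0 | 0 | 13 | 0 |
| InHARD | ✔ | ✖ | ✖ | ✖ | Single | Closed | 74 | 0 | 0 | 14 |
| TIMo | ✔ | ✖ | ✖ | ✖ | Single | Closed | 35 | 21 | 0 | 20 |
| OpenPack | ✔ | ✔ | ✖ | ✖ | Single | Closed | 43 | 44 | 1 | 17 |
| Safe/Unsafe Behaviours | ✔ | ✔ | ✖ | ✖ | Single | Closed | 4 | 0 | 4 | 2 |
| Construction Meta Action | ✔ | ✔ | ✖ | ✖ | Single | Closed | 1 | 0 | 6 | 0 |
| iSafetyBench (Ours) | ✔ | ✔ | ✔ | ✔ | Multiple | Open | 98 | 0 | 67 | 18 |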

BibTeX

        
        @InProceedings{Abdullah_2025_ICCV,
            author    = {Abdullah, Raiyaan and Rawat, Yogesh Singh and Vyas, Shruti},
            title     = {iSafetyBench: A video-language benchmark for safety in industrial environment},
            booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
            month     = {October},
            year      = {2025},
            pages     = {}
        }