26 - 30 April 2026
National Harbor, Maryland, US
Conference 14037 > Paper 14037-60

Frequency-aware open-vocabulary object detection for thermal imaging

On demand | Presented live 28 April 2026

Abstract

Open-vocabulary object detection (OVD) is a promising approach for scaling thermal imaging applications in real-world environments such as HVAC monitoring and industrial inspection. However, applying visible-light OVD foundation models to thermal imagery is non-trivial because of the modality gap, which includes distinct radiometric distributions and characteristic spatial-frequency content. While prompt-based adaptation can preserve the open-vocabulary capability of frozen detectors, existing visual prompting approaches typically do not explicitly account for frequency-specific properties of thermal images. In this work, we propose frequency-aware prompt tuning for thermal-domain OVD. Our method decomposes a thermal image into low- and high-frequency components and uses a compact frequency-aware U-Net to generate an additive input-space prompt, improving structural perception while maintaining radiometric consistency. Experiments on the FLIR-IR dataset with YOLO-World show consistent gains over zero-shot inference and over a strong modality-prompting baseline, particularly for small and low-contrast objects.
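
As a rough illustration of the two ideas named in the abstract (a low-/high-frequency split and an additive input-space prompt), the following minimal Python sketch may help. It is not the authors' implementation: the abstract does not specify the decomposition or the network, so the Gaussian low-pass filter, the function names (split_frequencies, additive_prompt, apply_prompt), and the parameters sigma, gain, and max_delta are all assumptions made for illustration. In particular, additive_prompt is a fixed, hand-written stand-in for the learned frequency-aware U-Net.

```python
import numpy as np


def split_frequencies(image, sigma=0.05):
    """Split a 2-D image into low- and high-frequency parts.

    Assumption: a Gaussian low-pass filter applied in the Fourier domain.
    sigma is the filter width in cycles per pixel (illustrative value).
    """
    h, w = image.shape
    fy = np.fft.fftfreq(h)[:, None]  # vertical frequencies, cycles/pixel
    fx = np.fft.fftfreq(w)[None, :]  # horizontal frequencies, cycles/pixel
    lowpass = np.exp(-(fx**2 + fy**2) / (2.0 * sigma**2))
    low = np.fft.ifft2(np.fft.fft2(image) * lowpass).real
    high = image - low  # residual, so low + high reconstructs the image
    return low, high


def additive_prompt(high, gain=0.1, max_delta=0.05):
    """Hypothetical stand-in for the learned prompt generator.

    Scales the high-frequency residual and clips it, so the prompt
    emphasizes edges while changing each pixel by at most max_delta.
    """
    return np.clip(gain * high, -max_delta, max_delta)


def apply_prompt(image, sigma=0.05, gain=0.1, max_delta=0.05):
    """Add the prompt to the input; the detector itself stays untouched."""
    _, high = split_frequencies(image, sigma)
    return image + additive_prompt(high, gain, max_delta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    thermal = rng.random((64, 80))  # synthetic stand-in for a thermal frame
    prompted = apply_prompt(thermal)
    print(prompted.shape, float(np.max(np.abs(prompted - thermal))))
```

Because the Gaussian filter passes the zero-frequency term unchanged, the high-frequency residual has zero mean, and the clipped prompt bounds the per-pixel change. That is one simple way to read "maintaining radiometric consistency" in this toy setting; the actual method may enforce it differently.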

Presenter

Jia Qu
Mitsubishi Electric Corp. (Japan)
Head researcher at Mitsubishi Electric Corporation, Japan, focusing on computer vision and thermal imaging applications.
Application tracks: AI/ML
Presenter/Author
Jia Qu
Mitsubishi Electric Corp. (Japan)
Author
Shotaro Miwa
Mitsubishi Electric Corp. (Japan)