Abstract: Although various video saliency models have achieved considerable performance gains, existing deep learning-based audio-visual saliency prediction models are still in the early exploration stage. The major challenge is that there ...
Authors | Dandan Zhu, Kun Zhu, Weiping Ding, Nana Zhang, Xiongkuo Min, Guangtao Zhai, Xiaokang Yang |
---|---|
Author affiliations | |
Pages / Total pages | 1756-1771 / 16 |
Language / Chinese Library Classification number | English / TM0 |
Keywords | Training, Predictive models, Feature extraction, Visualization, Task analysis, Object detection, Transformers |
DOI | 10.1109/TETCI.2024.3358184 |
Holdings number | IELEP0446 |