Spatio-Temporal Learnable Proposals for End-to-End Video Object Detection
Khurram Azeem Hashmi, Didier Stricker, Muhammad Zeshan Afzal
33rd British Machine Vision Conference (BMVC-2022), British Machine Vision Association, England, 11/2022.
- Abstract:
- This paper presents the novel idea of generating object proposals by leveraging temporal information for video object detection. Feature aggregation in modern region-based video object detectors relies heavily on learned proposals generated by a single-frame RPN. This inevitably introduces additional components such as NMS and produces unreliable proposals on low-quality frames. To address these limitations, we present SparseVOD, a novel video object detection pipeline that employs Sparse R-CNN to exploit temporal information. In particular, we introduce two modules in the dynamic head of Sparse R-CNN. First, the Temporal Feature Extraction module, based on the Temporal RoI Align operation, is added to extract the RoI proposal features. Second, motivated by sequence-level semantic aggregation, we incorporate the attention-guided Semantic Proposal Feature Aggregation module to enhance the object feature representation before detection. The proposed SparseVOD removes the overhead of complicated post-processing methods and makes the overall pipeline end-to-end trainable. Extensive experiments show that our method improves on the single-frame Sparse R-CNN by 8%-9% in mAP. Furthermore, besides achieving a state-of-the-art 80.3% mAP on the ImageNet VID dataset with a ResNet-50 backbone, our SparseVOD outperforms existing proposal-based methods by a significant margin at higher IoU thresholds (IoU > 0.5).
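The idea of attention-guided aggregation of proposal features across frames can be illustrated with a minimal sketch. This is not the SparseVOD implementation: the function name `aggregate_proposals`, the use of cosine similarity followed by a softmax, and the plain-list feature representation are all assumptions chosen for illustration. Each proposal feature from the target frame is replaced by a similarity-weighted average of proposal features from support frames.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_proposals(target_feats, support_feats):
    """Hypothetical helper: for each target proposal feature, return the
    attention-weighted average of the support proposal features.

    Attention weights are a softmax over cosine similarities, which is an
    assumption made for this sketch, not a detail taken from the paper.
    """
    support_norm = [l2_normalize(s) for s in support_feats]
    aggregated = []
    for t in target_feats:
        t_norm = l2_normalize(t)
        # Cosine similarity between the target proposal and every support proposal.
        sims = [sum(a * b for a, b in zip(t_norm, s)) for s in support_norm]
        weights = softmax(sims)
        # Weighted sum of the (unnormalized) support features, per dimension.
        agg = [
            sum(w * s[d] for w, s in zip(weights, support_feats))
            for d in range(len(t))
        ]
        aggregated.append(agg)
    return aggregated


if __name__ == "__main__":
    target = [[1.0, 0.0]]                 # one proposal in the target frame
    support = [[1.0, 0.0], [0.0, 1.0]]    # two proposals from support frames
    print(aggregate_proposals(target, support))
```

In this toy setup the support proposal that resembles the target receives the larger weight, so the aggregated feature stays close to it while still mixing in context from the other frame.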