Search
Publication Authors

Prof. Dr. Didier Stricker

Dr. Alain Pagani

Dr. Gerd Reis

Eric Thil

Keonna Cunningham

Dr. Oliver Wasenmüller

Dr. Muhammad Zeshan Afzal

Dr. Gabriele Bleser

Dr. Muhammad Jameel Nawaz Malik
Dr. Bruno Mirbach

Dr. Jason Raphael Rambach

Dr. Nadia Robertini

Dr. René Schuster

Dr. Bertram Taetz

Ahmed Aboukhadra

Sk Aziz Ali

Mhd Rashed Al Koutayni

Yuriy Anisimov

Jilliam Maria Diaz Barros

Ramy Battrawy
Hammad Butt

Mahdi Chamseddine

Steve Dias da Cruz
Fangwen Shu

Torben Fetzer

Ahmet Firintepe
Sophie Folawiyo

David Michael Fürst
Anshu Garg

Christiano Couto Gava
Suresh Guttikonda

Tewodros Amberbir Habtegebrial
Simon Häring

Khurram Azeem Hashmi
Henri Hoyez
Alireza Javanmardi

Jigyasa Singh Katrolia

Andreas Kölsch
Ganesh Shrinivas Koparde
Onorina Kovalenko

Stephan Krauß
Paul Lesur

Michael Lorenz

Dr. Markus Miezal

Mina Ameli

Nareg Minaskan Karabid

Mohammad Minouei

Pramod Murthy

Mathias Musahl

Peter Neigel

Manthan Pancholi
Mariia Podguzova

Praveen Nathan
Qinzhuan Qian
Rishav
Marcel Rogge
María Alejandra Sánchez Marín
Dr. Kripasindhu Sarkar

Alexander Schäfer

Pascal Schneider

Mohamed Selim

Tahira Shehzadi
Lukas Stefan Staecker

Yongzhi Su

Xiaoying Tan
Christian Witte

Yaxu Xie

Vemburaj Yadav

Dr. Vladislav Golyanik

Dr. Aditya Tewari

André Luiz Brandão
Publication Archive
New title
- ActivityPlus
- AlterEgo
- AR-Handbook
- ARVIDA
- Auroras
- AVILUSplus
- Be-greifen
- Body Analyzer
- CAPTURE
- COGNITO
- DAKARA
- Density
- DYNAMICS
- EASY-IMP
- Eyes Of Things
- iACT
- IMCVO
- IVMT
- LARA
- LiSA
- Marmorbild
- Micro-Dress
- Odysseus Studio
- On Eye
- OrcaM
- PAMAP
- PROWILAN
- ServiceFactory
- STREET3D
- SUDPLAN
- SwarmTrack
- TuBUs-Pro
- VIDETE
- VIDP
- VisIMon
- VISTRA
- You in 3D
Rethinking Learnable Proposals for Graphical Object Detection in Scanned Document Images
Rethinking Learnable Proposals for Graphical Object Detection in Scanned Document Images
Sankalp Sinha, Khurram Azeem Hashmi, Alain Pagani, Marcus Liwicki, Didier Stricker, Muhammad Zeshan Afzal
Applied Sciences (MDPI) 12 20 Seiten 1-22 MDPI Switzerland 10/2022 .
- Abstract:
- In the age of deep learning, researchers have looked at domain adaptation under the pre-training and fine-tuning paradigm to leverage the gains in the natural image domain. These backbones and subsequent networks are designed for object detection in the natural image domain. They do not consider some of the critical characteristics of document images. Document images are sparse in contextual information, and the graphical page objects are logically clustered. This paper investigates the effectiveness of deep and robust backbones in the document image domain. Further, it explores the idea of learnable object proposals through Sparse R-CNN. This paper shows that simple domain adaptation of top-performing object detectors to the document image domain does not lead to better results. Furthermore, empirically shows that detectors based on dense object priors like Faster R-CNN, Mask R-CNN, and Cascade Mask R-CNN are perhaps not best suited for graphical page object detection. Detectors that reduce the number of object candidates while making them learnable are a step towards a better approach. We formulate and evaluate the Sparse R-CNN (SR-CNN) model on the IIIT-AR-13k, PubLayNet, and DocBank datasets and hope to inspire a rethinking of object proposals in the domain of graphical page object detection.