Attention Is All You Need
https://arxiv.org/abs/1706.03762
(NIPS 2017)
End-to-End Object Detection with Transformers
https://arxiv.org/abs/2005.12872
(ECCV 2020)
Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion
https://arxiv.org/abs/2202.10304
(2022)
SVTR: Scene Text Recognition with a Single Visual Model
https://arxiv.org/abs/2205.00159
(IJCAI 2022)