Papers (6)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
arXiv:1810.04805 · NAACL 2019
POST: REPRODUCED · PRE: 41%
- Big Bird: Transformers for Longer Sequences
arXiv:2007.14062 · NeurIPS 2020
POST: PARTIAL · PRE: pending
- An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
arXiv:2010.11929 · ICLR 2021
POST: REPRODUCED · PRE: 41%
- Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
arXiv:1910.10683 · JMLR 2020
POST: REPRODUCED · PRE: pending
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
arXiv:1909.11942 · ICLR 2020
POST: REPRODUCED · PRE: pending
- PaLM 2 Technical Report
arXiv:2305.10403 · arXiv preprint
POST: PENDING · PRE: 41%