Papers (2)
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
arXiv:1907.11692 · arXiv preprint
POST REPRODUCED · PRE pending
- BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
arXiv:1910.13461 · ACL 2020
POST PARTIAL · PRE pending