Papers (4)
- DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
arXiv:2111.09543 · ICLR 2023
POST: REPRODUCED · PRE: pending
- LoRA: Low-Rank Adaptation of Large Language Models
arXiv:2106.09685 · ICLR 2022
POST: REPRODUCED · PRE: pending
- Textbooks Are All You Need II: phi-1.5 technical report
arXiv:2309.05463 · arXiv 2023
POST: PARTIAL · PRE: pending
- Deep Residual Learning for Image Recognition
arXiv:1512.03385 · CVPR 2016
POST: PARTIAL · PRE: pending