Find a paper
20 results for “BERT”
- Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks · arXiv:1908.10084 · EMNLP 2019 · cs.CL
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding · arXiv:1810.04805 · NAACL 2019 · cs.CL
- A Universal Reproducing Kernel Hilbert Space from Polynomial Alignment and IMQ Distance · arXiv:2605.03262 · cs.LG
- Gaussian mixture models in Hilbert spaces via kernel methods · arXiv:2605.05996 · stat.ML
- Learning Reconstructive Embeddings in Reproducing Kernel Hilbert Spaces via the Representer Theorem · arXiv:2601.05811 · cs.LG
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter · arXiv:1910.01108 · NeurIPS 2019 EMC^2 Workshop · cs.CL
- ALBERT: A Lite BERT for Self-supervised Learning of Language Representations · arXiv:1909.11942 · ICLR 2020 · cs.CL
- Consistent Geometric Deep Learning via Hilbert Bundles and Cellular Sheaves · arXiv:2605.06395 · cs.LG
- RoBERTa: A Robustly Optimized BERT Pretraining Approach · arXiv:1907.11692 · arXiv preprint · cs.CL
- DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing · arXiv:2111.09543 · ICLR 2023 · cs.CL
- A Behavioral Framework for Data-Driven Modeling of Nonlinear Systems in Vector-Valued Reproducing Kernel Hilbert Spaces · arXiv:2605.07052 · eess.SY
- CodeBERT: A Pre-Trained Model for Programming and Natural Languages · arXiv:2002.08155 · EMNLP Findings 2020 · cs.CL
- MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices · arXiv:2004.02984 · ACL 2020 · cs.CL
- DeBERTa: Decoding-enhanced BERT with Disentangled Attention · arXiv:2006.03654 · ICLR 2021 · cs.CL
- ConRetroBert: EMA Stabilized Dual Encoders for Template-Based Single-Step Retrosynthesis · arXiv:2605.12736 · cs.LG
- Stock Market Prediction Using Node Transformer Architecture Integrated with BERT Sentiment Analysis · arXiv:2603.05917 · cs.LG
- Filter-then-Verify: A Multiphase GNN and ModernBERT Framework for Social Engineering Detection in Email Networks · arXiv:2605.17201 · cs.CR
- BERTO: Intent-Driven Network Time Series Forecasting via Natural Language Operator Preferences · arXiv:2512.05721 · cs.LG
- Towards Solving the Gilbert-Pollak Conjecture via Large Language Models · arXiv:2601.22365 · cs.DM
- Complex Stochastic Gradient Descent and Directional Bias in Reproducing Kernel Hilbert Spaces · arXiv:2604.23017 · cs.LG