{"paper":{"arxiv_id":"1908.10084","title":"Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks","abstract":"BERT and RoBERTa have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity. However, both require feeding both sentences into the network, which produces a massive computational overhead. Sentence-BERT is a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings.","primary_category":"cs.CL","venue":"EMNLP 2019","published_at":null,"latest_version":1,"withdrawn":false},"latest_version":{"id":"e3db96d1-1517-4004-aaab-3bdd13036b0b","version":1,"source_url":"https://arxiv.org/abs/1908.10084","rendered_html_url":null,"rendering_engine":null},"verdict":{"id":"4516c3a0-f139-48a2-bcf9-adcf6a6c80b2","kind":"POST","status":"reproduced","score":0.8495652385297289,"confidence":0.8,"agent_version":"v0.2.0-sbert-stsb-test-3slice-table2","computed_at":"2026-05-14T23:48:16.102Z","is_current":true,"claim_citation":{"paper_arxiv_id":"1908.10084","section":"Table 2","row":"SBERT-NLI-STSb-base (trained on NLI + STSb)","column":"STSb test Spearman correlation","reported_value":85.35,"reported_metric":"spearman","quoted_text":"85.35","pdf_page":6,"notes":"Table 2 of arXiv:1908.10084 reports SBERT-NLI-STSb-base STSb test Spearman = 85.35 ± 0.17 (supervised NLI+STSb variant). Driver evaluates `sentence-transformers/bert-base-nli-stsb-mean-tokens` on a 3-slice STSb test micro-slice. PROTOCOL_MATCH is `proxy` (dataset-size). NB: Table 1 (unsupervised NLI-only baseline) is NOT the right comparator for this checkpoint — see `docs/red-team/2026-05-13-summary.md`."},"protocol_match":"proxy"},"verdicts":{"post":{"id":"4516c3a0-f139-48a2-bcf9-adcf6a6c80b2","kind":"POST","status":"reproduced","score":0.8495652385297289,"confidence":0.8,"agent_version":"v0.2.0-sbert-stsb-test-3slice-table2","computed_at":"2026-05-14T23:48:16.102Z","is_current":true,"claim_citation":{"paper_arxiv_id":"1908.10084","section":"Table 2","row":"SBERT-NLI-STSb-base (trained on NLI + STSb)","column":"STSb test Spearman correlation","reported_value":85.35,"reported_metric":"spearman","quoted_text":"85.35","pdf_page":6,"notes":"Table 2 of arXiv:1908.10084 reports SBERT-NLI-STSb-base STSb test Spearman = 85.35 ± 0.17 (supervised NLI+STSb variant). Driver evaluates `sentence-transformers/bert-base-nli-stsb-mean-tokens` on a 3-slice STSb test micro-slice. PROTOCOL_MATCH is `proxy` (dataset-size). NB: Table 1 (unsupervised NLI-only baseline) is NOT the right comparator for this checkpoint — see `docs/red-team/2026-05-13-summary.md`."},"protocol_match":"proxy"},"pre":{"id":"e0ae0051-58af-431c-9c78-9570a60b87e3","kind":"PRE","status":"pending","score":0.4134,"confidence":0.5,"agent_version":"pre-heuristic-v0.1+no-llm","computed_at":"2026-05-06T17:20:48.049Z","is_current":true,"claim_citation":null,"protocol_match":null}}}