cross-encoder/ms-marco-TinyBERT-L-2-v2


Indexed: 2025-05-29

Cross-Encoder for MS Marco

This model was trained on the MS Marco Passage Ranking task.
The model can be used for information retrieval: given a query, encode the query together with each candidate passage (e.g. passages retrieved with ElasticSearch) to obtain a relevance score per pair, then sort the passages by score in decreasing order. See SBERT.net Retrieve & Re-rank for more details. The training code is available here: SBERT.net Training MS Marco
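
The retrieve-and-re-rank step described above can be sketched roughly as follows. This is a minimal illustration, not the SBERT.net implementation: `rerank`, `score_fn`, and `word_overlap_scorer` are hypothetical names introduced here, and the word-overlap scorer is only a stand-in so the example runs without downloading a model.

```python
from typing import Callable, List, Sequence, Tuple


def rerank(
    query: str,
    passages: Sequence[str],
    score_fn: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and return the top_k passages, best first."""
    # Pair the same query with each candidate passage.
    pairs = [(query, passage) for passage in passages]
    scores = score_fn(pairs)
    # Higher score means more relevant, so sort in decreasing order.
    ranked = sorted(zip(passages, scores), key=lambda item: item[1], reverse=True)
    return [(passage, float(score)) for passage, score in ranked[:top_k]]


def word_overlap_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Hypothetical stand-in scorer: counts words shared by query and passage."""
    return [
        float(len(set(q.lower().split()) & set(p.lower().split())))
        for q, p in pairs
    ]


if __name__ == "__main__":
    candidates = [
        "New York City is famous for the Metropolitan Museum of Art.",
        "Berlin has a population of 3,520,031 registered inhabitants.",
    ]
    for passage, score in rerank("How many people live in Berlin", candidates, word_overlap_scorer):
        print(score, passage)
```

With SentenceTransformers installed, the `predict` method of a `CrossEncoder` could presumably be passed as `score_fn` in place of the stand-in, since it accepts a list of (query, passage) pairs and returns one score per pair.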

Usage with Transformers

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = 'cross-encoder/ms-marco-TinyBERT-L-2-v2'
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# The query is repeated once per candidate passage, so each (query, passage) pair is scored jointly.
queries = ['How many people live in Berlin?', 'How many people live in Berlin?']
passages = ['Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.',
            'New York City is famous for the Metropolitan Museum of Art.']
features = tokenizer(queries, passages, padding=True, truncation=True, return_tensors="pt")

model.eval()
with torch.no_grad():
    # One relevance logit per pair; a higher value means a more relevant passage.
    scores = model(**features).logits
print(scores)

Usage with SentenceTransformers

The usage becomes easier when you have SentenceTransformers installed. Then, you can use the pre-trained models like this:
from sentence_transformers import CrossEncoder
model = CrossEncoder('cross-encoder/ms-marco-TinyBERT-L-2-v2', max_length=512)
scores = model.predict([('Query', 'Paragraph1'), ('Query', 'Paragraph2'), ('Query', 'Paragraph3')])

Performance

In the following table, we provide various pre-trained Cross-Encoders together with their performance on the TREC Deep Learning 2019 and the MS Marco Passage Reranking dataset.

| Model-Name | NDCG@10 (TREC DL 19) | MRR@10 (MS Marco Dev) | Docs / Sec |
| --- | --- | --- | --- |
| Version 2 models | | | |
| cross-encoder/ms-marco-TinyBERT-L-2-v2 | 69.84 | 32.56 | 9000 |
| cross-encoder/ms-marco-MiniLM-L-2-v2 | 71.01 | 34.85 | 4100 |
| cross-encoder/ms-marco-MiniLM-L-4-v2 | 73.04 | 37.70 | 2500 |
| cross-encoder/ms-marco-MiniLM-L-6-v2 | 74.30 | 39.01 | 1800 |
| cross-encoder/ms-marco-MiniLM-L-12-v2 | 74.31 | 39.02 | 960 |
| Version 1 models | | | |
| cross-encoder/ms-marco-TinyBERT-L-2 | 67.43 | 30.15 | 9000 |
| cross-encoder/ms-marco-TinyBERT-L-4 | 68.09 | 34.50 | 2900 |
| cross-encoder/ms-marco-TinyBERT-L-6 | 69.57 | 36.13 | 680 |
| cross-encoder/ms-marco-electra-base | 71.99 | 36.41 | 340 |
| Other models | | | |
| nboost/pt-tinybert-msmarco | 63.63 | 28.80 | 2900 |
| nboost/pt-bert-base-uncased-msmarco | 70.94 | 34.75 | 340 |
| nboost/pt-bert-large-msmarco | 73.36 | 36.48 | 100 |
| Capreolus/electra-base-msmarco | 71.23 | 36.89 | 340 |
| amberoad/bert-multilingual-passage-reranking-msmarco | 68.40 | 35.54 | 330 |
| sebastian-hofstaetter/distilbert-cat-margin_mse-T2-msmarco | 72.82 | 37.88 | 720 |
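
For readers unfamiliar with the MRR@10 column, the sketch below shows one common way the metric is computed: for each query, take the reciprocal rank of the first relevant passage within the top 10 results (0 if none appears), then average over queries. The function name `mrr_at_k` and the query and passage IDs are hypothetical, and this is not the official MS Marco evaluation script. It also assumes the table reports the metric scaled by 100, so that 32.56 would correspond to roughly 0.3256.

```python
from typing import Dict, List, Set


def mrr_at_k(
    rankings: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    k: int = 10,
) -> float:
    """Mean reciprocal rank of the first relevant passage within the top k results."""
    if not rankings:
        return 0.0
    total = 0.0
    for query_id, ranked_ids in rankings.items():
        gold = relevant.get(query_id, set())
        # Only the first relevant hit within the top k contributes to the score.
        for rank, passage_id in enumerate(ranked_ids[:k], start=1):
            if passage_id in gold:
                total += 1.0 / rank
                break
    return total / len(rankings)


if __name__ == "__main__":
    # Hypothetical rankings: q1 hits at rank 2, q2 at rank 1, q3 has no hit.
    rankings = {"q1": ["p3", "p1"], "q2": ["p9", "p8", "p7"], "q3": ["p0"]}
    relevant = {"q1": {"p1"}, "q2": {"p9"}, "q3": {"p5"}}
    print(mrr_at_k(rankings, relevant))
```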
