model: add ModernVBERT models by paultltc · Pull Request #4337 · embeddings-benchmark/mteb

paultltc · 2026-03-31T13:38:38Z

I have filled out the ModelMeta object to the extent possible
I have ensured that my model can be loaded using
- mteb.get_model(model_name, revision) and
- mteb.get_model_meta(model_name, revision)
I have tested the implementation works on a representative set of tasks.
The model is public, i.e., is available either as an API or the weights are publicly available to download
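The two loading calls in the checklist can be exercised with a small helper along these lines. This is a minimal sketch: `check_model_loads` is a hypothetical name, not part of mteb, and it assumes both functions accept `revision` as a keyword argument.

```python
def check_model_loads(model_name, revision=None):
    """Load the metadata and the model the way the checklist describes."""
    # Imported lazily so the helper can be defined without mteb installed.
    import mteb

    # Metadata lookup: should return the ModelMeta registered for this name.
    meta = mteb.get_model_meta(model_name, revision=revision)
    # Model loading: goes through the loader declared in the ModelMeta.
    model = mteb.get_model(model_name, revision=revision)
    return meta, model
```

Calling `check_model_loads("ModernVBERT/bimodernvbert")` would download the weights, so it is best run on a machine with network access and enough memory.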

In progress!

QuentinJGMace · 2026-03-31T14:00:33Z

@paultltc, I can run inference on ViDoRe v1-3 once the PR is merged.

Samoed · 2026-03-31T14:22:32Z

+        torch_dtype=torch.float32,
+        trust_remote_code=True,
+    ),
+    name="ModernVBERT/bimodernvbert",


Is this the same as, or similar to, ModernVBERT/modernvbert-embed?

BiModernVBERT is the document-specialized model; *-embed is the generalist one. If the naming is too confusing, we can add only ColModernVBERT in this PR.

Can we keep both?

You mean both ColModernVBERT and BiModernVBERT?

ModernVBERT/bimodernvbert and ModernVBERT/modernvbert-embed

KennethEnevoldsen · 2026-04-05T13:55:58Z

+        if "torch_dtype" in kwargs:
+            self.mdl.to(kwargs["torch_dtype"])


Shouldn't this happen in super?
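To illustrate the suggestion, here is a minimal sketch of moving the dtype handling into a base class so that subclasses only forward their kwargs. `FakeModel`, `BaseWrapper`, and `BiWrapper` are hypothetical stand-ins, not the classes in this PR, and the dtype is a plain string so the example runs without torch.

```python
class FakeModel:
    """Stand-in for a torch module; only records the dtype passed to .to()."""

    def __init__(self):
        self.dtype = "float32"

    def to(self, dtype):
        self.dtype = dtype
        return self


class BaseWrapper:
    """Hypothetical base class that owns the dtype handling."""

    def __init__(self, model, **kwargs):
        self.mdl = model
        # Cast once here so subclasses do not repeat the check.
        if "torch_dtype" in kwargs:
            self.mdl.to(kwargs["torch_dtype"])


class BiWrapper(BaseWrapper):
    """Hypothetical subclass: forwards kwargs instead of casting itself."""

    def __init__(self, model, **kwargs):
        super().__init__(model, **kwargs)


wrapper = BiWrapper(FakeModel(), torch_dtype="bfloat16")
print(wrapper.mdl.dtype)  # prints "bfloat16"
```

Whether the real superclass already applies `torch_dtype` is exactly the question raised above; the sketch only shows the shape of the refactor.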

KennethEnevoldsen · 2026-04-05T13:57:36Z

@paultltc can I ask you to finish the checklist and also run the models on a sample task?

paultltc · 2026-04-07T07:39:42Z

> @paultltc can I ask you to finish the checklist and also run the models on a sample task?

Sure, I'm working on it this morning, so it should be done shortly.
@QuentinJGMace will take care of running it on the ViDoRe benchmarks, as I no longer have access to compute!

Samoed · 2026-04-07T08:08:04Z

+# Generalist model, trained for single-vector embeddings.
+# Should be specialized for specific tasks for best performance.
+modernvbert_embed = ModelMeta(
+    loader=BiModernVBertWrapper,


If this is for single-vector embeddings, should it use the BiModernVBertWrapper? Maybe switch to Sentence Transformers?

Samoed · 2026-04-07T08:10:01Z

+
+# Document specific model, trained for single-vector retrieval
+bimodernvbert = ModelMeta(
+    loader=BiModernVBertWrapper,


Should this loader be changed to Sentence Transformers, since this model is for dense retrieval?

It could, yes, but I haven't tried. How would it work in mteb? The default loader would use ST, right?

> The default loader would use ST, right?

Yes
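One way to picture the fallback being discussed is the toy sketch below. It is not mteb's actual code: `Meta`, `load`, and both loader functions are hypothetical names, and the loaders return strings instead of real models.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def sentence_transformers_loader(name: str) -> str:
    # Placeholder: a real loader would build a SentenceTransformer model here.
    return f"ST({name})"


def custom_wrapper_loader(name: str) -> str:
    # Placeholder for a model-specific wrapper such as the one in this PR.
    return f"Custom({name})"


@dataclass
class Meta:
    """Hypothetical, heavily reduced stand-in for a model metadata record."""

    name: str
    loader: Optional[Callable[[str], str]] = None


def load(meta: Meta) -> str:
    # Fall back to the Sentence Transformers loader when none is declared.
    loader = meta.loader or sentence_transformers_loader
    return loader(meta.name)


default_result = load(Meta("ModernVBERT/bimodernvbert"))
custom_result = load(Meta("ModernVBERT/bimodernvbert", loader=custom_wrapper_loader))
print(default_result)  # prints "ST(ModernVBERT/bimodernvbert)"
print(custom_result)   # prints "Custom(ModernVBERT/bimodernvbert)"
```

Under that reading, dropping the custom loader would only work if the checkpoint is loadable by Sentence Transformers, which has not been tried in this thread.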

github-actions · 2026-04-22T02:20:46Z

This pull request has been automatically marked as stale due to inactivity.

add modeling and tests

89f56d2

Samoed reviewed Mar 31, 2026

paultltc added 5 commits March 31, 2026 15:16

update metadata + remove tests

11377d1

update modeling

53b0ea6

fix double augmentation

611c27b

rollback

ee5c197

lint

1dfd707

Samoed added the "new model" label (Questions related to adding a new model to the benchmark) Apr 1, 2026

KennethEnevoldsen marked this pull request as ready for review April 5, 2026 13:54

KennethEnevoldsen changed the title ~~feat: add ModernVBERT models~~ model: add ModernVBERT models Apr 5, 2026

KennethEnevoldsen reviewed Apr 5, 2026

clean code + add metadata for modernvbert-embed

4967a8f

Samoed reviewed Apr 7, 2026

github-actions Bot added the stale label Apr 22, 2026

4 participants
