model: add ModernVBERT models #4337
paultltc wants to merge 7 commits into embeddings-benchmark:main from
@paultltc, I can run inference on ViDoRe v1-3 once the PR is merged.
        torch_dtype=torch.float32,
        trust_remote_code=True,
    ),
    name="ModernVBERT/bimodernvbert",
Is it the same as, or similar to, ModernVBERT/modernvbert-embed?
BiModernVBERT is the document-specialized model; *-embed is the generalist one. If the naming is too confusing, we can add only ColModernVBERT in this PR.
You mean both ColModernVBERT and BiModernVBERT?
ModernVBERT/bimodernvbert and ModernVBERT/modernvbert-embed
if "torch_dtype" in kwargs:
    self.mdl.to(kwargs["torch_dtype"])
Shouldn't this happen in super?
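One way to read this suggestion, as a minimal sketch: the base wrapper applies `torch_dtype` once in its constructor, so subclasses do not repeat the cast. `BaseWrapper`, `BiWrapper`, and `FakeModel` are hypothetical stand-ins rather than the classes in this PR, and the dtype is a plain string so the sketch runs without torch.

```python
class FakeModel:
    """Hypothetical stand-in for a torch module; records the dtype it was cast to."""

    def __init__(self):
        self.dtype = "float32"

    def to(self, dtype):
        self.dtype = dtype
        return self


class BaseWrapper:
    """Hypothetical base class: handles kwargs shared by every subclass."""

    def __init__(self, mdl, **kwargs):
        self.mdl = mdl
        # Cast once here so subclasses do not each re-implement it.
        if "torch_dtype" in kwargs:
            self.mdl.to(kwargs["torch_dtype"])


class BiWrapper(BaseWrapper):
    """Hypothetical subclass: no dtype handling of its own."""

    def __init__(self, **kwargs):
        super().__init__(FakeModel(), **kwargs)


wrapper = BiWrapper(torch_dtype="bfloat16")
print(wrapper.mdl.dtype)
```

Whether the real base class can take over the cast depends on when it creates `self.mdl`, which this sketch does not model.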
@paultltc can I ask you to finish the checklist and also run the models on a sample task?
Sure, I'm working on it this morning and it should be done.
# Generalist model, trained for single-vector embeddings.
# Should be specialized for specific tasks for best performance.
modernvbert_embed = ModelMeta(
    loader=BiModernVBertWrapper,
If this is for single-vector embeddings, should it use the BiModernVBertWrapper? Maybe change it to the Sentence Transformers loader?
# Document specific model, trained for single-vector retrieval
bimodernvbert = ModelMeta(
    loader=BiModernVBertWrapper,
Should this loader be changed to Sentence Transformers, since this model is for dense retrieval?
It could, yes, but I haven't tried it. How would it work in mteb? The default loader would use ST, right?
Default loader would use ST right?
Yes
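The fallback confirmed above can be pictured with a small sketch. This is an illustration only, not mteb's code: `Meta`, `default_st_loader`, and `custom_wrapper` are hypothetical stand-ins, and the loaders return strings instead of models so the example runs without any dependencies.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def default_st_loader(name: str) -> str:
    """Hypothetical stand-in for a Sentence Transformers based loader."""
    return f"SentenceTransformer({name})"


def custom_wrapper(name: str) -> str:
    """Hypothetical stand-in for a custom wrapper such as BiModernVBertWrapper."""
    return f"CustomWrapper({name})"


@dataclass
class Meta:
    """Hypothetical, simplified model metadata (not mteb's ModelMeta)."""

    name: str
    loader: Optional[Callable[[str], str]] = None

    def load_model(self) -> str:
        # Fall back to the default loader when no custom loader is set.
        loader = self.loader or default_st_loader
        return loader(self.name)


print(Meta("ModernVBERT/bimodernvbert").load_model())
print(Meta("ModernVBERT/bimodernvbert", loader=custom_wrapper).load_model())
```

Under this reading, dropping the custom loader would only be safe if the checkpoint loads correctly through Sentence Transformers, which nobody in this thread has tried yet.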
This pull request has been automatically marked as stale due to inactivity.
Close #3245
mteb.get_model(model_name, revision) and mteb.get_model_meta(model_name, revision)
@QuentinJGMace @ManuelFay @mlconti1
In progress!
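A minimal sketch of the load check named in the checklist, assuming mteb is installed and the model is already registered (for these models, that means a checkout that includes this PR). `check_model_loads` is a hypothetical helper name; the two mteb calls are the ones the checklist mentions.

```python
def check_model_loads(model_name, revision=None):
    """Load metadata and model through mteb; raises if either step fails."""
    import mteb  # imported lazily so defining the helper does not need mteb

    meta = mteb.get_model_meta(model_name, revision)
    model = mteb.get_model(model_name, revision)
    return meta, model


# Example call (downloads weights, so it is left commented out):
# meta, model = check_model_loads("ModernVBERT/bimodernvbert")
# print(meta.name)
```

Passing no revision is an assumption made for brevity; pinning the revision recorded in the model's metadata makes the check reproducible.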