
mxbai-rerank-large-v2

mxbai-rerank-large-v2 is the flagship model in Mixedbread's second-generation rerank family, delivering state-of-the-art performance across 100+ languages. This reinforcement-learning-enhanced 1.5B-parameter model excels at handling long contexts, complex query reasoning, and specialized use cases from code search to e-commerce, all while maintaining impressive processing speed.

Model description

mxbai-rerank-large-v2 is the flagship model of Mixedbread's second-generation rerank family, a set of state-of-the-art reranking models that are fully open-source under the Apache 2.0 license. This powerful 1.5B-parameter model delivers best-in-class accuracy and robust multilingual capabilities while maintaining impressive processing speed.

The v2 models represent a significant advancement over the first generation, leveraging a sophisticated three-step reinforcement learning process:

  1. GRPO (Guided Reinforcement Prompt Optimization) - Teaching the model to output clear relevance scores
  2. Contrastive Learning - Developing fine-grained understanding of query-document relationships
  3. Preference Learning - Tuning the model to prioritize the most relevant documents

On benchmarks, mxbai-rerank-large-v2 achieves exceptional results, with an NDCG@10 score of 57.49 on the BEIR average and 29.79 on the Mr.TyDi multilingual datasets. Despite its larger parameter count, it processes queries with impressive efficiency (0.89s per query on the NFC dataset with an A100 GPU), up to 8x faster than comparable models.

Parameter Count | Recommended Sequence Length | Languages
1.5B | 8K (32K-compatible) | 100+ languages

Key Advantages

  • State-of-the-art performance - Top results across English, multilingual, Chinese, and code search benchmarks
  • Efficient processing - Up to 8x faster than models of similar or larger size
  • Advanced training methodology - Three-stage reinforcement learning process for superior relevance scoring
  • Broad language support - Enabling global applications across diverse linguistic contexts
  • Versatile applications - Excelling beyond document search in specialized domains

When used in combination with a keyword-based search engine such as Elasticsearch, OpenSearch, or Solr, the rerank model can be added to the end of an existing search workflow. This allows users to incorporate semantic relevance into their keyword-based search system without changing the existing infrastructure - an easy, low-complexity method of improving search results with just one line of code.
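The two-stage workflow described above can be sketched as follows. This is a minimal illustration, not the actual integration code: `keyword_search` stands in for a keyword engine such as Elasticsearch, OpenSearch, or Solr, and `rerank` is a toy stand-in for a call to mxbai-rerank-large-v2; both are hypothetical helpers introduced here for clarity.

```python
# Sketch: appending a rerank stage to an existing keyword-search workflow.
# Both functions below are illustrative stand-ins, not real library calls.

def keyword_search(query: str, corpus: list[str], top_k: int = 5) -> list[str]:
    """Toy keyword retrieval: rank documents by query-term overlap."""
    terms = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:top_k]

def rerank(query: str, candidates: list[str]) -> list[str]:
    """Placeholder for the rerank model: re-order candidates by relevance.
    A real deployment would call mxbai-rerank-large-v2 here instead."""
    terms = set(query.lower().split())
    return sorted(candidates,
                  key=lambda d: (-len(terms & set(d.lower().split())), len(d)))

corpus = [
    "The capital of France is Paris.",
    "France is a country in Europe with many regions.",
    "Paris syndrome affects some tourists visiting the French capital.",
]

candidates = keyword_search("capital of France", corpus)
# The single added line: reranking the keyword engine's candidates.
results = rerank("capital of France", candidates)
print(results[0])
```

The keyword stage narrows the corpus cheaply; the rerank stage then applies semantic relevance only to that short candidate list, which is why it can be bolted onto existing infrastructure without re-indexing.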

Limitations

  • Resource Requirements: While more efficient than other models of similar capability, the model still requires appropriate GPU resources for optimal performance.
  • Sequence Truncation: Inputs exceeding the 32K token limit will be truncated, which may result in information loss. Note that the maximum sequence length applies to the query and document combined: len(query) + len(document) must not exceed 32K tokens.
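The combined-length rule above can be guarded against before reranking. This is a rough sketch under a simplifying assumption: a whitespace split stands in for the model's real tokenizer, so the counts are approximate, and both helper functions are hypothetical.

```python
# Sketch: checking that query + document fit the combined 32K token budget.
# Whitespace tokenization is only an approximation of the real tokenizer.

MAX_TOKENS = 32_000

def fits_limit(query: str, document: str, max_tokens: int = MAX_TOKENS) -> bool:
    """True if query and document together stay within the token budget."""
    return len(query.split()) + len(document.split()) <= max_tokens

def truncate_document(query: str, document: str, max_tokens: int = MAX_TOKENS) -> str:
    """Trim the document so that query + document fit the budget."""
    budget = max_tokens - len(query.split())
    return " ".join(document.split()[:budget])

query = "what is reranking"
doc = "word " * 40_000          # far beyond the budget
trimmed = truncate_document(query, doc)
assert fits_limit(query, trimmed)
```

Truncating explicitly, rather than letting the model cut silently, lets an application decide which end of a long document to keep.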


