- mixedbread.ai - blog

NLP is a fast-moving field. We want to share our insights and continue to learn with you, so we write about research, software, and more.

64 bytes per embedding, yee-haw 🤠

Binary MRL combines two popular approaches to deal with the scalability issues of embeddings. It helps our embedding model achieve a 64x gain in efficiency while retaining more than 90% of performance, drastically reducing infrastructure costs and enabling new applications.

April 12, 2024 · 10 min read
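As a rough illustration of where the 64 bytes and the 64x figure can come from, here is a minimal sketch in plain Python. It assumes a 1024-dimensional float32 embedding (4096 bytes) that is first truncated to its first 512 dimensions (the Matryoshka step) and then reduced to one sign bit per dimension (the binary step). The dimensions, the zero threshold, and the helper names `binary_mrl` and `hamming_distance` are assumptions made for this example, not the model's actual pipeline.

```python
import random

FULL_DIM = 1024   # assumption: float32 embedding with 1024 dimensions (4096 bytes)
TRUNC_DIM = 512   # assumption: Matryoshka truncation keeps the first 512 dimensions


def binary_mrl(embedding, dim=TRUNC_DIM):
    """Truncate to the first `dim` values, then keep one sign bit per value.

    `dim` must be a multiple of 8 so the bits pack into whole bytes.
    """
    truncated = embedding[:dim]                       # Matryoshka step
    bits = 0
    for value in truncated:                           # binary quantization step
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits.to_bytes(dim // 8, "big")             # 512 bits -> 64 bytes


def hamming_distance(a, b):
    """Count the differing bits between two packed embeddings."""
    xor = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(xor).count("1")


# Random values stand in for a real model output.
random.seed(0)
doc = [random.gauss(0.0, 1.0) for _ in range(FULL_DIM)]
packed = binary_mrl(doc)

full_bytes = FULL_DIM * 4                   # float32 uses 4 bytes per dimension
packed_bytes = len(packed)                  # 64
compression_ratio = full_bytes // packed_bytes  # 64
```

Packed embeddings like these are typically compared with Hamming distance, which is why the sketch includes it. How much retrieval quality survives depends on the model and is not something this toy example measures.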

ColBERTus Maximus - Introducing mxbai-colbert-large-v1

mxbai-colbert-large-v1 is a ColBERT model for reranking and retrieval tasks. It is based on the mxbai-embed-large-v1 model and achieves state-of-the-art performance on 13 publicly available BEIR benchmarks. It's available on Hugging Face.

March 19, 2024 · 8 min read

Open Source Strikes Bread - New Fluffy Embedding Model

Our English embedding model delivers state-of-the-art performance among efficiently sized models. It outperforms closed-source models like OpenAI's text-embedding-v3.

March 8, 2024 · 8 min read

Fresh 2D-Matryoshka Embedding Model

The 2D-🪆 model introduces a novel approach that enables you to reduce both the number of layers and the dimensions of embeddings within the model. This dual reduction strategy allows for a more compact model size while still delivering competitive performance compared to leading models, such as Nomic's embedding model. Specifically, reducing the model's layers by approximately 50% retains up to 85% of its original performance, even without additional training.

March 4, 2024 · 11 min read

Boost Your Search With The Crispy mixedbread Rerank Models

Upgrade your search results with our new, open-source reranking models from mixedbread. Available in three sizes, they make it easier to find relevant results by adding a semantic layer to existing search systems. They're simple to use, work with your current setup, and are proven to boost performance with many traditional and semantic search models. Check them out for a more accurate, efficient search experience.

February 29, 2024 · 14 min read
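To make "adding a semantic layer to existing search systems" a little more concrete, here is a small sketch of the two-stage pattern: an existing search returns candidates, and a reranker reorders them by a relevance score. Everything here is hypothetical. `rerank` and `toy_score` are names invented for this example, and `toy_score` is a crude token-overlap stand-in for a real reranking model, not the mixedbread API.

```python
def toy_score(query, doc):
    """Hypothetical stand-in for a reranker: fraction of query tokens found in doc."""
    query_tokens = set(query.lower().split())
    doc_tokens = set(doc.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & doc_tokens) / len(query_tokens)


def rerank(query, candidates, score_fn, top_k=3):
    """Score every first-stage candidate against the query and keep the best top_k."""
    scored = [(score_fn(query, doc), doc) for doc in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest score first
    return [doc for _, doc in scored[:top_k]]


# Candidates as they might come back from an existing keyword search.
first_stage = [
    "how to bake sourdough bread",
    "bread prices in 2024",
    "sourdough starter feeding schedule",
]
results = rerank("sourdough starter", first_stage, toy_score, top_k=2)
```

In a real setup, `score_fn` would call a reranking model that scores each query and document pair, while the first-stage search stays unchanged.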