Search

Search your Store using natural language to find exactly what you need. Stores understand the meaning behind your queries, not just keywords, which makes them well suited to conversational search and complex questions.

Search for chunks of content across your Store:

Search Store
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

response = mxbai.stores.search(
    query="How does authentication work?",
    store_identifiers=["my-knowledge-base"],
    top_k=5,
)

for chunk in response.data:
    print(chunk)

When you search, Stores understand your natural language query and find the most relevant content across all your files. Results are automatically ranked by relevance with confidence scores.

For complete details on chunk object structure including all content types and properties, see Data Models.
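
Because results come back with relevance scores, a common follow-up step is to drop low-scoring chunks before handing them to downstream code. The sketch below assumes each chunk exposes a numeric `score` attribute (check Data Models for the actual field name). The helper `filter_by_score` and the 0.5 threshold are hypothetical, and stand-in objects are used so the example runs without an API key.

```python
from types import SimpleNamespace


def filter_by_score(chunks, min_score=0.5):
    """Keep chunks whose score is at least min_score, preserving order.

    Assumption: each chunk has a numeric `score` attribute. Chunks without
    one are dropped.
    """
    kept = []
    for chunk in chunks:
        score = getattr(chunk, "score", None)
        if score is not None and score >= min_score:
            kept.append(chunk)
    return kept


# Stand-in objects; in practice you would pass response.data from
# mxbai.stores.search(...).
sample = [
    SimpleNamespace(text="OAuth flow", score=0.91),
    SimpleNamespace(text="Changelog", score=0.32),
    SimpleNamespace(text="API keys", score=0.67),
]
relevant = filter_by_score(sample, min_score=0.5)
print([c.text for c in relevant])
```

A suitable threshold depends on your content and queries, so treat 0.5 as a placeholder to tune rather than a recommended value.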

Search Options

Top-k Results

Control the number of results returned:

Top-k Results
results1 = mxbai.stores.search(
  query="API authentication methods",
  store_identifiers=["docs"],
  top_k=5
)

results2 = mxbai.stores.search(
  query="deployment strategies",
  store_identifiers=["docs"],
  top_k=20
)

Optimization Tips:

  • Start with top_k=10 for most use cases
  • Increase for comprehensive searches
  • Decrease for faster response times

Filter

Filter search results to narrow the search scope. Store search supports two types of filtering, by metadata and by file:

Metadata Filtering
results = mxbai.stores.search(
  query="user authentication",
  store_identifiers=["docs"],
  filters={
      "all": [
          {"key": "category", "operator": "eq", "value": "documentation"},
          {"key": "difficulty", "operator": "in", "value": ["beginner", "intermediate"]}
      ]
  }
)

For complete metadata filtering capabilities and advanced patterns, see Metadata Filtering.
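
When filters are assembled from user input, small builder functions can keep the nested dictionaries readable. The sketch below reproduces only the shape shown in the example above (an "all" list of key/operator/value conditions). The helper names `condition` and `all_of` are hypothetical and are not part of the SDK.

```python
def condition(key, operator, value):
    """Build one filter condition in the key/operator/value shape."""
    return {"key": key, "operator": operator, "value": value}


def all_of(*conditions):
    """Combine conditions so that every one of them must match."""
    return {"all": list(conditions)}


filters = all_of(
    condition("category", "eq", "documentation"),
    condition("difficulty", "in", ["beginner", "intermediate"]),
)
print(filters)
```

The resulting dictionary can be passed as the filters argument of mxbai.stores.search. Operators other than "eq" and "in" are covered in Metadata Filtering and are not assumed here.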

File Filtering
# Pass a plain list of file IDs
results1 = mxbai.stores.search(
  query="authentication methods",
  store_identifiers=["docs"],
  file_ids=["123e4567-e89b-12d3-a456-426614174000", "123e4567-e89b-12d3-a456-426614174001"]
)

# Explicit "in" operator: search only the listed files
results2 = mxbai.stores.search(
  query="authentication methods",
  store_identifiers=["docs"],
  file_ids=("in", ["123e4567-e89b-12d3-a456-426614174000"])
)

# "not_in" operator: exclude the listed files
results3 = mxbai.stores.search(
  query="authentication methods",
  store_identifiers=["docs"],
  file_ids=("not_in", ["123e4567-e89b-12d3-a456-426614174000"])
)
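
If your application switches between including and excluding files, a small wrapper can produce the operator form shown above. This is a minimal sketch: `file_ids_filter` is a hypothetical helper, and it only builds the ("in", [...]) and ("not_in", [...]) tuples that appear in the examples.

```python
def file_ids_filter(ids, exclude=False):
    """Return a file_ids argument in the (operator, ids) form.

    exclude=False -> ("in", ids): restrict the search to these files.
    exclude=True  -> ("not_in", ids): skip these files.
    """
    ids = list(ids)
    if not ids:
        raise ValueError("at least one file ID is required")
    return ("not_in" if exclude else "in", ids)


only_this_file = file_ids_filter(["123e4567-e89b-12d3-a456-426614174000"])
all_but_this_file = file_ids_filter(
    ["123e4567-e89b-12d3-a456-426614174000"], exclude=True
)
print(only_this_file)
print(all_but_this_file)
```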

Rerank

Improve search result quality by reranking results with specialized models:

Reranking Examples
results = mxbai.stores.search(
  query="API rate limiting best practices",
  store_identifiers=["docs"],
  search_options={
      "rerank": True
  }
)

Reranking applies a second-stage ranking model to improve relevance. It is especially useful for complex queries or when the initial results need refinement. You can pass a simple boolean (True) or configure advanced options such as model selection and metadata inclusion.

Learn more about Rerank for advanced reranking strategies and model options.

Rewrite Query

Let a specialized AI model rewrite your query to get the best semantic search results. It automatically applies best practices such as adding more context and using full semantic qualifiers instead of keywords.

Rewrite Query
results = mxbai.stores.search(
  query="machine learning deployment",
  store_identifiers=[
      "documentation",
  ],
  top_k=15,
  search_options={
      "rewrite_query": True
  },
)

You can get the rewritten query by calling the /v1/stores/{store_identifier}/events endpoint.

Considerations:

  • The latency increases due to additional agent calls
  • Not every query needs to be rewritten
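
Since rewriting adds latency and not every query benefits from it, one option is to enable it only for short, keyword-style queries. The heuristic below is purely illustrative and is not something the service prescribes: `should_rewrite`, `search_options_for`, and the four-word cutoff are all assumptions made for this sketch.

```python
def should_rewrite(query, min_words=4):
    """Guess whether a query is terse enough to benefit from rewriting.

    Illustrative heuristic: rewrite queries that have fewer than
    `min_words` words and are not already phrased as a question.
    """
    words = query.split()
    return len(words) < min_words and not query.strip().endswith("?")


def search_options_for(query):
    """Build a search_options dict that toggles rewrite_query per query."""
    return {"rewrite_query": should_rewrite(query)}


print(search_options_for("rate limits"))
print(search_options_for("How does authentication work?"))
```

The returned dictionary can be passed as search_options to mxbai.stores.search. Measuring result quality with and without rewriting on your own queries is a more reliable guide than any fixed rule.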

Search across multiple Stores simultaneously:

Multi Store Search
results = mxbai.stores.search(
  query="machine learning deployment",
  store_identifiers=[
      "documentation",
      "research-papers",
      "tutorials",
      "code-examples"
      ],
  top_k=15
)

Considerations:

  • Results are merged and re-ranked together
  • May need higher top_k for diverse results
  • Different Stores may have different metadata schemas
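
Because results from all Stores are merged into one ranked list, it can help to check how many results each Store contributed before deciding whether top_k is high enough. The sketch below assumes each chunk carries a `store_id` attribute identifying its source Store. That field name is an assumption (see Data Models), `count_by_store` is a hypothetical helper, and stand-in objects replace a real response.

```python
from collections import Counter
from types import SimpleNamespace


def count_by_store(chunks, store_attr="store_id"):
    """Count results per Store.

    Assumption: chunks expose the source Store under `store_attr`.
    Chunks missing the attribute are counted under "unknown".
    """
    return dict(Counter(getattr(c, store_attr, "unknown") for c in chunks))


# Stand-in results; in practice you would pass results.data.
sample = [
    SimpleNamespace(store_id="documentation"),
    SimpleNamespace(store_id="documentation"),
    SimpleNamespace(store_id="tutorials"),
]
print(count_by_store(sample))
```

If one Store dominates the counts, raising top_k is one way to surface results from the others.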

For complex questions that need multiple sources, an agent can plan, run, and rank a series of searches for you in a single call:

Agentic Search
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

results = mxbai.stores.search(
    query="What are the yearly numbers for 2020, 2021, 2022, 2023, 2024, 2025?",
    store_identifiers=["yearly-reports"],
    search_options={"agentic": True},
)

print(results)

The response shape matches a normal search. See Agentic Search for the full parameter reference, configuration examples, and how to inspect the agent's tool calls and token usage via the events endpoint.

Additional options let you tune the trade-off between latency, result shape, and visual context:

  • strict_top_k (default false): when false, return only the chunks the agent considers relevant, which may be fewer than top_k. Set to true to always return exactly the top_k most relevant results the agent found.
  • media_content (default "auto"): controls when the agent receives actual image content in addition to text fields:
    • "auto": use parsed text and image summaries when available. If an image result has no parsed text or summary, the agent receives the image input.
    • "never": use text fields only. This is the lowest-latency option.
    • "always": include image inputs for retrieved image chunks when available, even when OCR text or summaries exist. This can improve visual reasoning over images and scanned documents, but adds latency.
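
The options above can be validated before a request is sent, which catches typos such as an unsupported media_content value early. This sketch only checks and collects the two options described in the list. The helper `agentic_options` is hypothetical, and where these options are placed in the request is not shown in this section, so consult Agentic Search before wiring the result into search_options.

```python
ALLOWED_MEDIA_CONTENT = ("auto", "never", "always")


def agentic_options(strict_top_k=False, media_content="auto"):
    """Validate and collect the agentic options described above.

    Defaults mirror the documented ones: strict_top_k=False and
    media_content="auto".
    """
    if media_content not in ALLOWED_MEDIA_CONTENT:
        raise ValueError(
            f"media_content must be one of {ALLOWED_MEDIA_CONTENT}, "
            f"got {media_content!r}"
        )
    return {"strict_top_k": bool(strict_top_k), "media_content": media_content}


# Lowest-latency configuration according to the list above: text fields only.
fast = agentic_options(media_content="never")
print(fast)
```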

Search the web using the same API by including mixedbread/web as a store identifier. You can use it standalone or combine it with your own stores for hybrid search that merges web results with your internal knowledge base.

Web Search
from mixedbread import Mixedbread

mxbai = Mixedbread(api_key="YOUR_API_KEY")

results = mxbai.stores.search(
    query="latest AI research papers",
    store_identifiers=["mixedbread/web", "internal-docs"],
    top_k=10,
)

print(results)

For detailed web search capabilities and response format, see Web Store.
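
To make web search an opt-in feature in your own code, a small helper can add the web identifier to an existing list of Stores. This is a minimal sketch: `with_web` is a hypothetical helper, and the only value taken from this page is the mixedbread/web identifier.

```python
WEB_STORE = "mixedbread/web"


def with_web(store_identifiers):
    """Return a new list that includes the web store exactly once."""
    stores = list(store_identifiers)
    if WEB_STORE not in stores:
        stores.insert(0, WEB_STORE)
    return stores


print(with_web(["internal-docs"]))
print(with_web([]))
```

The returned list can be passed as store_identifiers. An empty input yields a web-only search, which corresponds to the standalone use described above.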

Last updated: May 19, 2026