Mixedbread

Data Models

Understanding the core data structures in Mixedbread Stores helps you work effectively with the API and see how your content is organized and retrieved.

Store

A Store is the primary container for your searchable content. It holds your files, manages access permissions, and provides the foundation for semantic search operations.

Store Properties

| Property | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the Store |
| name | string | User-defined name that serves as an identifier |
| description | string | Optional description of the Store's purpose |
| is_public | boolean | Whether the Store is publicly accessible |
| metadata | object | Additional metadata associated with the Store |
| file_counts | object | Counts of files in different processing states |
| expires_after | object | Expiration configuration based on activity |
| status | enum | Current status: expired, in_progress, completed |
| created_at | string | ISO timestamp when the Store was created |
| updated_at | string | ISO timestamp when the Store was last updated |
| last_active_at | string | ISO timestamp of the last activity |
| usage_bytes | integer | Total storage space used by indexed content |
| expires_at | string | Computed expiration timestamp (if expires_after is set) |
| object | string | Always "store" |

File Counts Object

The file_counts object provides a detailed breakdown of file processing states:

| Property | Type | Description |
| --- | --- | --- |
| pending | integer | Number of files waiting to be processed |
| in_progress | integer | Number of files currently being processed |
| cancelled | integer | Number of files whose processing was cancelled |
| completed | integer | Number of successfully processed files |
| failed | integer | Number of files that failed processing |
| total | integer | Total number of files |
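
As a rough illustration of how these counters might be used, the sketch below derives a progress summary from a file_counts object. The helper name summarize_file_counts is hypothetical. The consistency check assumes that total equals the sum of the five state counters, and the "finished" figure assumes that completed, failed, and cancelled are terminal states. Both assumptions match the example on this page but are not stated guarantees.

```python
def summarize_file_counts(counts: dict) -> dict:
    """Derive a simple progress summary from a Store's file_counts object."""
    states = ("pending", "in_progress", "cancelled", "completed", "failed")
    accounted = sum(counts.get(state, 0) for state in states)
    total = counts.get("total", accounted)
    # Assumption: completed, failed, and cancelled files need no further processing.
    finished = sum(counts.get(s, 0) for s in ("completed", "failed", "cancelled"))
    completed = counts.get("completed", 0)
    return {
        "total": total,
        "finished": finished,
        "remaining": total - finished,
        # Share of all files that were processed successfully.
        "percent_complete": round(100 * completed / total, 1) if total else 0.0,
        # Assumption: total should equal the sum of the per-state counters.
        "consistent": accounted == total,
    }


example = {"pending": 2, "in_progress": 1, "cancelled": 0,
           "completed": 10, "failed": 0, "total": 13}
summary = summarize_file_counts(example)
print(summary)
```

A summary like this can drive a progress indicator while files are still being ingested.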

For detailed configuration options including expiration policies and public access, see .

Store Example

{
  "id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "name": "product-documentation",
  "description": "Complete product documentation and API reference",
  "is_public": false,
  "metadata": {
    "category": "documentation",
    "language": "en"
  },
  "file_counts": {
    "pending": 2,
    "in_progress": 1,
    "cancelled": 0,
    "completed": 10,
    "failed": 0,
    "total": 13
  },
  "expires_after": {
    "anchor": "last_active_at",
    "days": 30
  },
  "status": "in_progress",
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-20T14:30:00Z",
  "last_active_at": "2024-01-20T14:30:00Z",
  "usage_bytes": 1048576,
  "expires_at": "2024-02-19T14:30:00Z",
  "object": "store"
}
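
In the example above, expires_at falls exactly 30 days after last_active_at. The sketch below reproduces that arithmetic on the client side. It assumes that expires_at is the anchor field's timestamp plus the configured number of days, which is consistent with the example but is an inference rather than documented behavior. The function name compute_expires_at is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def compute_expires_at(store: dict) -> Optional[str]:
    """Return the expected expiration timestamp, or None if no policy is set."""
    policy = store.get("expires_after")
    if not policy:
        return None
    # The anchor names the Store field that holds the reference timestamp.
    anchor_value = store[policy["anchor"]]
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it on Python versions before 3.11.
    anchor_time = datetime.fromisoformat(anchor_value.replace("Z", "+00:00"))
    # Assumption: expiration = anchor timestamp + configured number of days.
    expires = (anchor_time + timedelta(days=policy["days"])).astimezone(timezone.utc)
    return expires.strftime("%Y-%m-%dT%H:%M:%SZ")


store = {
    "last_active_at": "2024-01-20T14:30:00Z",
    "expires_after": {"anchor": "last_active_at", "days": 30},
}
print(compute_expires_at(store))  # 2024-02-19T14:30:00Z
```

The expires_at value returned by the API should be treated as authoritative. A local calculation like this is only useful for previews or sanity checks.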

Store File

A Store File represents a complete file that you've uploaded to a Store. It tracks the file's processing status, metadata, and relationship to the searchable chunks created from its content.

File Properties

| Property | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier for the file within the Store |
| filename | string | Original name of the uploaded file |
| metadata | object | Custom key-value pairs you've attached to the file |
| status | enum | Current processing status of the file |
| last_error | object | Details about any processing errors that occurred |
| store_id | string | ID of the Store containing this file |
| created_at | string | ISO timestamp when the file was added to the Store |
| version | integer | Version number of the file within the Store |
| usage_bytes | integer | Storage space used by the file's indexed data |
| object | string | Always "store.file" |

For detailed information on file processing lifecycle and status meanings, see .

For guidance on metadata structure and types, see .

File Example

{
  "id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "filename": "product-documentation.pdf",
  "metadata": {
    "category": "documentation",
    "department": "product",
    "version": "2.1",
    "last_updated": "2024-01-15"
  },
  "status": "completed",
  "last_error": null,
  "store_id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "created_at": "2024-01-15T10:30:00Z",
  "version": 1,
  "usage_bytes": 245760,
  "object": "store.file"
}
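
When consuming Store File objects in application code, a small typed wrapper can make the fields easier to work with. The dataclass below is a minimal sketch that mirrors the property table above. The class name StoreFile, the from_dict helper, and the is_completed property are hypothetical and not part of any official SDK.

```python
import json
from dataclasses import dataclass, field, fields
from typing import Optional


@dataclass
class StoreFile:
    """Typed view of a Store File, mirroring the documented properties."""
    id: str
    filename: str
    status: str
    store_id: str
    created_at: str
    version: int
    usage_bytes: int
    metadata: dict = field(default_factory=dict)
    last_error: Optional[dict] = None
    object: str = "store.file"

    @classmethod
    def from_dict(cls, data: dict) -> "StoreFile":
        # Keep only known fields so unexpected keys in a payload do not raise.
        known = {f.name for f in fields(cls)}
        return cls(**{key: value for key, value in data.items() if key in known})

    @property
    def is_completed(self) -> bool:
        # "completed" is the status value shown in the example on this page.
        return self.status == "completed"


raw = """{
  "id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "filename": "product-documentation.pdf",
  "metadata": {"category": "documentation", "version": "2.1"},
  "status": "completed",
  "last_error": null,
  "store_id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "created_at": "2024-01-15T10:30:00Z",
  "version": 1,
  "usage_bytes": 245760,
  "object": "store.file"
}"""
store_file = StoreFile.from_dict(json.loads(raw))
print(store_file.filename, store_file.is_completed)
```

Note that the top-level version field (an integer) is distinct from any version key you place in metadata, which here is a user-supplied string.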

Store Chunk

A Store Chunk represents a searchable segment of content created from a Store File. When you search, you get back chunks that contain the most relevant portions of your files.

Chunk Properties

| Property | Type | Description |
| --- | --- | --- |
| chunk_index | integer | Position of this chunk within the source file |
| mime_type | string | Content type of the chunk (text/plain, image/png, etc.) |
| model | string | Model used to generate the chunk's vector |
| score | number | Relevance score for this chunk (in search results) |
| file_id | string | ID of the file this chunk came from |
| filename | string | Name of the source file |
| store_id | string | ID of the Store containing this chunk |
| external_id | string | Optional external identifier for the source file |
| metadata | object | User-defined metadata inherited from the source file |
| generated_metadata | object | Ingestion-time structured metadata, e.g. chunk size |
| type | enum | Type of content: text, image_url, audio_url, video_url |

Content-Specific Properties

Text Chunks

| Property | Type | Description |
| --- | --- | --- |
| text | string | Text content of the chunk |
| offset | integer | Character offset of this chunk relative to the start of the file |

Image Chunks

| Property | Type | Description |
| --- | --- | --- |
| image_url | object | Image URL and format information |
| ocr_text | string | Text extracted from images via OCR |
| summary | string | AI-generated summary of the image content † |

Audio Chunks

| Property | Type | Description |
| --- | --- | --- |
| audio_url | object | Audio URL and format information |
| transcription | string | Speech-to-text transcription of the audio † |
| sampling_rate | integer | Audio sampling rate in Hz |

Video Chunks

| Property | Type | Description |
| --- | --- | --- |
| video_url | object | Video URL and format information |
| transcription | string | Speech-to-text transcription of the video † |

† The summary and transcription fields are only populated when the file was ingested with the high_quality .

Chunk Types

Text Chunks

{
  "type": "text",
  "text": "User authentication in our API requires a valid API key...",
  "chunk_index": 2,
  "offset": 1024,
  "mime_type": "text/plain",
  "score": 0.89
}

Image Chunks

{
  "type": "image_url",
  "image_url": {
    "url": "https://signed-url-to-image.com/chunk_img_123",
    "format": "png"
  },
  "ocr_text": "Figure 1: Authentication Flow Diagram",
  "summary": "A diagram showing the authentication flow process",
  "chunk_index": 5,
  "mime_type": "image/png",
  "score": 0.76
}

Audio Chunks

{
  "type": "audio_url",
  "audio_url": {
    "url": "https://signed-url-to-audio.com/chunk_audio_456"
  },
  "transcription": "Welcome to our product overview. In this section, we'll cover...",
  "sampling_rate": 44100,
  "chunk_index": 3,
  "mime_type": "audio/mpeg",
  "score": 0.82
}

Video Chunks

{
  "type": "video_url",
  "video_url": {
    "url": "https://signed-url-to-video.com/chunk_video_789"
  },
  "transcription": "Hello everyone, today we're going to demonstrate...",
  "chunk_index": 1,
  "mime_type": "video/mp4",
  "score": 0.88
}
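
Because each chunk type carries its text in a different field, client code usually needs a small dispatch on type. The sketch below shows one possible approach. The function name chunk_text is hypothetical, and preferring summary over ocr_text for image chunks is an arbitrary design choice. The fallbacks account for summary and transcription being absent when they were not generated at ingestion time.

```python
from typing import Optional


def chunk_text(chunk: dict) -> Optional[str]:
    """Return a textual representation of a chunk, or None if there is none."""
    kind = chunk.get("type")
    if kind == "text":
        return chunk.get("text")
    if kind == "image_url":
        # summary may be missing; fall back to the OCR text.
        return chunk.get("summary") or chunk.get("ocr_text")
    if kind in ("audio_url", "video_url"):
        # transcription may be missing; in that case None is returned.
        return chunk.get("transcription")
    return None


chunks = [
    {"type": "text", "text": "User authentication in our API requires a valid API key..."},
    {"type": "image_url", "ocr_text": "Figure 1: Authentication Flow Diagram"},
    {"type": "audio_url", "audio_url": {"url": "https://signed-url-to-audio.com/chunk_audio_456"}},
]
for chunk in chunks:
    print(chunk["type"], "->", chunk_text(chunk))
```

Returning None for chunks without text lets callers decide whether to skip them or to present the media URL instead.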

Complete Chunk Example

{
  "chunk_index": 3,
  "mime_type": "text/plain",
  "model": "mixedbread-ai/mxbai-omni-v1",
  "score": 0.92,
  "file_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "filename": "product-documentation.pdf",
  "store_id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "external_id": "doc-auth-guide-v2",
  "metadata": {
    "category": "documentation",
    "department": "product"
  },
  "type": "text",
  "text": "To authenticate API requests, include your API key in the Authorization header: Authorization: Bearer YOUR_API_KEY. The API key identifies your account and provides access to your organization's resources.",
  "offset": 4096
}
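
Since every chunk carries file_id, chunk_index, and score, search results can be regrouped per source file on the client. The following is a minimal sketch, assuming that a higher score means a more relevant chunk. The function name group_chunks_by_file and the sample data are hypothetical.

```python
from collections import defaultdict


def group_chunks_by_file(chunks: list) -> list:
    """Group chunks by file_id; order files by best score, chunks by position."""
    by_file = defaultdict(list)
    for chunk in chunks:
        by_file[chunk["file_id"]].append(chunk)

    groups = []
    for file_id, items in by_file.items():
        # Restore reading order within each file.
        items.sort(key=lambda c: c["chunk_index"])
        groups.append({
            "file_id": file_id,
            "filename": items[0].get("filename"),
            # Assumption: a higher score indicates higher relevance.
            "best_score": max(c.get("score", 0.0) for c in items),
            "chunks": items,
        })
    groups.sort(key=lambda g: g["best_score"], reverse=True)
    return groups


results = [
    {"file_id": "a", "filename": "guide.pdf", "chunk_index": 3, "score": 0.92},
    {"file_id": "b", "filename": "faq.md", "chunk_index": 0, "score": 0.95},
    {"file_id": "a", "filename": "guide.pdf", "chunk_index": 1, "score": 0.71},
]
for group in group_chunks_by_file(results):
    print(group["filename"], [c["chunk_index"] for c in group["chunks"]])
```

Grouping this way makes it easier to show one result per document while still exposing the matching passages in their original order.
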
Last updated: April 7, 2026