Data Models
Understanding the core data structures in Mixedbread Stores helps you work effectively with the API and see how your content is organized and retrieved.
Store
A Store is the primary container for your searchable content. It holds your files, manages access permissions, and provides the foundation for semantic search operations.
Store Properties
| Property | Type | Description |
|---|---|---|
| id | string | Unique identifier for the Store |
| name | string | User-defined name that serves as an identifier |
| description | string | Optional description of the Store's purpose |
| is_public | boolean | Whether the Store is publicly accessible |
| metadata | object | Additional metadata associated with the Store |
| file_counts | object | Counts of files in different processing states |
| expires_after | object | Expiration configuration based on activity |
| status | enum | Current status: expired, in_progress, completed |
| created_at | string | ISO timestamp when the Store was created |
| updated_at | string | ISO timestamp when the Store was last updated |
| last_active_at | string | ISO timestamp of the last activity |
| usage_bytes | integer | Total storage space used by indexed content |
| expires_at | string | Computed expiration timestamp (if expires_after is set) |
| object | string | Always "store" |
File Counts Object
The file_counts object provides a detailed breakdown of file processing states:
| Property | Type | Description |
|---|---|---|
| pending | integer | Number of files waiting to be processed |
| in_progress | integer | Number of files currently being processed |
| cancelled | integer | Number of files whose processing was cancelled |
| completed | integer | Number of successfully processed files |
| failed | integer | Number of files that failed processing |
| total | integer | Total number of files |
For detailed configuration options including expiration policies and public access, see Store Configuration.
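The relationship between file_counts, expires_after, and expires_at can be checked with a short sketch. It assumes the Store has already been fetched and parsed into a Python dict, that total is the sum of the five state counts, and that expires_at equals the anchor timestamp plus the configured number of days. The helper names file_counts_consistent and expected_expiry are hypothetical and not part of any Mixedbread SDK.

```python
from datetime import datetime, timedelta

# Processing states listed in the File Counts Object table.
STATES = ("pending", "in_progress", "cancelled", "completed", "failed")


def file_counts_consistent(file_counts):
    """Return True when the per-state counts add up to `total` (assumed invariant)."""
    return sum(file_counts[state] for state in STATES) == file_counts["total"]


def expected_expiry(store):
    """Return anchor timestamp + days, or None when no expiration policy is set."""
    policy = store.get("expires_after")
    if not policy:
        return None
    # The anchor names another Store field, for example "last_active_at".
    raw = store[policy["anchor"]]
    # Older Python versions reject a trailing "Z", so normalize it to an offset.
    anchor = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    return anchor + timedelta(days=policy["days"])


# Illustrative Store dict holding only the fields this sketch reads.
store = {
    "file_counts": {
        "pending": 2, "in_progress": 1, "cancelled": 0,
        "completed": 10, "failed": 0, "total": 13,
    },
    "expires_after": {"anchor": "last_active_at", "days": 30},
    "last_active_at": "2024-01-20T14:30:00Z",
}

print(file_counts_consistent(store["file_counts"]))
print(expected_expiry(store).isoformat())
```

Comparing the computed value with the expires_at returned by the API is a cheap sanity check, but the server-side value should be treated as authoritative.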
Store Example
{
  "id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "name": "product-documentation",
  "description": "Complete product documentation and API reference",
  "is_public": false,
  "metadata": {
    "category": "documentation",
    "language": "en"
  },
  "file_counts": {
    "pending": 2,
    "in_progress": 1,
    "cancelled": 0,
    "completed": 10,
    "failed": 0,
    "total": 13
  },
  "expires_after": {
    "anchor": "last_active_at",
    "days": 30
  },
  "status": "in_progress",
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-20T14:30:00Z",
  "last_active_at": "2024-01-20T14:30:00Z",
  "usage_bytes": 1048576,
  "expires_at": "2024-02-19T14:30:00Z",
  "object": "store"
}
Store File
A Store File represents a complete file that you've uploaded to a Store. It tracks the file's processing status, metadata, and relationship to the searchable chunks created from its content.
File Properties
| Property | Type | Description |
|---|---|---|
| id | string | Unique identifier for the file within the Store |
| filename | string | Original name of the uploaded file |
| metadata | object | Custom key-value pairs you've attached to the file |
| status | enum | Current processing status of the file |
| last_error | object | Details about any processing errors that occurred |
| store_id | string | ID of the Store containing this file |
| created_at | string | ISO timestamp when the file was added to the Store |
| version | integer | Version number of the file within the Store |
| usage_bytes | integer | Storage space used by the file's indexed data |
| object | string | Always "store.file" |
For detailed information on file processing lifecycle and status meanings, see Store File Status.
For guidance on metadata structure and types, see Metadata Types.
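As an illustration of how the status and last_error fields might be used together, the sketch below tallies a list of Store File dicts by status and collects the failed ones. It assumes file statuses use the same five values as the file_counts keys, and that the files have already been retrieved and parsed. The helpers count_by_status and failed_files are hypothetical names, and the shape of last_error shown here is a placeholder, since its fields are not specified in this section.

```python
from collections import Counter

# Assumed status values, mirroring the keys of the Store's file_counts object.
STATES = ("pending", "in_progress", "cancelled", "completed", "failed")


def count_by_status(files):
    """Build a file_counts-style dict from a list of Store File dicts."""
    seen = Counter(f["status"] for f in files)
    counts = {state: seen.get(state, 0) for state in STATES}
    counts["total"] = len(files)
    return counts


def failed_files(files):
    """Return (filename, last_error) pairs for files whose processing failed."""
    return [(f["filename"], f["last_error"]) for f in files if f["status"] == "failed"]


files = [
    {"filename": "product-documentation.pdf", "status": "completed", "last_error": None},
    {"filename": "release-notes.md", "status": "pending", "last_error": None},
    # Placeholder error object: the real last_error fields may differ.
    {"filename": "scan.tiff", "status": "failed", "last_error": {"message": "placeholder"}},
]

print(count_by_status(files))
print(failed_files(files))
```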
File Example
{
  "id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "filename": "product-documentation.pdf",
  "metadata": {
    "category": "documentation",
    "department": "product",
    "version": "2.1",
    "last_updated": "2024-01-15"
  },
  "status": "completed",
  "last_error": null,
  "store_id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "created_at": "2024-01-15T10:30:00Z",
  "version": 1,
  "usage_bytes": 245760,
  "object": "store.file"
}
Store Chunk
A Store Chunk represents a searchable segment of content created from a Store File. When you search, you get back chunks that contain the most relevant portions of your files.
Chunk Properties
| Property | Type | Description |
|---|---|---|
| chunk_index | integer | Position of this chunk within the source file |
| mime_type | string | Content type of the chunk (text/plain, image/png, etc.) |
| model | string | Model used to generate the chunk's vector |
| score | number | Relevance score for this chunk (in search results) |
| file_id | string | ID of the file this chunk came from |
| filename | string | Name of the source file |
| store_id | string | ID of the Store containing this chunk |
| metadata | object | Metadata inherited from the source file |
| type | enum | Type of content: text, image_url, audio_url, video_url |
Content-Specific Properties
Text Chunks
| Property | Type | Description |
|---|---|---|
| text | string | Text content of the chunk |
Image Chunks
| Property | Type | Description |
|---|---|---|
| image_url | object | Image URL and format information |
| ocr_text | string | Text extracted from images via OCR |
| summary | string | AI-generated summary of the image content |
Audio Chunks
| Property | Type | Description |
|---|---|---|
| audio_url | object | Audio URL and format information |
| transcription | string | Speech-to-text transcription of the audio |
| summary | string | AI-generated summary of the audio content |
Video Chunks
| Property | Type | Description |
|---|---|---|
| video_url | object | Video URL and format information |
| transcription | string | Speech-to-text transcription of the video |
| summary | string | AI-generated summary of the video content |
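
Because the content fields depend on the chunk's type, client code usually has to branch on it. The following is a minimal sketch of one way to do that, assuming chunks arrive as parsed dicts with the fields from the tables above. The functions chunk_text and chunk_url are hypothetical helpers, and the fallback order (OCR text or transcription first, then summary) is a design choice made for this example, not documented API behavior.

```python
def chunk_text(chunk):
    """Return the most direct textual representation of a chunk."""
    kind = chunk["type"]
    if kind == "text":
        return chunk["text"]
    if kind == "image_url":
        # Prefer text read from the image, then fall back to the summary.
        return chunk.get("ocr_text") or chunk.get("summary") or ""
    if kind in ("audio_url", "video_url"):
        # Prefer the spoken content, then fall back to the summary.
        return chunk.get("transcription") or chunk.get("summary") or ""
    raise ValueError(f"unknown chunk type: {kind!r}")


def chunk_url(chunk):
    """Return the media URL for non-text chunks, or None for text chunks."""
    if chunk["type"] == "text":
        return None
    # For media chunks the type name doubles as the key of the URL object.
    return chunk[chunk["type"]]["url"]


image_chunk = {
    "type": "image_url",
    "image_url": {"url": "https://signed-url-to-image.com/chunk_img_123", "format": "png"},
    "ocr_text": "Figure 1: Authentication Flow Diagram",
    "summary": "A diagram showing the authentication flow process",
}

print(chunk_text(image_chunk))
print(chunk_url(image_chunk))
```

Raising on an unknown type keeps the sketch strict. A client that needs to tolerate new chunk types could return an empty string instead.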
Chunk Types
Text Chunks
{
  "type": "text",
  "text": "User authentication in our API requires a valid API key...",
  "chunk_index": 2,
  "mime_type": "text/plain",
  "score": 0.89
}
Image Chunks
{
  "type": "image_url",
  "image_url": {
    "url": "https://signed-url-to-image.com/chunk_img_123",
    "format": "png"
  },
  "ocr_text": "Figure 1: Authentication Flow Diagram",
  "summary": "A diagram showing the authentication flow process",
  "chunk_index": 5,
  "mime_type": "image/png",
  "score": 0.76
}
Audio Chunks
{
  "type": "audio_url",
  "audio_url": {
    "url": "https://signed-url-to-audio.com/chunk_audio_456"
  },
  "transcription": "Welcome to our product overview. In this section, we'll cover...",
  "summary": "Product overview introduction discussing key features",
  "chunk_index": 3,
  "mime_type": "audio/mpeg",
  "score": 0.82
}
Video Chunks
{
  "type": "video_url",
  "video_url": {
    "url": "https://signed-url-to-video.com/chunk_video_789"
  },
  "transcription": "Hello everyone, today we're going to demonstrate...",
  "summary": "Product demonstration video showing core functionality",
  "chunk_index": 1,
  "mime_type": "video/mp4",
  "score": 0.88
}
Complete Chunk Example
{
  "chunk_index": 3,
  "mime_type": "text/plain",
  "model": "mixedbread-ai/mxbai-omni-v1",
  "score": 0.92,
  "file_id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "filename": "product-documentation.pdf",
  "store_id": "c3d4e5f6-a7b8-9012-cdef-345678901234",
  "metadata": {
    "category": "documentation",
    "department": "product"
  },
  "type": "text",
  "text": "To authenticate API requests, include your API key in the Authorization header: Authorization: Bearer YOUR_API_KEY. The API key identifies your account and provides access to your organization's resources."
}
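Since every chunk carries the file_id of its source file, search results can be collapsed to one entry per file. The sketch below shows one possible approach, assuming the results are a list of parsed chunk dicts and that a higher score means a more relevant chunk. The function best_chunk_per_file is a hypothetical helper, and the file IDs and scores in the sample data are made up for illustration.

```python
def best_chunk_per_file(chunks):
    """Keep the highest-scoring chunk for each file, ordered by score descending."""
    best = {}
    for chunk in chunks:
        current = best.get(chunk["file_id"])
        if current is None or chunk["score"] > current["score"]:
            best[chunk["file_id"]] = chunk
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)


# Made-up results: two chunks from one file and one chunk from another.
results = [
    {"file_id": "file-a", "chunk_index": 0, "score": 0.5},
    {"file_id": "file-a", "chunk_index": 3, "score": 0.9},
    {"file_id": "file-b", "chunk_index": 1, "score": 0.7},
]

for chunk in best_chunk_per_file(results):
    print(chunk["file_id"], chunk["chunk_index"], chunk["score"])
```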