Parsing Jobs
API reference for managing document parsing jobs in Mixedbread. Covers creating, retrieving, listing, deleting, and canceling parsing jobs.
Create Parsing Job
POST /v1/parsing/jobs
Starts a new asynchronous job to parse a specified file. The job will process the file based on the provided parameters.
- file_id
file_id
- Type
- string
- Required or Optional
- required
- Description
The ID of the file (previously uploaded via the Files API) to be parsed.
- element_types
element_types
- Type
- string[]
- Required or Optional
- optional
- Description
Specifies which types of elements to extract from the document. If omitted, default element types may be used.
- Options: caption, footnote, formula, list-item, page-footer, page-header, picture, section-header, table, text, title
- chunking_strategy
chunking_strategy
- Type
- string
- Required or Optional
- optional
- Description
The strategy used for chunking the document content. 'page' is currently the only documented option; a default may apply if omitted.
- Options: page
- return_format
return_format
- Type
- string
- Required or Optional
- optional
- Description
The desired format for the extracted content within the job result. Defaults may apply.
- Options: html, markdown, plain
Request Body
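The request parameters above can be assembled as in the following minimal sketch. It only builds the request and does not send it. The base URL, the Bearer-token header, and the helper name build_create_job_request are assumptions for illustration and are not taken from this reference; the option lists are copied from the parameter descriptions above.

```python
import json
import urllib.request

# Assumptions (not stated in this reference): base URL and Bearer-token auth.
BASE_URL = "https://api.mixedbread.com"

# Option values as listed in the parameter descriptions.
ELEMENT_TYPES = {
    "caption", "footnote", "formula", "list-item", "page-footer",
    "page-header", "picture", "section-header", "table", "text", "title",
}
CHUNKING_STRATEGIES = {"page"}
RETURN_FORMATS = {"html", "markdown", "plain"}


def build_create_job_request(api_key, file_id, element_types=None,
                             chunking_strategy=None, return_format=None):
    """Build (but do not send) a POST /v1/parsing/jobs request."""
    body = {"file_id": file_id}  # the only required field

    if element_types is not None:
        unknown = set(element_types) - ELEMENT_TYPES
        if unknown:
            raise ValueError(f"unknown element types: {sorted(unknown)}")
        body["element_types"] = list(element_types)

    if chunking_strategy is not None:
        if chunking_strategy not in CHUNKING_STRATEGIES:
            raise ValueError(f"unknown chunking strategy: {chunking_strategy}")
        body["chunking_strategy"] = chunking_strategy

    if return_format is not None:
        if return_format not in RETURN_FORMATS:
            raise ValueError(f"unknown return format: {return_format}")
        body["return_format"] = return_format

    return urllib.request.Request(
        f"{BASE_URL}/v1/parsing/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "file_123" is a placeholder for an ID returned by the Files API.
request = build_create_job_request(
    "YOUR_API_KEY", "file_123",
    element_types=["table", "text"], return_format="markdown",
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Optional fields are left out of the body when not supplied, so whatever server-side defaults exist stay in effect.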
Response Body
- id
id
- Type
- string
- Required or Optional
- required
- Description
- Unique identifier for the newly created parsing job.
- file_id
file_id
- Type
- string
- Required or Optional
- required
- Description
- ID of the file being parsed.
- status
status
- Type
- string
- Required or Optional
- required
- Description
- Current status of the job (e.g., pending, in_progress, completed, failed, cancelled).
- Initial status: pending
- error
error
- Type
- object
- Required or Optional
- optional
- Description
- Will contain error details if the job status is 'failed'. Null otherwise.
- result
result
- Type
- object
- Required or Optional
- optional
- Description
Will contain the parsing results once the job status is 'completed'. Null otherwise.
- chunking_strategy
chunking_strategy
- Type
- string
- Required or Optional
- required
- Description
- The chunking strategy used.
- return_format
return_format
- Type
- string
- Required or Optional
- required
- Description
- The format of the content in the result.
- element_types
element_types
- Type
- string[]
- Required or Optional
- required
- Description
- The element types that were extracted.
- chunks
chunks
- Type
- object[]
- Required or Optional
- required
- Description
List of extracted chunks.
- content
content
- Type
- string
- Required or Optional
- required
- Description
- Full content of the chunk.
- content_to_embed
content_to_embed
- Type
- string
- Required or Optional
- required
- Description
- Content suitable for embedding.
- elements
elements
- Type
- object[]
- Required or Optional
- required
- Description
List of elements within this chunk.
- type
type
- Type
- string
- Required or Optional
- required
- Description
- Type of the element.
- confidence
confidence
- Type
- number
- Required or Optional
- required
- Description
- Extraction confidence score.
- bbox
bbox
- Type
- number[]
- Required or Optional
- required
- Description
- Bounding box [x1, y1, x2, y2].
- page
page
- Type
- integer
- Required or Optional
- required
- Description
- Page number.
- content
content
- Type
- string
- Required or Optional
- required
- Description
- Full content of the element.
- summary
summary
- Type
- string
- Required or Optional
- optional
- Description
- Optional summary of the element.
- page_sizes
page_sizes
- Type
- number[][]
- Required or Optional
- optional
- Description
- List of [width, height] tuples for each page.
- started_at
started_at
- Type
- string
- Required or Optional
- optional
- Description
- Timestamp when the job started processing.
- finished_at
finished_at
- Type
- string
- Required or Optional
- optional
- Description
- Timestamp when the job finished processing.
- created_at
created_at
- Type
- string
- Required or Optional
- required
- Description
- Timestamp when the job was created.
- updated_at
updated_at
- Type
- string
- Required or Optional
- optional
- Description
- Timestamp when the job was last updated.
- object
object
- Type
- string
- Required or Optional
- required
- Description
- Always "parsing_job".
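To illustrate how the nested response fields fit together, here is a small sketch that walks a job's result and keeps only elements above a confidence threshold. The sample payload is invented for illustration (IDs, timestamps, scores, and coordinates are not real output), page_sizes is left out, and the helper name confident_elements is hypothetical.

```python
# Invented sample shaped like the response body described above.
sample_job = {
    "id": "job_123",
    "file_id": "file_123",
    "status": "completed",
    "error": None,
    "result": {
        "chunking_strategy": "page",
        "return_format": "markdown",
        "element_types": ["title", "text"],
        "chunks": [
            {
                "content": "# Report\n\nRevenue grew.",
                "content_to_embed": "Report. Revenue grew.",
                "elements": [
                    {"type": "title", "confidence": 0.98,
                     "bbox": [10, 10, 200, 40], "page": 1,
                     "content": "# Report", "summary": None},
                    {"type": "text", "confidence": 0.61,
                     "bbox": [10, 60, 400, 90], "page": 1,
                     "content": "Revenue grew.", "summary": None},
                ],
            }
        ],
    },
    "created_at": "2024-01-01T00:00:00Z",
    "object": "parsing_job",
}


def confident_elements(job, min_confidence=0.8):
    """Return elements whose confidence is at least min_confidence.

    Returns an empty list unless the job is completed and has a result,
    since result is null for every other status.
    """
    result = job.get("result")
    if job.get("status") != "completed" or result is None:
        return []
    return [
        element
        for chunk in result["chunks"]
        for element in chunk["elements"]
        if element["confidence"] >= min_confidence
    ]


kept = confident_elements(sample_job)
print([element["type"] for element in kept])
```

Checking status before touching result avoids a null dereference on pending, failed, or cancelled jobs.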
Retrieve Parsing Job
GET /v1/parsing/jobs/{job_id}
Retrieves the current status and details of a specific parsing job, including the result if completed.
- job_id
job_id
- Type
- string
- Required or Optional
- required
- Description
The unique identifier of the parsing job to retrieve.
Path Parameters
Response Body
- id
id
- Type
- string
- Required or Optional
- required
- Description
- Job identifier.
- file_id
file_id
- Type
- string
- Required or Optional
- required
- Description
- Parsed file ID.
- status
status
- Type
- string
- Required or Optional
- required
- Description
- Current status.
- error
error
- Type
- object
- Required or Optional
- optional
- Description
- Error details if failed.
- result
result
- Type
- object
- Required or Optional
- optional
- Description
- Parsing results if completed.
- started_at
started_at
- Type
- string
- Required or Optional
- optional
- Description
- Start timestamp.
- finished_at
finished_at
- Type
- string
- Required or Optional
- optional
- Description
- Finish timestamp.
- created_at
created_at
- Type
- string
- Required or Optional
- required
- Description
- Creation timestamp.
- updated_at
updated_at
- Type
- string
- Required or Optional
- optional
- Description
- Last update timestamp.
- object
object
- Type
- string
- Required or Optional
- required
- Description
- Always "parsing_job".
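Because jobs run asynchronously, a client typically retrieves the job repeatedly until it reaches a final state. The sketch below shows one way to do that. It assumes that completed, failed, and cancelled are the terminal statuses, which this reference implies but does not state. The fetch_job callable is a hypothetical stand-in for whatever performs the GET request and returns the decoded JSON.

```python
import time

# Assumption: these statuses are final and will not change again.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_job, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job must return the decoded response body as a dict.
    Raises TimeoutError if no terminal status is seen within `timeout`
    seconds. `sleep` and `clock` are injectable so the loop can be
    exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']}")
        sleep(interval)


# Usage with a stub in place of a real HTTP call.
_states = iter(["pending", "in_progress", "completed"])


def _stub_fetch(job_id):
    return {"id": job_id, "status": next(_states), "object": "parsing_job"}


final = wait_for_job(_stub_fetch, "job_123", sleep=lambda seconds: None)
print(final["status"])
```

A fixed interval keeps the sketch short; a production client might prefer a backoff that grows between attempts.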
List Parsing Jobs
GET /v1/parsing/jobs
Retrieves a paginated list of parsing jobs associated with your account. The result and error fields are omitted from the list view for brevity.
- limit
limit
- Type
- integer
- Required or Optional
- optional
- Description
Maximum number of jobs to return.
- Default: 20, Max: 100
- offset
offset
- Type
- integer
- Required or Optional
- optional
- Description
Number of jobs to skip.
- Default: 0
Query Parameters
Response Body
- object
object
- Type
- string
- Required or Optional
- required
- Description
- Always "list".
- data
data
- Type
- JobListResponse[]
- Required or Optional
- required
- Description
A list of parsing job objects (excluding result and error).
- id
id
- Type
- string
- Required or Optional
- required
- Description
- Job identifier.
- file_id
file_id
- Type
- string
- Required or Optional
- required
- Description
- Parsed file ID.
- status
status
- Type
- string
- Required or Optional
- required
- Description
- Current status.
- started_at
started_at
- Type
- string
- Required or Optional
- optional
- Description
- Start timestamp.
- finished_at
finished_at
- Type
- string
- Required or Optional
- optional
- Description
- Finish timestamp.
- created_at
created_at
- Type
- string
- Required or Optional
- required
- Description
- Creation timestamp.
- updated_at
updated_at
- Type
- string
- Required or Optional
- optional
- Description
- Last update timestamp.
- object
object
- Type
- string
- Required or Optional
- required
- Description
- Always "parsing_job".
- pagination
pagination
- Type
- object
- Required or Optional
- required
- Description
Pagination information.
- total
total
- Type
- integer
- Required or Optional
- optional
- Description
- Total number of jobs available.
- offset
offset
- Type
- integer
- Required or Optional
- optional
- Description
- The offset used for this page.
Delete Parsing Job
DELETE /v1/parsing/jobs/{job_id}
Deletes a specific parsing job record. This does not delete the original file, nor any parsed results stored elsewhere (e.g., in a vector store).
- job_id
job_id
- Type
- string
- Required or Optional
- required
- Description
The unique identifier of the parsing job to delete.
Path Parameters
Response Body
- id
id
- Type
- string
- Required or Optional
- required
- Description
- The ID of the deleted job.
- deleted
deleted
- Type
- boolean
- Required or Optional
- required
- Description
- Indicates if the deletion was successful.
- object
object
- Type
- string
- Required or Optional
- required
- Description
- Always "parsing_job".
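A delete call and a check of its response might look like the sketch below. As before, the base URL and Bearer-token header are assumptions, and both helper names are hypothetical. The request is built but not sent.

```python
import urllib.parse
import urllib.request

# Assumptions (not stated in this reference): base URL and Bearer-token auth.
BASE_URL = "https://api.mixedbread.com"


def build_delete_job_request(api_key, job_id):
    """Build (but do not send) a DELETE /v1/parsing/jobs/{job_id} request."""
    # Percent-encode the ID so it is safe to place in the URL path.
    path_id = urllib.parse.quote(job_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/parsing/jobs/{path_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )


def confirm_deleted(response, job_id):
    """Return True if the decoded response confirms deletion of job_id."""
    return response.get("id") == job_id and response.get("deleted") is True


request = build_delete_job_request("YOUR_API_KEY", "job_123")
print(request.get_method(), request.full_url)
```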
Cancel Parsing Job
PATCH /v1/parsing/jobs/{job_id}
Attempts to cancel a parsing job that is currently pending or in_progress. If successful, the job status transitions to cancelled.
- job_id
job_id
- Type
- string
- Required or Optional
- required
- Description
The unique identifier of the parsing job to cancel.
Path Parameters
Response Body
- id
id
- Type
- string
- Required or Optional
- required
- Description
- Job identifier.
- file_id
file_id
- Type
- string
- Required or Optional
- required
- Description
- Parsed file ID.
- status
status
- Type
- string
- Required or Optional
- required
- Description
- Current status.
- Should be 'cancelled' if successful
- error
error
- Type
- object
- Required or Optional
- optional
- Description
- Error details if cancellation failed or job already finished.
- result
result
- Type
- object
- Required or Optional
- optional
- Description
- Parsing results if completed before cancel.
- started_at
started_at
- Type
- string
- Required or Optional
- optional
- Description
- Start timestamp.
- finished_at
finished_at
- Type
- string
- Required or Optional
- optional
- Description
- Finish timestamp (set upon cancellation).
- created_at
created_at
- Type
- string
- Required or Optional
- required
- Description
- Creation timestamp.
- updated_at
updated_at
- Type
- string
- Required or Optional
- optional
- Description
- Last update timestamp.
- object
object
- Type
- string
- Required or Optional
- required
- Description
- Always "parsing_job".
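Since cancellation only applies to pending or in_progress jobs, a client can check the last known status before sending the request and then verify the returned status afterwards. A minimal sketch follows, with the base URL, Bearer-token header, and helper names all being assumptions for illustration.

```python
import urllib.parse
import urllib.request

# Assumptions (not stated in this reference): base URL and Bearer-token auth.
BASE_URL = "https://api.mixedbread.com"

# Statuses from which a cancel attempt is described as possible.
CANCELLABLE_STATUSES = {"pending", "in_progress"}


def is_cancellable(job):
    """Return True if the job's last known status allows cancellation."""
    return job.get("status") in CANCELLABLE_STATUSES


def build_cancel_job_request(api_key, job_id):
    """Build (but do not send) a PATCH /v1/parsing/jobs/{job_id} request."""
    path_id = urllib.parse.quote(job_id, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/parsing/jobs/{path_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="PATCH",
    )


def was_cancelled(response):
    """Return True if the decoded response shows the job as cancelled."""
    return response.get("status") == "cancelled"


job = {"id": "job_123", "status": "in_progress"}  # invented example
if is_cancellable(job):
    request = build_cancel_job_request("YOUR_API_KEY", job["id"])
    print(request.get_method(), request.full_url)
```

The status can change between the check and the request, so the response should still be inspected with was_cancelled rather than assumed.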