Mixedbread

Parsing Jobs

API reference for managing document parsing jobs in Mixedbread. Covers creating, retrieving, listing, deleting, and canceling parsing jobs.

Create Parsing Job

POST /v1/parsing/jobs

Starts a new asynchronous job to parse a specified file. The job will process the file based on the provided parameters.

Request Body

  • file_id (string, required)
    The ID of the file (previously uploaded via the Files API) to be parsed.
  • element_types (string[], optional)
    The types of elements to extract from the document. If omitted, default element types may be used.
    • Options: caption, footnote, formula, list-item, page-footer, page-header, picture, section-header, table, text, title
  • chunking_strategy (string, optional)
    The strategy used to chunk the document content. Currently, 'page' is the only documented option. A default may apply if omitted.
    • Options: page
  • return_format (string, optional)
    The format of the extracted content in the job result. A default may apply if omitted.
    • Options: html, markdown, plain
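
The request body above can be assembled and checked on the client before it is sent. The following is a minimal sketch, not an official client: the base URL, the Bearer-token `Authorization` header, and the helper name `build_create_job_request` are assumptions that this page does not confirm. The option sets are copied from the parameter lists above. The request is only built, never sent.

```python
import json
import urllib.request

# Assumption: base URL and Bearer auth are illustrative, not taken from this page.
BASE_URL = "https://api.mixedbread.com"

# Option values as listed in the Request Body section.
ELEMENT_TYPES = {
    "caption", "footnote", "formula", "list-item", "page-footer",
    "page-header", "picture", "section-header", "table", "text", "title",
}
CHUNKING_STRATEGIES = {"page"}
RETURN_FORMATS = {"html", "markdown", "plain"}


def build_create_job_request(api_key, file_id, element_types=None,
                             chunking_strategy=None, return_format=None):
    """Build (but do not send) a POST /v1/parsing/jobs request (hypothetical helper)."""
    if not file_id:
        raise ValueError("file_id is required")
    body = {"file_id": file_id}

    # Optional fields are only included when the caller sets them,
    # so that server-side defaults can apply otherwise.
    if element_types is not None:
        unknown = set(element_types) - ELEMENT_TYPES
        if unknown:
            raise ValueError(f"unknown element types: {sorted(unknown)}")
        body["element_types"] = list(element_types)
    if chunking_strategy is not None:
        if chunking_strategy not in CHUNKING_STRATEGIES:
            raise ValueError(f"unknown chunking strategy: {chunking_strategy}")
        body["chunking_strategy"] = chunking_strategy
    if return_format is not None:
        if return_format not in RETURN_FORMATS:
            raise ValueError(f"unknown return format: {return_format}")
        body["return_format"] = return_format

    return urllib.request.Request(
        f"{BASE_URL}/v1/parsing/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Validating option values locally is a design choice that surfaces typos before a network round trip; the server remains the authority on what is accepted.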

Response Body

  • id (string, required)
    Unique identifier for the newly created parsing job.
  • file_id (string, required)
    ID of the file being parsed.
  • status (string, required)
    Current status of the job (e.g., pending, in_progress, completed, failed, cancelled).
    • Initial status: pending
  • error (object, optional)
    Will contain error details if the job status is 'failed'. Null otherwise.
  • result (object, optional)
    Will contain the parsing results once the job status is 'completed'. Null otherwise.
    • chunking_strategy (string, required)
      The chunking strategy used.
    • return_format (string, required)
      The format of the content in the result.
    • element_types (string[], required)
      The element types that were extracted.
    • chunks (object[], required)
      List of extracted chunks.
      • content (string, required)
        Full content of the chunk.
      • content_to_embed (string, required)
        Content suitable for embedding.
      • elements (object[], required)
        List of elements within this chunk.
        • type (string, required)
          Type of the element.
        • confidence (number, required)
          Extraction confidence score.
        • bbox (number[], required)
          Bounding box [x1, y1, x2, y2].
        • page (integer, required)
          Page number.
        • content (string, required)
          Full content of the element.
        • summary (string, optional)
          Optional summary of the element.
    • page_sizes (number[][], optional)
      List of [width, height] tuples for each page.
  • started_at (string, optional)
    Timestamp when the job started processing.
  • finished_at (string, optional)
    Timestamp when the job finished processing.
  • created_at (string, required)
    Timestamp when the job was created.
  • updated_at (string, optional)
    Timestamp when the job was last updated.
  • object (string, required)
    Always "parsing_job".
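
The nesting of the result object (result, then chunks, then elements) is easier to see when walked in code. The sketch below counts extracted elements per type in a completed job. The helper name `count_elements_by_type` is hypothetical, and `sample_job` is hand-written illustrative data shaped after the fields above, not real API output.

```python
def count_elements_by_type(job):
    """Return {element type: count} for a completed job, or {} if no result yet."""
    result = job.get("result")
    if job.get("status") != "completed" or not result:
        return {}
    counts = {}
    # result -> chunks -> elements, as described in the Response Body fields.
    for chunk in result["chunks"]:
        for element in chunk["elements"]:
            counts[element["type"]] = counts.get(element["type"], 0) + 1
    return counts


# Hand-written sample shaped like the documented response (values are made up).
sample_job = {
    "id": "job_1",
    "file_id": "file_1",
    "status": "completed",
    "error": None,
    "object": "parsing_job",
    "result": {
        "chunking_strategy": "page",
        "return_format": "markdown",
        "element_types": ["title", "text"],
        "page_sizes": [[612.0, 792.0]],
        "chunks": [
            {
                "content": "# Report\n\nFirst paragraph.\n\nSecond paragraph.",
                "content_to_embed": "Report First paragraph. Second paragraph.",
                "elements": [
                    {"type": "title", "confidence": 0.98, "bbox": [72, 72, 540, 100],
                     "page": 1, "content": "Report"},
                    {"type": "text", "confidence": 0.95, "bbox": [72, 120, 540, 200],
                     "page": 1, "content": "First paragraph."},
                    {"type": "text", "confidence": 0.94, "bbox": [72, 210, 540, 290],
                     "page": 1, "content": "Second paragraph."},
                ],
            }
        ],
    },
}

print(count_elements_by_type(sample_job))
```

Because `result` is null until the job completes, the helper checks both the status and the presence of `result` before descending.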

Retrieve Parsing Job

GET /v1/parsing/jobs/{job_id}

Retrieves the current status and details of a specific parsing job, including the result if completed.

Path Parameters

  • job_id (string, required)
    The unique identifier of the parsing job to retrieve.

Response Body

  • id (string, required)
    Job identifier.
  • file_id (string, required)
    Parsed file ID.
  • status (string, required)
    Current status.
  • error (object, optional)
    Error details if failed.
  • result (object, optional)
    Parsing results if completed.
  • started_at (string, optional)
    Start timestamp.
  • finished_at (string, optional)
    Finish timestamp.
  • created_at (string, required)
    Creation timestamp.
  • updated_at (string, optional)
    Last update timestamp.
  • object (string, required)
    Always "parsing_job".
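
Since jobs run asynchronously, a client typically retrieves the job repeatedly until it stops changing. The following is a minimal polling sketch under stated assumptions: `fetch_job` is a hypothetical callable that performs GET /v1/parsing/jobs/{job_id} and returns the decoded JSON, and treating completed, failed, and cancelled as the terminal statuses is an inference from the status values listed on this page.

```python
import time

# Assumption: these three statuses are final; pending and in_progress are not.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def wait_for_job(fetch_job, job_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Poll fetch_job(job_id) until the job reaches a terminal status.

    fetch_job is injected so the loop can be exercised without network access.
    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {job['status']} after {timeout}s")
        sleep(interval)


# Usage with a scripted stand-in for the HTTP call.
_scripted = iter(["pending", "in_progress", "completed"])


def fake_fetch(job_id):
    return {"id": job_id, "status": next(_scripted), "object": "parsing_job"}


final = wait_for_job(fake_fetch, "job_1", interval=0.0)
print(final["status"])
```

A fixed interval keeps the sketch short; a production client might prefer exponential backoff to reduce request volume on long-running jobs.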

List Parsing Jobs

GET /v1/parsing/jobs

Retrieves a paginated list of the parsing jobs associated with your account. The result and error fields are omitted from each job in the list view for brevity.

Query Parameters

  • limit (integer, optional)
    Maximum number of jobs to return.
    • Default: 20, Max: 100
  • offset (integer, optional)
    Number of jobs to skip.
    • Default: 0

Response Body

  • object (string, required)
    Always "list".
  • data (JobListResponse[], required)
    A list of parsing job objects (excluding result and error).
    • id (string, required)
      Job identifier.
    • file_id (string, required)
      Parsed file ID.
    • status (string, required)
      Current status.
    • started_at (string, optional)
      Start timestamp.
    • finished_at (string, optional)
      Finish timestamp.
    • created_at (string, required)
      Creation timestamp.
    • updated_at (string, optional)
      Last update timestamp.
    • object (string, required)
      Always "parsing_job".
  • pagination (object, required)
    Pagination information.
    • total (integer, optional)
      Total number of jobs available.
    • offset (integer, optional)
      The offset used for this page.

Delete Parsing Job

DELETE /v1/parsing/jobs/{job_id}

Deletes a specific parsing job record. This does not delete the original file, nor any parsed results that are stored elsewhere (e.g., in a vector store).

Path Parameters

  • job_id (string, required)
    The unique identifier of the parsing job to delete.

Response Body

  • id (string, required)
    The ID of the deleted job.
  • deleted (boolean, required)
    Indicates if the deletion was successful.
  • object (string, required)
    Always "parsing_job".
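
A delete call returns a small confirmation object rather than the job itself, so it is worth checking the `deleted` flag instead of relying on the HTTP status alone. A minimal sketch follows; the base URL, the Bearer-token header, and both helper names are assumptions, and the request is built without being sent.

```python
import urllib.parse
import urllib.request

# Assumption: base URL and Bearer auth are illustrative, not taken from this page.
BASE_URL = "https://api.mixedbread.com"


def build_delete_job_request(api_key, job_id):
    """Build (but do not send) a DELETE /v1/parsing/jobs/{job_id} request."""
    path_id = urllib.parse.quote(job_id, safe="")  # keep the ID inside one path segment
    return urllib.request.Request(
        f"{BASE_URL}/v1/parsing/jobs/{path_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="DELETE",
    )


def deletion_confirmed(response, job_id):
    """True only if the response names the same job and reports deleted=true."""
    return response.get("id") == job_id and response.get("deleted") is True
```

For example, `deletion_confirmed({"id": "job_1", "deleted": True, "object": "parsing_job"}, "job_1")` evaluates to `True`.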

Cancel Parsing Job

PATCH /v1/parsing/jobs/{job_id}

Attempts to cancel a parsing job that is currently pending or in_progress. If successful, the job status will transition to cancelled.

Path Parameters

  • job_id (string, required)
    The unique identifier of the parsing job to cancel.

Response Body

  • id (string, required)
    Job identifier.
  • file_id (string, required)
    Parsed file ID.
  • status (string, required)
    Current status.
    • Should be 'cancelled' if successful
  • error (object, optional)
    Error details if cancellation failed or job already finished.
  • result (object, optional)
    Parsing results if completed before cancel.
  • started_at (string, optional)
    Start timestamp.
  • finished_at (string, optional)
    Finish timestamp (set upon cancellation).
  • created_at (string, required)
    Creation timestamp.
  • updated_at (string, optional)
    Last update timestamp.
  • object (string, required)
    Always "parsing_job".
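
Because cancellation only applies to jobs that are pending or in_progress, and a job may finish between the check and the call, a client should inspect the returned status rather than assume success. The sketch below illustrates that flow; the base URL, the Bearer-token header, and the helper names are assumptions, and the request is built without being sent.

```python
import urllib.parse
import urllib.request

# Assumption: base URL and Bearer auth are illustrative, not taken from this page.
BASE_URL = "https://api.mixedbread.com"

# Statuses this page describes as cancellable.
CANCELLABLE_STATUSES = {"pending", "in_progress"}


def build_cancel_job_request(api_key, job):
    """Return a PATCH /v1/parsing/jobs/{job_id} request, or None if the job
    is already in a state where cancelling would have no effect."""
    if job["status"] not in CANCELLABLE_STATUSES:
        return None
    path_id = urllib.parse.quote(job["id"], safe="")
    return urllib.request.Request(
        f"{BASE_URL}/v1/parsing/jobs/{path_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="PATCH",
    )


def describe_cancel_outcome(response):
    """Summarize a cancel response based on its status field."""
    status = response.get("status")
    if status == "cancelled":
        return "cancelled"
    if status == "completed":
        return "finished before cancel"  # result may still be present
    return f"not cancelled (status: {status})"
```

Skipping the request for jobs that are already finished is only a client-side optimization; the status in the response is what decides the outcome.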
