# Multipart Upload
Both `upload` and `uploadAndPoll` / `upload_and_poll` automatically use multipart upload for large files. The file is split into chunks and uploaded in parallel, improving reliability and speed. No extra configuration is needed — it just works.
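The chunk-and-upload-in-parallel idea can be sketched in a few lines of plain Python. This is an illustration only; `split_into_parts` and `upload_part` below are hypothetical stand-ins, not SDK internals:

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_parts(data: bytes, part_size: int) -> list[bytes]:
    """Split a payload into fixed-size chunks (the last part may be smaller)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

def upload_part(part: bytes) -> int:
    """Stand-in for a real per-part upload request; returns bytes sent."""
    return len(part)

data = bytes(10)  # pretend 10-byte payload
parts = split_into_parts(data, part_size=4)  # chunks of 4, 4, and 2 bytes

# Upload the parts in parallel, capped at a fixed concurrency.
with ThreadPoolExecutor(max_workers=2) as pool:
    uploaded = list(pool.map(upload_part, parts))

print(uploaded)  # → [4, 4, 2]
```

`pool.map` preserves input order, so the results line up with the parts even though they upload concurrently.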
## Default Behavior
Files over 100MB automatically use multipart upload. The existing `upload` calls work without any changes:
```typescript
const file = await mxbai.stores.files.upload({
  storeIdentifier: "my-knowledge-base",
  file: fs.createReadStream("./large-document.pdf"),
});
```

```python
file = mxbai.stores.files.upload(
    store_identifier="my-knowledge-base",
    file=Path("./large-document.pdf"),
)
```

## Custom Threshold
Lower the threshold to trigger multipart upload for smaller files:
```typescript
const file = await mxbai.stores.files.upload({
  storeIdentifier: "my-knowledge-base",
  file: fs.createReadStream("./document.pdf"),
  multipartUpload: {
    threshold: 50 * 1024 * 1024, // 50MB instead of default 100MB
  },
});
```

```python
from mixedbread.lib.multipart_upload import MultipartUploadOptions

file = mxbai.stores.files.upload(
    store_identifier="my-knowledge-base",
    file=Path("./document.pdf"),
    multipart_upload=MultipartUploadOptions(
        threshold=50 * 1024 * 1024,  # 50MB instead of default 100MB
    ),
)
```

## Custom Concurrency
Control how many parts upload in parallel. The default is 5 concurrent uploads:
```typescript
const file = await mxbai.stores.files.upload({
  storeIdentifier: "my-knowledge-base",
  file: fs.createReadStream("./large-document.pdf"),
  multipartUpload: {
    concurrency: 10,
  },
});
```

```python
from mixedbread.lib.multipart_upload import MultipartUploadOptions

file = mxbai.stores.files.upload(
    store_identifier="my-knowledge-base",
    file=Path("./large-document.pdf"),
    multipart_upload=MultipartUploadOptions(
        concurrency=10,
    ),
)
```

## Custom Part Size
Control the size of each chunk. The default part size is 100MB:
```typescript
const file = await mxbai.stores.files.upload({
  storeIdentifier: "my-knowledge-base",
  file: fs.createReadStream("./large-document.pdf"),
  multipartUpload: {
    partSize: 50 * 1024 * 1024, // 50MB parts
  },
});
```

```python
from mixedbread.lib.multipart_upload import MultipartUploadOptions

file = mxbai.stores.files.upload(
    store_identifier="my-knowledge-base",
    file=Path("./large-document.pdf"),
    multipart_upload=MultipartUploadOptions(
        part_size=50 * 1024 * 1024,  # 50MB parts
    ),
)
```

## Progress Tracking
Get notified after each part finishes uploading. Works with both `upload` and `uploadAndPoll` / `upload_and_poll`:
```python
from pathlib import Path

from mixedbread import Mixedbread
from mixedbread.lib.multipart_upload import MultipartUploadOptions

mxbai = Mixedbread(api_key="YOUR_API_KEY")

def on_part_upload(event):
    pct = round((event.uploaded_bytes / event.total_bytes) * 100)
    print(f"Part {event.part_number}/{event.total_parts} done — {pct}%")

file = mxbai.stores.files.upload_and_poll(
    store_identifier="my-knowledge-base",
    file=Path("./large-document.pdf"),
    multipart_upload=MultipartUploadOptions(
        on_part_upload=on_part_upload,
    ),
)

print(file)
```

The callback receives a `PartUploadEvent` with the following fields:
| Field | Description |
|---|---|
| `partNumber` / `part_number` | 1-based part number that completed |
| `totalParts` / `total_parts` | Total number of parts in this upload |
| `partSize` / `part_size` | Size of this part in bytes |
| `uploadedBytes` / `uploaded_bytes` | Cumulative bytes uploaded so far |
| `totalBytes` / `total_bytes` | Total file size in bytes |
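These fields are enough to drive richer reporting than a bare percentage. As a sketch, the formatter below renders a text progress bar from the cumulative byte counts; the `PartUploadEvent` dataclass here is a local stand-in used to exercise the formatter, not the SDK's own class:

```python
from dataclasses import dataclass

@dataclass
class PartUploadEvent:
    """Local stand-in mirroring the fields in the table above."""
    part_number: int
    total_parts: int
    part_size: int
    uploaded_bytes: int
    total_bytes: int

def render_progress(event: PartUploadEvent, width: int = 20) -> str:
    """Format a text progress bar from the event's byte counts."""
    fraction = event.uploaded_bytes / event.total_bytes
    filled = int(fraction * width)
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] part {event.part_number}/{event.total_parts} ({round(fraction * 100)}%)"

# Halfway through a 100MB upload split into four 25MB parts:
event = PartUploadEvent(2, 4, 25 * 1024 * 1024, 50 * 1024 * 1024, 100 * 1024 * 1024)
print(render_progress(event))  # → [##########----------] part 2/4 (50%)
```

Passing `render_progress`'s output to `print` inside your real callback gives line-by-line progress as parts complete.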
## All Options Together
Combine threshold, part size, concurrency, and progress tracking:
```python
from pathlib import Path

from mixedbread import Mixedbread
from mixedbread.lib.multipart_upload import MultipartUploadOptions

mxbai = Mixedbread(api_key="YOUR_API_KEY")

def on_part_upload(event):
    pct = round((event.uploaded_bytes / event.total_bytes) * 100)
    print(f"Part {event.part_number}/{event.total_parts} — {pct}%")

file = mxbai.stores.files.upload(
    store_identifier="my-knowledge-base",
    file=Path("./large-document.pdf"),
    multipart_upload=MultipartUploadOptions(
        threshold=50 * 1024 * 1024,  # 50MB threshold
        part_size=25 * 1024 * 1024,  # 25MB parts
        concurrency=10,
        on_part_upload=on_part_upload,
    ),
)

print(file)
```

## Options Reference
| Option | Default | Description |
|---|---|---|
| `threshold` | 100MB | Minimum file size to trigger multipart upload |
| `partSize` / `part_size` | 100MB | Size of each upload chunk |
| `concurrency` | 5 | Number of parts uploaded in parallel |
| `onPartUpload` / `on_part_upload` | — | Callback invoked after each part completes |
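How `threshold` and `partSize` / `part_size` interact follows from simple arithmetic. The helper below is illustrative only, not an SDK function, and it assumes the documented behavior (files over the threshold are split into `part_size` chunks; the SDK's exact boundary handling may differ):

```python
import math

def plan_multipart(file_size: int,
                   threshold: int = 100 * 1024 * 1024,
                   part_size: int = 100 * 1024 * 1024) -> int:
    """Estimate how many parts an upload would use under these options.

    Files at or below the threshold go up in a single request; larger
    files are split into ceil(file_size / part_size) parts.
    """
    if file_size <= threshold:
        return 1
    return math.ceil(file_size / part_size)

print(plan_multipart(50 * 1024 * 1024))   # 50MB file, defaults → 1 (no multipart)
print(plan_multipart(1024 * 1024 * 1024)) # 1GB file, defaults → 11 parts
```

Lowering `part_size` raises the part count, which combined with a higher `concurrency` can improve throughput on fast connections at the cost of more requests.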