Commit 8111d01

docs: update README with vision --file-id support and v0.4.0 phase 5
1 parent ac2f515 commit 8111d01

1 file changed

Lines changed: 17 additions & 2 deletions

README.md

@@ -14,14 +14,23 @@ Generate text, images, video, speech, and music from the terminal. Supports both
 
 ## What's New (v0.4.0, v0.3.0 & v0.2.0)
 
-### v0.4.0 — File Management API
+### v0.4.0 — File Management API + Vision file_id Support
 
 New **`file`** resource group for pre-uploading files to MiniMax storage:
 
 - **`minimax file upload`** — upload a local file, get a `file_id` for reuse in vision/video requests
 - **`minimax file list`** — view all previously uploaded files in a table
 - **`minimax file delete`** — remove a file by its ID
 
+**Vision now accepts `--file-id`** to skip base64 encoding:
+
+```bash
+FILE_ID=$(minimax file upload --file image.png --quiet)
+minimax vision describe --file-id $FILE_ID --prompt "这张图里有几个人?"
+```
+
+This avoids heavy base64 conversion for large images — pass the `file_id` directly to the VLM API.
+
 Note: The MiniMax File API returned HTTP 404 with the current API key. The implementation is correct (endpoint paths, FormData multipart upload, and authentication are all verified). This is an API key permission issue — the code will work once a compatible key or endpoint is confirmed with MiniMax.
 
 ### v0.3.0 — Agent Tool Schema Auto-Generation
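
As an illustration of the `--file-id` path documented in the hunk above, here is a minimal TypeScript sketch of how a request body might be chosen. It is not code from the commit. Only the `{prompt, file_id}` shape and the `toDataUri` name come from the README; the `image_url` field, the `VisionInput` type, the `buildVisionBody` function, and the `image/png` default are assumptions made for the example.

```typescript
// Hypothetical body shapes. `{ prompt, file_id }` is described in the README;
// the `image_url` field name is an assumption for illustration only.
type VisionBody =
  | { prompt: string; file_id: string }
  | { prompt: string; image_url: string };

// Hypothetical parsed CLI input: exactly one of fileId / imageBytes is expected.
interface VisionInput {
  prompt: string;
  fileId?: string;
  imageBytes?: number[];
  mimeType?: string;
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Plain base64 encoder, written out so the sketch has no runtime dependencies.
function base64(bytes: number[]): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64[b0 >> 2];
    out += B64[((b0 & 3) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? B64[((b1 & 15) << 2) | (b2 >> 6)] : "=";
    out += i + 2 < bytes.length ? B64[b2 & 63] : "=";
  }
  return out;
}

// Stand-in for the `toDataUri` helper named in the README; the real
// helper's signature is not shown in the diff.
function toDataUri(bytes: number[], mimeType: string): string {
  return "data:" + mimeType + ";base64," + base64(bytes);
}

// Enforce mutual exclusivity, then pick the cheap file_id path when possible.
function buildVisionBody(input: VisionInput): VisionBody {
  const hasFileId = input.fileId !== undefined;
  const hasImage = input.imageBytes !== undefined;
  if (hasFileId === hasImage) {
    throw new Error("Provide exactly one of --file-id or --image");
  }
  if (input.fileId !== undefined) {
    // No image bytes are read or encoded on this path.
    return { prompt: input.prompt, file_id: input.fileId };
  }
  const mime = input.mimeType !== undefined ? input.mimeType : "image/png";
  return { prompt: input.prompt, image_url: toDataUri(input.imageBytes as number[], mime) };
}
```

The point of the sketch is that the `file_id` branch returns before any bytes are touched, which is where the saving for large images would come from.
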
@@ -411,7 +420,7 @@ bun run build:npm
 
 ## Changelog
 
-### v0.4.0 — File Management API
+### v0.4.0 — File Management API + Vision file_id Support
 
 **Phase 1 — File API Types**
 - Added `FileUploadResponse`, `FileListResponse`, `FileDeleteResponse` types
@@ -430,6 +439,12 @@ bun run build:npm
 - Commands registered and listed in help under the new `file` resource group
 - Interactive fallback (missing `--file` / `--file-id` prompts in TTY, fails fast in CI/agent mode)
 
+**Phase 5 — Vision file_id Support**
+- `vision describe` now accepts `--file-id` as mutually exclusive alternative to `--image`
+- When `--file-id` is provided, sends `{prompt, file_id}` directly to VLM API (skips base64 encoding)
+- When `--image` is provided, falls back to existing base64 `toDataUri` path
+- TTY interactive prompt detects whether input is path/URL or fileId via simple heuristic
+
 Note: MiniMax File API returned HTTP 404 with current API key. Endpoint paths and request handling are verified correct via `--verbose` mode.
 
 ### v0.3.0 — Agent Tool Schema Auto-Generation
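
The Phase 5 entry mentions a "simple heuristic" for telling a path or URL apart from a file ID at the interactive prompt, but the diff does not show it. The TypeScript sketch below is one plausible version, not the commit's implementation. The `ImageSource` type, the `classifyImageInput` name, and the rules themselves are assumptions, as is the premise that a file ID is an opaque token with no scheme, path separator, or file extension.

```typescript
// Hypothetical result type: which flag the typed value should be treated as.
type ImageSource =
  | { kind: "image"; value: string }   // would go down the --image path
  | { kind: "fileId"; value: string }; // would go down the --file-id path

// Assumed heuristic: anything that looks like a URL or a filesystem path is
// an image reference; everything else is treated as a file ID.
function classifyImageInput(raw: string): ImageSource {
  const value = raw.trim();
  if (value === "") {
    throw new Error("Empty input: expected an image path, URL, or file ID");
  }
  // URL-like: an http(s), data, or file scheme prefix.
  const looksLikeUrl = /^(https?:|data:|file:)/i.test(value);
  // Path-like: contains a separator, starts with "." or "~", or ends in a
  // short extension such as ".png".
  const looksLikePath =
    /[\\\/]/.test(value) ||
    /^[.~]/.test(value) ||
    /\.[A-Za-z0-9]{2,5}$/.test(value);
  if (looksLikeUrl || looksLikePath) {
    return { kind: "image", value: value };
  }
  return { kind: "fileId", value: value };
}
```

A heuristic like this will misclassify a file ID that happens to contain a dot or slash, which is one reason to keep the explicit `--file-id` and `--image` flags as the unambiguous route in CI or agent mode.
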
