A kubectl plugin for deploying and managing AI/ML models using the Kubernetes AI Toolchain Operator (Kaito).
kubectl-kaito simplifies AI model deployment on Kubernetes by providing an intuitive command-line interface that abstracts away complex YAML configurations. Deploy, manage, and interact with large language models and other AI workloads with simple commands.
- One-command deployment Deploy AI models with a single command that automatically provisions GPU nodes and configures the inference stack
- HuggingFace model support Deploy any model from HuggingFace by using its model ID (e.g.,
Qwen/Qwen3-4B-Instruct-2507) - Real-time monitoring Monitor workspace deployment status with real-time conditions, NodeClaim tracking, and detailed health checks
- OpenAI-compatible APIs Interact with deployed models through an OpenAI-compatible chat interface with customizable system prompts
- Model discovery Browse and discover Kaito pre-configured AI models with detailed specifications and GPU requirements
- Seamless endpoint access Access inference endpoints automatically using Kubernetes API proxy - works anywhere kubectl works without manual setup
# List available models or use a HuggingFace model ID
kubectl kaito models list
# Deploy a Kaito preset model for inference
kubectl kaito deploy --workspace-name my-workspace \
--model phi-3.5-mini-instruct \
--instance-type Standard_NC6s_v3
# Or deploy any HuggingFace model
kubectl kaito deploy --workspace-name my-workspace \
--model Qwen/Qwen3-4B-Instruct-2507 \
--model-access-secret hf-token
# Check deployment status
kubectl kaito status --workspace-name my-workspace
# Get inference endpoint
kubectl kaito get-endpoint --workspace-name my-workspace
# Start interactive chat
kubectl kaito chat --workspace-name my-workspace- Kubernetes cluster with GPU nodes
- Kaito operator installed in your cluster
- kubectl configured to access your cluster
Prerequisites: Install krew if you haven't already.
kubectl krew install kaito# Get the script
curl -sO https://raw.githubusercontent.com/kaito-project/kaito-kubectl-plugin/refs/heads/main/hack/generate-krew-manifest.sh
export RELEASE_TAG=v0.1.1
# Generate manifest for a specific version with real SHA256 values
chmod +x ./generate-krew-manifest.sh && ./generate-krew-manifest.sh $RELEASE_TAG
# Install the generated manifest
kubectl krew install --manifest=krew/kaito-$RELEASE_TAG.yamlkubectl kaito --help# Deploy Phi-3.5 Mini for general inference
kubectl kaito deploy \
--workspace-name phi-workspace \
--model phi-3.5-mini-instruct \
--instance-type Standard_NC6s_v3
# Monitor deployment
kubectl kaito status --workspace-name phi-workspace --watch
# Test the deployment
kubectl kaito chat --workspace-name phi-workspace# Create a secret with your HuggingFace token
kubectl create secret generic hf-token --from-literal=HF_TOKEN=your_token
# Deploy any HuggingFace model using its model ID
kubectl kaito deploy \
--workspace-name my-llama \
--model Qwen/Qwen3-4B-Instruct-2507 \
--model-access-secret hf-token \
--instance-type Standard_NC24ads_A100_v4
# Kaito automatically generates the preset configuration
# and validates the model architecture against vLLM# Fine-tune a model with your data
kubectl kaito deploy \
--workspace-name tune-phi \
--model phi-3.5-mini-instruct \
--tuning \
--tuning-method qlora \
--input-urls "https://example.com/training-data.parquet" \
--output-image "myregistry.azurecr.io/phi-tuned:v1" \
--output-image-secret my-registry-secret
# Deploy the fine-tuned model
kubectl kaito deploy \
--workspace-name phi-tuned \
--model phi-3.5-mini-instruct \
--adapters phi-adapter="myregistry.azurecr.io/phi-tuned:v1"| Command | Description |
|---|---|
deploy |
Deploy a Kaito workspace for model inference or fine-tuning |
status |
Check status of Kaito workspaces |
get-endpoint |
Get inference endpoints for a workspace |
chat |
Interactive chat with deployed AI models |
models |
Manage and list supported AI models |
# Clone the repository
git clone https://github.com/kaito-project/kaito-kubectl-plugin.git
cd kaito-kubectl-plugin
# Build the plugin
make build
# Make sure to uninstall the krew plugin to be able to run the local binary
kubectl krew uninstall kaito
# Run the cli from the local binary
./bin/kubectl-kaito --helpThis project is licensed under the Apache License 2.0 - see the LICENSE file for details.
