This guide provides step-by-step instructions for deploying a highly available (HA) RKE2 Kubernetes cluster on AWS.
- Prerequisites
- Step 1: Clone the Repository
- Step 2: Configure AWS Credentials
- Step 3: Generate SSH Key Pair
- Step 4: Configure Terraform Variables
- Step 5: Initialize Terraform
- Step 6: Plan the Deployment
- Step 7: Apply the Configuration
- Step 8: Verify Deployment
- Step 9: Access the Cluster
- Step 10: Deploy Sample Application
- Cleanup
- Customization Options
## Prerequisites

| Tool | Version | Installation |
|---|---|---|
| Terraform | >= 1.5.0 | Download |
| AWS CLI | >= 2.0 | Install Guide |
| kubectl | >= 1.28 | Install Guide |
| SSH client | Any | Built-in (Linux/Mac) or PuTTY (Windows) |
Verify the installed versions:

```bash
# Check Terraform
terraform version
# Expected: Terraform v1.5.0 or higher

# Check AWS CLI
aws --version
# Expected: aws-cli/2.x.x

# Check kubectl
kubectl version --client
# Expected: Client Version: v1.28.x or higher
```

- AWS Account with administrative access
- IAM Permissions for:
  - VPC (create, modify, delete)
  - EC2 (instances, security groups, key pairs)
  - ELB (load balancers, target groups)
  - IAM (read-only for current user)
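The individual tool checks above can be rolled into one quick preflight loop; a minimal sketch (tool names taken from the table above):

```shell
# Preflight: confirm each required CLI is on PATH before continuing.
for tool in terraform aws kubectl; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: MISSING - install it before continuing"
  fi
done
```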
```bash
# Check current identity
aws sts get-caller-identity
```

Expected output:

```json
{
  "UserId": "AIDAXXXXXXXXXXXXXXXXX",
  "Account": "123456789012",
  "Arn": "arn:aws:iam::123456789012:user/your-username"
}
```

## Step 1: Clone the Repository

```bash
# Clone the repository
git clone https://github.com/deviant101/ha-rke2-kubernetes-cluster.git
cd ha-rke2-kubernetes-cluster
```
```bash
# Verify structure
ls -la
# Expected:
# README.md
# docs/
# terraform/
```

## Step 2: Configure AWS Credentials

Option A - environment variables:

```bash
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_DEFAULT_REGION="us-east-1"
```

Option B - default profile:

```bash
# Configure default profile
aws configure
```

You'll be prompted for:

```
AWS Access Key ID [None]: your-access-key
AWS Secret Access Key [None]: your-secret-key
Default region name [None]: us-east-1
Default output format [None]: json
```

Option C - named profile:

```bash
# Configure named profile
aws configure --profile rke2-cluster

# Use the profile
export AWS_PROFILE=rke2-cluster

# Test access
aws ec2 describe-regions --query 'Regions[].RegionName' --output table
```

## Step 3: Generate SSH Key Pair

```bash
# Generate SSH key pair
ssh-keygen -t ed25519 -C "rke2-cluster" -f ~/.ssh/rke2-cluster-key

# Set correct permissions
chmod 600 ~/.ssh/rke2-cluster-key
chmod 644 ~/.ssh/rke2-cluster-key.pub

# View the public key (you'll reference this path later)
cat ~/.ssh/rke2-cluster-key.pub
```

If you have an existing key:
```bash
# Verify key exists
ls -la ~/.ssh/id_ed25519.pub
# or
ls -la ~/.ssh/id_rsa.pub
```

## Step 4: Configure Terraform Variables

```bash
cd terraform

# Copy example file
cp terraform.tfvars.example terraform.tfvars

# Edit the configuration
nano terraform.tfvars  # or vim, code, etc.
```

Minimal configuration:

```hcl
# terraform.tfvars

# Required
ssh_public_key_path = "~/.ssh/rke2-cluster-key.pub"

# Optional - customize as needed
aws_region   = "us-east-1"
cluster_name = "my-rke2-cluster"
environment  = "dev"
```

Full configuration:

```hcl
# terraform.tfvars

# AWS Configuration
aws_region = "us-east-1"

# Cluster Identity
cluster_name = "production-rke2"
environment  = "production"

# Network Configuration
vpc_cidr            = "10.0.0.0/16"
public_subnet_cidrs = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
pod_cidr            = "10.42.0.0/16"
service_cidr        = "10.43.0.0/16"

# Node Configuration
control_plane_count         = 3
worker_count                = 3
control_plane_instance_type = "t3.large"
worker_instance_type        = "t3.large"
root_volume_size            = 100

# RKE2 Version
rke2_version = "v1.34.6+rke2r1"

# SSH Access
ssh_public_key_path = "~/.ssh/rke2-cluster-key.pub"

# Security - Restrict SSH access to your IP
admin_cidr_blocks = ["YOUR.PUBLIC.IP.ADDRESS/32"]

# Optional - Provide your own token (or let Terraform generate one)
# rke2_token = "your-secure-token-here"
```

Get your current public IP for the admin_cidr_blocks value:
```bash
curl -s ifconfig.me
# or
curl -s ipinfo.io/ip
```

## Step 5: Initialize Terraform

```bash
cd terraform

# Initialize Terraform (downloads providers)
terraform init
```

Expected output:

```
Initializing the backend...

Initializing provider plugins...
- Finding hashicorp/aws versions matching "~> 5.0"...
- Finding hashicorp/random versions matching "~> 3.0"...
- Installing hashicorp/aws v5.x.x...
- Installing hashicorp/random v3.x.x...

Terraform has been successfully initialized!
```
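For the `admin_cidr_blocks` setting back in Step 4, the exact line to paste can be generated from your current public IP. A small sketch (it falls back to an RFC 5737 placeholder address if the lookup fails, so double-check the result before pasting):

```shell
# Emit the admin_cidr_blocks line for terraform.tfvars.
MY_IP=$(curl -s --max-time 5 ifconfig.me || true)
MY_IP=${MY_IP:-203.0.113.10}   # placeholder if the lookup fails - replace with your real IP
echo "admin_cidr_blocks = [\"${MY_IP}/32\"]"
```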
## Step 6: Plan the Deployment

```bash
# Generate and review the execution plan
terraform plan
```

The plan will show:

- Resources to create (VPC, subnets, EC2 instances, NLB, etc.)
- Configuration details (instance types, CIDRs, etc.)
- No resources to destroy (new deployment)

Example summary:

```
Plan: 25 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + cluster_name              = "my-rke2-cluster"
  + control_plane_private_ips = (known after apply)
  + control_plane_public_ips  = (known after apply)
  + kubernetes_api_endpoint   = (known after apply)
  + nlb_dns_name              = (known after apply)
...
```

To save the plan and apply it later:

```bash
# Save plan for later apply
terraform plan -out=tfplan

# Later, apply the saved plan
terraform apply tfplan
```

## Step 7: Apply the Configuration

```bash
# Deploy the cluster
terraform apply
```

Terraform asks for confirmation:

```
Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes
```
Deployment timeline (total: ~15-20 minutes):

- T+0:00 - Terraform starts creating resources: VPC, subnets, internet gateway; security groups; network load balancer
- T+2:00 - EC2 instances launch: control plane nodes start their user-data scripts, workers wait for the control plane
- T+5:00 - First control plane initializes: RKE2 server starts, etcd cluster initializes, API server starts
- T+8:00 - Additional control planes join: CP-2 and CP-3 join via the NLB, etcd achieves quorum
- T+12:00 - Workers join the cluster: RKE2 agents connect, Cilium is configured
- T+15:00 - Cluster fully operational: all nodes Ready, system pods running
After successful apply:

```
Apply complete! Resources: 25 added, 0 changed, 0 destroyed.

Outputs:

cluster_name = "my-rke2-cluster"
control_plane_private_ips = [
  "10.0.1.100",
  "10.0.2.101",
  "10.0.3.102",
]
control_plane_public_ips = [
  "54.123.45.67",
  "54.123.45.68",
  "54.123.45.69",
]
kubernetes_api_endpoint = "https://my-rke2-cluster-nlb-1234567890.elb.us-east-1.amazonaws.com:6443"
kubeconfig_command = "ssh -i ~/.ssh/rke2-cluster-key ubuntu@54.123.45.67 'sudo cat /etc/rancher/rke2/rke2.yaml' | sed 's/127.0.0.1/my-rke2-cluster-nlb-1234567890.elb.us-east-1.amazonaws.com/g' > kubeconfig.yaml"
nlb_dns_name = "my-rke2-cluster-nlb-1234567890.elb.us-east-1.amazonaws.com"
...
```
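The first control plane IP can also be pulled straight out of these outputs. A sketch assuming `jq` is installed (the `control_plane_public_ips` output name comes from the listing above; the snippet falls back to a placeholder address when no Terraform state or `jq` is available, so in that case the printed command is only illustrative):

```shell
# Compose the SSH command for the first control plane from the outputs.
CP_IP=$( { terraform output -json control_plane_public_ips 2>/dev/null || true; } \
  | jq -r '.[0] // empty' 2>/dev/null )
CP_IP=${CP_IP:-203.0.113.10}   # RFC 5737 placeholder when state/jq is unavailable
echo "ssh -i ~/.ssh/rke2-cluster-key ubuntu@${CP_IP}"
```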
## Step 8: Verify Deployment

```bash
# List EC2 instances
aws ec2 describe-instances \
  --filters "Name=tag:Cluster,Values=my-rke2-cluster" \
  --query 'Reservations[].Instances[].[InstanceId,State.Name,Tags[?Key==`Name`].Value|[0]]' \
  --output table

# Get SSH command from Terraform output
terraform output -raw ssh_control_plane_commands

# SSH to first control plane
ssh -i ~/.ssh/rke2-cluster-key ubuntu@<CONTROL_PLANE_PUBLIC_IP>
```

On the control plane node:

```bash
# Check RKE2 service status
sudo systemctl status rke2-server

# Check node status
sudo /var/lib/rancher/rke2/bin/kubectl \
  --kubeconfig /etc/rancher/rke2/rke2.yaml \
  get nodes

# Check all pods
sudo /var/lib/rancher/rke2/bin/kubectl \
  --kubeconfig /etc/rancher/rke2/rke2.yaml \
  get pods -A
```

## Step 9: Access the Cluster

Use the command from the Terraform output to retrieve the kubeconfig:
```bash
eval "$(terraform output -raw kubeconfig_command)"   # eval runs the full ssh | sed pipeline locally

# Verify the file was created
cat kubeconfig.yaml
```

Configure kubectl to use it:

```bash
# Option A: Set KUBECONFIG environment variable
export KUBECONFIG=$(pwd)/kubeconfig.yaml

# Option B: Copy to default location
mkdir -p ~/.kube
cp kubeconfig.yaml ~/.kube/config
chmod 600 ~/.kube/config
```

```bash
# Check cluster info
kubectl cluster-info
```

Expected output:

```
Kubernetes control plane is running at https://my-rke2-cluster-nlb-xxx.elb.us-east-1.amazonaws.com:6443
CoreDNS is running at https://...
```

```bash
# List nodes
kubectl get nodes
```

Expected output:

```
NAME                         STATUS   ROLES                       AGE   VERSION
ip-10-0-1-100.ec2.internal   Ready    control-plane,etcd,master   10m   v1.34.6+rke2r1
ip-10-0-2-101.ec2.internal   Ready    control-plane,etcd,master   8m    v1.34.6+rke2r1
ip-10-0-3-102.ec2.internal   Ready    control-plane,etcd,master   6m    v1.34.6+rke2r1
ip-10-0-1-200.ec2.internal   Ready    <none>                      4m    v1.34.6+rke2r1
ip-10-0-2-201.ec2.internal   Ready    <none>                      4m    v1.34.6+rke2r1
ip-10-0-3-202.ec2.internal   Ready    <none>                      4m    v1.34.6+rke2r1
```

Check the system pods:

```bash
kubectl get pods -n kube-system
```
Expected pods:

```
NAME                              READY   STATUS    RESTARTS   AGE
cilium-xxxxx                      1/1     Running   0          10m
cilium-operator-xxxxx             1/1     Running   0          10m
coredns-xxxxx                     1/1     Running   0          10m
rke2-coredns-rke2-coredns-xxxxx   1/1     Running   0          10m
...
```

## Step 10: Deploy Sample Application

```bash
kubectl create namespace demo

cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-demo
  namespace: demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: demo
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30080
EOF
```

Check that the pods and service are running:
```bash
kubectl get pods -n demo

# Check service
kubectl get svc -n demo

# Test the application (from any worker node)
curl http://<WORKER_PUBLIC_IP>:30080
```

## Cleanup

```bash
cd terraform
```
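Before tearing down the infrastructure you can optionally remove the demo application from Step 10. A guarded sketch that is a no-op when the namespace (or cluster access) is missing:

```shell
# Optional: delete the demo namespace if it exists and the cluster is reachable.
if kubectl get namespace demo >/dev/null 2>&1; then
  kubectl delete namespace demo
else
  echo "demo namespace not found or no cluster access - skipping"
fi
```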
```bash
# Preview destruction
terraform plan -destroy

# Destroy resources
terraform destroy
```

If `terraform destroy` fails, delete the remaining resources manually, starting with the EC2 instances:
```bash
aws ec2 terminate-instances --instance-ids <INSTANCE_IDS>

# Delete load balancer
aws elbv2 delete-load-balancer --load-balancer-arn <LB_ARN>

# Delete VPC (after all resources are removed)
aws ec2 delete-vpc --vpc-id <VPC_ID>
```

## Customization Options

Use a different region:

```hcl
# terraform.tfvars
aws_region = "eu-west-1" # Ireland
# or
aws_region = "ap-southeast-1" # Singapore
```

Use larger instance types:

```hcl
# terraform.tfvars
control_plane_instance_type = "m5.xlarge"
worker_instance_type        = "m5.2xlarge"
```

Change the worker count:

```hcl
# terraform.tfvars
worker_count = 5 # or any number
```

Pin a different RKE2 version:

```hcl
# terraform.tfvars
rke2_version = "v1.30.0+rke2r1"
```

Find available versions: https://github.com/rancher/rke2/releases
Use custom network CIDRs:

```hcl
# terraform.tfvars
vpc_cidr            = "172.16.0.0/16"
public_subnet_cidrs = ["172.16.1.0/24", "172.16.2.0/24", "172.16.3.0/24"]
pod_cidr            = "172.20.0.0/16"
service_cidr        = "172.21.0.0/16"
```

After successful deployment:
- Install Ingress Controller (nginx-ingress or traefik)
- Configure DNS for your applications
- Set up monitoring (Prometheus + Grafana)
- Configure backup for etcd snapshots
- Implement GitOps (ArgoCD or Flux)
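As a concrete starting point for the etcd backup item above: RKE2 exposes etcd snapshot scheduling through its server configuration file. A sketch (option names follow RKE2's `etcd-snapshot-*` server flags; the schedule and retention values here are illustrative, so verify them against the docs for your RKE2 version):

```yaml
# /etc/rancher/rke2/config.yaml on each control plane node
etcd-snapshot-schedule-cron: "0 */6 * * *"   # snapshot every 6 hours
etcd-snapshot-retention: 10                  # keep the most recent 10 snapshots
```

Restart `rke2-server` after editing the file; snapshots are written under the RKE2 data directory (see `rke2 server --help` for the exact defaults).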
Back to Main README | Previous: HA RKE2 Guide | Next: Flow Diagrams