Cloud GPU Setup Guide for OrthoRoute
Complete instructions for running OrthoRoute headless routing on Vast.ai or other cloud GPU providers
Last Updated: November 15, 2025
Step 1: Rent GPU Instance on Vast.ai
Recommended Specifications
For boards with <2,000 nets:
- GPU: RTX 4090 (24 GB VRAM)
- Cost: ~$0.40/hr
- Sufficient for most boards
For boards with 2,000-8,000 nets:
- GPU: RTX 6000 Ada (48 GB VRAM) or A100 80GB
- Cost: ~$0.80-1.50/hr
- Needed for large backplanes
For boards with >8,000 nets:
- GPU: H100 80GB or A100 80GB
- Cost: ~$1.50-2.50/hr
- Maximum capacity
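The tiers above can be captured in a small lookup helper. This is only a sketch: the function name `recommend_gpu` is hypothetical, and the thresholds and prices are copied from the list above rather than measured.

```python
def recommend_gpu(net_count: int) -> dict:
    """Map a board's net count to the GPU tier suggested in this guide."""
    if net_count < 0:
        raise ValueError("net_count must be non-negative")
    if net_count < 2000:
        return {"gpu": "RTX 4090", "vram_gb": 24, "usd_per_hr": (0.40, 0.40)}
    if net_count <= 8000:
        return {"gpu": "RTX 6000 Ada or A100 80GB", "vram_gb": 48, "usd_per_hr": (0.80, 1.50)}
    return {"gpu": "H100 80GB or A100 80GB", "vram_gb": 80, "usd_per_hr": (1.50, 2.50)}


print(recommend_gpu(1500)["gpu"])   # RTX 4090
print(recommend_gpu(9000)["gpu"])   # H100 80GB or A100 80GB
```

Treat the result as a starting point; actual VRAM use depends on the board, not just the net count.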
On Vast.ai Website
- Go to https://vast.ai/console/create/
- Filter instances:
- GPU Type: RTX 4090, RTX 6000 Ada, or A100
- VRAM: ≥ 24 GB (48+ GB for large boards)
- Disk Space: ≥ 20 GB
- CUDA Version: 12.x or later
- Sort by price ($/hr)
- Click "Rent" on suitable instance
- Select:
- Image: pytorch/pytorch:latest (has CUDA + Python pre-installed)
- Or: nvidia/cuda:12.2.0-devel-ubuntu22.04
- Click "Create"
Get SSH Connection Info
After instance starts (30-60 seconds):
- Click on instance in dashboard
- Copy SSH command shown (looks like):
ssh -p 12345 root@ssh.vast.ai -L 8080:localhost:8080
- Or use direct IP if shown
Step 2: Connect and Setup Environment
SSH into Instance
# Use the SSH command from Vast.ai dashboard
ssh -p 12345 root@ssh.vast.ai
You should see a prompt like:
root@C.27877234:~#
Install System Dependencies
# Update package manager
apt-get update
# Install git and basic tools
apt-get install -y git tmux htop
# Verify CUDA is available
nvidia-smi
# Should show GPU info (e.g., RTX 4090, 24GB VRAM)
# Verify Python version
python3 --version
# Should be Python 3.8 or later
Step 3: Clone OrthoRoute Repository
# Navigate to workspace
cd /workspace
# Clone repository
git clone https://github.com/bbenchoff/OrthoRoute.git
cd OrthoRoute
# Verify files
ls -la
# Should see: main.py, orthoroute/, logs/, etc.
If using a private repository:
# Option 1: Use HTTPS with token
git clone https://YOUR_TOKEN@github.com/YourUsername/OrthoRoute.git
# Option 2: Use SSH (need to add SSH key to GitHub first)
git clone git@github.com:YourUsername/OrthoRoute.git
Step 4: Install Python Dependencies
Check CUDA Version
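nvcc --version
# Note the CUDA version (e.g., 12.2, 12.4, etc.)
# If nvcc is not found (runtime-only images may ship without it), run nvidia-smi instead;
# its header shows the highest CUDA version the driver supports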
Install CuPy (GPU acceleration library)
For CUDA 12.x:
pip3 install cupy-cuda12x
For CUDA 11.x:
pip3 install cupy-cuda11x
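If you script the setup, the package choice can be derived from the `nvcc --version` output. The sketch below assumes the usual `release X.Y` wording in that output; the helper name `cupy_package_for` is hypothetical, and only the two packages named in this guide are handled.

```python
import re


def cupy_package_for(nvcc_output: str) -> str:
    """Pick the CuPy wheel name for the CUDA major version reported by nvcc."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release number in the nvcc output")
    major = int(match.group(1))
    if major == 12:
        return "cupy-cuda12x"
    if major == 11:
        return "cupy-cuda11x"
    raise ValueError(f"this guide lists no CuPy package for CUDA {major}.x")


# Sample line in the format nvcc prints; your version numbers will differ.
sample = "Cuda compilation tools, release 12.2, V12.2.140"
print(cupy_package_for(sample))  # cupy-cuda12x
```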
Verify CuPy installation:
python3 -c "import cupy as cp; print(cp.__version__); print('GPU Available:', cp.cuda.is_available())"
# Should print: GPU Available: True
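For a slightly more informative check than the one-liner, something like the following could be saved as a script. It is a sketch, not part of OrthoRoute: `gpu_status` is a hypothetical helper, and it only uses `cupy.cuda.is_available()` and `cupy.cuda.Device(0).mem_info`. It reports the problem instead of crashing when CuPy or the GPU is missing.

```python
import importlib.util


def gpu_status() -> dict:
    """Report whether CuPy is importable and whether it can see a GPU."""
    status = {"cupy_installed": False, "gpu_available": False, "total_vram_gb": None}
    if importlib.util.find_spec("cupy") is None:
        return status  # CuPy is not installed at all
    status["cupy_installed"] = True
    try:
        import cupy as cp

        status["gpu_available"] = bool(cp.cuda.is_available())
        if status["gpu_available"]:
            # mem_info returns (free_bytes, total_bytes) for the device
            _free_bytes, total_bytes = cp.cuda.Device(0).mem_info
            status["total_vram_gb"] = round(total_bytes / 1024**3, 1)
    except Exception as exc:  # e.g. driver/runtime mismatch
        status["error"] = str(exc)
    return status


print(gpu_status())
```

On a correctly configured instance, `gpu_available` should be `True` and `total_vram_gb` should roughly match the VRAM you rented.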
Install Other Dependencies
# Install NumPy and SciPy
pip3 install numpy scipy
# Verify installations
python3 -c "import numpy; import scipy; print('NumPy:', numpy.__version__, 'SciPy:', scipy.__version__)"
Complete dependency list:
pip3 install cupy-cuda12x numpy scipy
Note: Don't install PyQt6 (GUI not needed for headless mode).
Step 5: Upload Your ORP File
From Your Local Machine
Using SCP:
# On your local machine (not on the Vast instance):
scp -P 12345 MainController.ORP root@ssh.vast.ai:/workspace/OrthoRoute/
# Replace:
# 12345 - with your actual port from Vast.ai
# MainController.ORP - with your actual ORP filename
Verify upload:
# Back on the Vast instance:
cd /workspace/OrthoRoute
ls -lh *.ORP
# Should show your ORP file
Alternative: Upload to Cloud Storage First
If ORP file is large:
# On local machine: Upload to temporary host
# curl -F "file=@MainController.ORP" https://file.io
# Gets back a URL
# On Vast instance: Download
wget https://file.io/XXXXXX -O MainController.ORP
Step 6: Run OrthoRoute Headless Mode
Using tmux (Recommended - survives SSH disconnects)
# Start new tmux session
tmux new -s routing
# Inside tmux, run OrthoRoute
cd /workspace/OrthoRoute
python3 main.py headless MainController.ORP
# Detach from tmux (keeps running in background):
# Press: Ctrl+b, then d
# Later, reattach to see progress:
tmux attach -t routing
# Kill session when done:
tmux kill-session -t routing
Direct Run (Simpler but dies if SSH disconnects)
cd /workspace/OrthoRoute
python3 main.py headless MainController.ORP
With Options
# Increase iterations for complex boards
python3 main.py headless MainController.ORP --max-iterations 150
# Force CPU mode if GPU runs out of memory
python3 main.py headless MainController.ORP --cpu-only
# Custom output filename
python3 main.py headless MainController.ORP -o CustomName.ORS
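When launching runs from a script, the options above can be assembled programmatically. This sketch only uses the flags shown in this section (`--max-iterations`, `--cpu-only`, `-o`); the helper name `build_command` is hypothetical.

```python
import shlex


def build_command(orp_file, max_iterations=None, cpu_only=False, output=None):
    """Build the headless command line as an argument list."""
    cmd = ["python3", "main.py", "headless", orp_file]
    if max_iterations is not None:
        cmd += ["--max-iterations", str(max_iterations)]
    if cpu_only:
        cmd.append("--cpu-only")
    if output is not None:
        cmd += ["-o", output]
    return cmd


cmd = build_command("MainController.ORP", max_iterations=150, output="CustomName.ORS")
print(shlex.join(cmd))
# python3 main.py headless MainController.ORP --max-iterations 150 -o CustomName.ORS
```

Passing the list to `subprocess.run` avoids shell-quoting problems with unusual filenames.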
Step 7: Monitor Progress
Watch Live Console Output
If using tmux:
tmux attach -t routing
If running directly: Already showing in your terminal.
Tail Log Files
# In a second SSH session or tmux pane:
cd /workspace/OrthoRoute
# Watch the logs, showing only warnings
tail -f logs/run_*.log | grep "WARNING"
# Or just iteration summaries:
tail -f logs/run_*.log | grep "ITER.*nets="
# Or with watch command:
watch -n 2 'tail -5 logs/run_*.log'
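The same filter can be applied from Python if you want to post-process the summaries. The sketch reuses the `ITER.*nets=` pattern from the grep command above. The sample lines are invented purely to exercise the filter and are NOT real OrthoRoute log output.

```python
import re

# Same pattern as the grep command: "ITER", anything, then "nets="
ITER_PATTERN = re.compile(r"ITER.*nets=")


def iteration_summaries(lines):
    """Return only the lines that look like iteration summaries."""
    return [line.rstrip("\n") for line in lines if ITER_PATTERN.search(line)]


# Hypothetical lines for illustration only; real log lines will look different.
sample_log = [
    "INFO starting up\n",
    "ITER 1 nets=8192\n",
    "WARNING something happened\n",
    "ITER 2 nets=8192\n",
]
for line in iteration_summaries(sample_log):
    print(line)
```

To use it on a real log, pass an open file object, for example `iteration_summaries(open(path))`.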
Monitor GPU Usage
# Watch GPU utilization every 5 seconds
nvidia-smi -l 5
# Or with watch:
watch -n 5 nvidia-smi
What to look for:
- GPU Utilization: Should be 80-100%
- GPU Memory: Should be stable (not growing infinitely)
- Power Usage: Should be near max (e.g., 450W for RTX 4090)
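These checks can be automated by asking `nvidia-smi` for CSV output. The sketch below uses the `--query-gpu` and `--format=csv,noheader,nounits` options; the helper names and the 80% threshold (taken from the list above) are illustrative. Some GPUs report `[N/A]` for power draw, which this simple parser does not handle.

```python
import subprocess

QUERY_FIELDS = "utilization.gpu,memory.used,memory.total,power.draw"


def parse_gpu_line(csv_line):
    """Parse one CSV line produced by nvidia-smi with the fields above."""
    util, mem_used, mem_total, power = [field.strip() for field in csv_line.split(",")]
    return {
        "util_percent": float(util),
        "mem_used_mib": float(mem_used),
        "mem_total_mib": float(mem_total),
        "power_watts": float(power),
    }


def query_gpu():
    """Run nvidia-smi and return parsed stats for the first GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_line(result.stdout.splitlines()[0])


def looks_busy(stats, min_util=80.0):
    """True when utilization is at or above the expected range."""
    return stats["util_percent"] >= min_util


# Parsing a made-up sample line; call query_gpu() on the instance for live values.
sample = parse_gpu_line("97, 21504, 24564, 431.20")
print(looks_busy(sample))  # True
```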
Check Disk Space
# Iteration 1 on 8K nets creates LARGE log files
df -h
# If the disk is getting full, compress old logs.
# Skip the newest log: it is still being written by the running job.
ls -t logs/run_*.log | tail -n +2 | xargs -r gzip
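A scripted version of the same check might look like this. It is a sketch using only the standard library; the helper names are hypothetical, and `logs` is the log directory used throughout this guide.

```python
import shutil
from pathlib import Path


def free_gb(path="."):
    """Free space, in GiB, on the filesystem that contains path."""
    return shutil.disk_usage(path).free / 1024**3


def dir_size_gb(directory):
    """Total size, in GiB, of all files under directory (0.0 if it is missing)."""
    root = Path(directory)
    if not root.is_dir():
        return 0.0
    return sum(p.stat().st_size for p in root.rglob("*") if p.is_file()) / 1024**3


print(f"free: {free_gb('.'):.1f} GB, logs: {dir_size_gb('logs'):.2f} GB")
```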
Step 8: Handle Common Issues
Out of Memory Error
Error:
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating X bytes
Solutions:
A) Upgrade to larger GPU:
- Kill current job:
pkill -f main.py
- Destroy instance on Vast.ai
- Rent instance with more VRAM (48+ GB)
- Restart from Step 1
B) Use CPU mode:
pkill -f main.py
python3 main.py headless MainController.ORP --cpu-only
C) Reduce batch size (requires code change - not recommended)
Process Killed / SSH Disconnected
If you weren't using tmux:
- Routing stopped when SSH died
- Must restart from scratch
If you were using tmux:
# Reconnect to Vast instance
ssh -p 12345 root@ssh.vast.ai
# Reattach to tmux session
tmux attach -t routing
# Routing should still be running!
Instance Becomes Unresponsive
If SSH hangs or times out:
- Instance might have crashed
- Check Vast.ai dashboard - instance status
- If "stopped", you'll need to restart
- Unfortunately, routing progress lost (no checkpointing yet)
Logs Too Large
8K net routing can create 10+ GB log files:
# Check log size
du -h logs/
# Compress old logs to save space (skip the newest, which is still being written)
ls -t logs/run_*.log | tail -n +2 | xargs -r gzip
# Or delete very old logs
rm logs/run_2025111*.log
Step 9: Download Results
When Routing Completes
You'll see:
================================================================================
ROUTING COMPLETE!
================================================================================
Solution file: MainController.ORS
...
Download ORS File to Local Machine
Using SCP (from your local machine):
scp -P 12345 root@ssh.vast.ai:/workspace/OrthoRoute/MainController.ORS ./
# Replace:
# 12345 - your Vast.ai port
# MainController.ORS - your actual ORS filename
# ./ - current directory (or specify path)
Using cloud storage:
# On Vast instance: Upload to file sharing service
curl -F "file=@MainController.ORS" https://file.io
# Returns download URL
# On local machine: Download
wget https://file.io/XXXXXX -O MainController.ORS
Verify file integrity:
# On local machine, check file is valid gzip:
gzip -t MainController.ORS && echo "File OK" || echo "File corrupted"
# Check file size (should be ~500KB - 5MB):
ls -lh MainController.ORS
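The same integrity check can be done from Python, which helps on machines without the `gzip` command. This sketch assumes, as the `gzip -t` command above does, that an ORS file is gzip-compressed; `is_valid_gzip` is a hypothetical helper name.

```python
import gzip
import zlib


def is_valid_gzip(path, chunk_size=1 << 20):
    """Return True if the whole file decompresses cleanly as gzip."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end so truncation and CRC errors are detected.
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Covers missing files, non-gzip data, truncated and corrupted streams.
        return False


print("File OK" if is_valid_gzip("MainController.ORS") else "File corrupted or missing")
```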
Step 10: Import into KiCad
On your local machine:
- Open KiCad with your board
- Launch OrthoRoute plugin
- Press Ctrl+I (or File → Import Solution)
- Select MainController.ORS
- Review routing in preview
- Click "Apply to KiCad" to commit traces/vias
Complete Example Session
Session Recording
# === ON LOCAL MACHINE ===
# 1. Export board
# (In KiCad OrthoRoute plugin: Ctrl+E → save MainController.ORP)
# 2. Upload to Vast
scp -P 12345 MainController.ORP root@ssh.vast.ai:/workspace/
# === ON VAST.AI INSTANCE ===
# 3. SSH in
ssh -p 12345 root@ssh.vast.ai
# 4. Setup
cd /workspace
git clone https://github.com/YourUser/OrthoRoute.git
cd OrthoRoute
mv /workspace/MainController.ORP .   # the upload in step 2 landed in /workspace
pip3 install cupy-cuda12x numpy scipy
# 5. Verify GPU
nvidia-smi
python3 -c "import cupy; print('GPU:', cupy.cuda.is_available())"
# 6. Start tmux session
tmux new -s routing
# 7. Run routing
python3 main.py headless MainController.ORP
# 8. Detach from tmux (Ctrl+b, then d)
# 9. Monitor progress (optional)
tail -f logs/run_*.log | grep "ITER.*nets="
# 10. Wait for completion (check back in 4-8 hours)
# 11. Download result
exit # Exit SSH
# === BACK ON LOCAL MACHINE ===
# 12. Download ORS file
scp -P 12345 root@ssh.vast.ai:/workspace/OrthoRoute/MainController.ORS ./
# 13. Import into KiCad (Ctrl+I)
# 14. Destroy Vast instance (stop billing)
# (In Vast.ai dashboard: click Destroy)
Cost Estimation
Typical Costs by Board Size
Small board (100-500 nets):
- Time: 10-30 minutes
- GPU: RTX 4090 @ $0.40/hr
- Cost: ~$0.07-0.20
Medium board (500-2,000 nets):
- Time: 30 minutes - 2 hours
- GPU: RTX 4090 @ $0.40/hr
- Cost: ~$0.20-0.80
Large board (2,000-8,000 nets):
- Time: 4-12 hours
- GPU: RTX 6000 Ada (48GB) @ $0.80/hr
- Cost: ~$3-10
Huge board (8,000+ nets):
- Time: 12-24 hours
- GPU: A100 80GB @ $1.50/hr
- Cost: $18-36
vs. buying RTX 4090: ~$1,600
Break-even: ~45-90 huge routing jobs, or 160+ large ones (or never, if you value your time)
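The arithmetic behind these figures is simply hours multiplied by the hourly rate. The sketch below reproduces it with hypothetical helper names; the rates and durations are the estimates from this section, not measurements.

```python
import math


def cost_range(hours_min, hours_max, usd_per_hour):
    """Return (low, high) rental cost in USD for a run-time range."""
    return (round(hours_min * usd_per_hour, 2), round(hours_max * usd_per_hour, 2))


def break_even_jobs(purchase_usd, cost_per_job_usd):
    """Number of rented jobs whose total cost reaches the purchase price."""
    return math.ceil(purchase_usd / cost_per_job_usd)


print(cost_range(4, 12, 0.80))    # large board on RTX 6000 Ada: (3.2, 9.6)
print(cost_range(12, 24, 1.50))   # huge board on A100 80GB: (18.0, 36.0)
print(break_even_jobs(1600, 36))  # 45 jobs at the most expensive huge-board price
print(break_even_jobs(1600, 18))  # 89 jobs at the cheapest huge-board price
```

The comparison ignores electricity, resale value, and the fact that a 24 GB card may not fit the largest boards at all.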
Tips & Tricks
1. Use tmux ALWAYS
# Start every session with:
tmux new -s routing
# Detach: Ctrl+b, then d
# Reattach: tmux attach -t routing
Why: If SSH disconnects, routing keeps going. Saved me countless times.
2. Monitor Without Attaching
# See what's happening in tmux without attaching:
tmux capture-pane -t routing -p | tail -20
3. Multiple Sessions for Monitoring
# Window 1: Routing
tmux new -s routing
python3 main.py headless board.ORP
# Detach (Ctrl+b, d)
# Window 2: Monitoring
tmux new -s monitor
tail -f logs/run_*.log | grep "ITER.*nets="
# Detach (Ctrl+b, d)
# Switch between:
tmux attach -t routing
tmux attach -t monitor
4. Estimate Time Remaining
# From iteration timestamps, calculate rate:
# Example: ITER 10 at 10:30, ITER 20 at 11:45
# = 10 iterations in 75 minutes
# = 7.5 min/iteration
# If need 80 iterations total: (80-20) × 7.5 = 450 min = 7.5 hours
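The same estimate as a function, for reuse. It is a sketch that assumes every remaining iteration takes as long as the average between the two samples, which is only a rough approximation; the name `minutes_remaining` is hypothetical.

```python
def minutes_remaining(iter_a, minute_a, iter_b, minute_b, total_iterations):
    """Linear estimate of remaining minutes from two (iteration, time) samples.

    Times are minutes since midnight (or since any other fixed reference).
    """
    if iter_b <= iter_a:
        raise ValueError("the second sample must be a later iteration")
    minutes_per_iteration = (minute_b - minute_a) / (iter_b - iter_a)
    return max(total_iterations - iter_b, 0) * minutes_per_iteration


# ITER 10 at 10:30, ITER 20 at 11:45, 80 iterations expected in total
remaining = minutes_remaining(10, 10 * 60 + 30, 20, 11 * 60 + 45, 80)
print(remaining, "min =", remaining / 60, "hours")  # 450.0 min = 7.5 hours
```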
5. Verify GPU is Being Used
# Run this DURING routing:
nvidia-smi
# Look for:
# GPU Util: 95-100%
# Memory Usage: 20-30 GB (should be high)
# Process: python3 main.py headless ...
If GPU Util is 0%: Routing is using CPU (slow!) - check CuPy installation.
6. Pre-test Small Board
Before routing huge board:
# Test with small ORP first:
python3 main.py headless TestBackplane.ORP
# Should complete in 20-30 min
# Verifies: GPU works, dependencies correct, no issues
7. Compress Logs to Save Disk
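# While routing is running (in another terminal):
cd /workspace/OrthoRoute/logs
# Compress every log except the newest, which the running job is still writing
ls -t run_*.log | tail -n +2 | xargs -r gzip
# Or auto-compress with cron (again skipping the newest log):
(crontab -l 2>/dev/null; echo "*/30 * * * * ls -t /workspace/OrthoRoute/logs/*.log 2>/dev/null | tail -n +2 | xargs -r gzip") | crontab -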
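If cron is not available in the container, a small Python helper can do the same job. This is a sketch under the assumption that the most recently modified `run_*.log` file is the one still in use, so it is left alone; `compress_old_logs` is a hypothetical name.

```python
import gzip
import shutil
from pathlib import Path


def compress_old_logs(log_dir, keep_newest=1):
    """Gzip all run_*.log files except the keep_newest most recently modified."""
    logs = sorted(
        Path(log_dir).glob("run_*.log"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,  # newest first
    )
    compressed = []
    for log_path in logs[keep_newest:]:
        gz_path = log_path.with_name(log_path.name + ".gz")
        with open(log_path, "rb") as src, gzip.open(gz_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        log_path.unlink()  # remove the original only after the copy succeeded
        compressed.append(gz_path)
    return compressed


# Example (run on the instance):
#   compress_old_logs("/workspace/OrthoRoute/logs")
```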
Troubleshooting
"No module named 'cupy'"
Problem: CuPy not installed
Fix:
pip3 install cupy-cuda12x
"CUDA initialization failed"
Problem: CUDA runtime mismatch
Fix:
# Check CUDA version
nvcc --version
# Install matching CuPy:
# CUDA 11.x: pip3 install cupy-cuda11x
# CUDA 12.x: pip3 install cupy-cuda12x
"Permission denied" when cloning repo
Problem: Private repository
Fix:
# Generate SSH key on Vast instance:
ssh-keygen -t ed25519 -C "vast-gpu"
cat ~/.ssh/id_ed25519.pub
# Copy output, add to GitHub → Settings → SSH Keys
# Or use personal access token:
git clone https://YOUR_TOKEN@github.com/user/repo.git
Routing uses CPU instead of GPU
Check:
python3 -c "import cupy; print('Available:', cupy.cuda.is_available())"
If False:
- CuPy not installed correctly
- CUDA version mismatch
- GPU drivers not loaded
Force GPU mode:
python3 main.py headless board.ORP --use-gpu
Instance runs out of disk space
Check space:
df -h
If <5 GB free:
# Compress logs (all but the newest, which is still being written)
ls -t logs/*.log | tail -n +2 | xargs -r gzip
# Or delete old logs
rm logs/run_2025111*.log
# Or mount external storage (Vast.ai option)
Routing takes forever on CPU
If forced to use --cpu-only:
- 8K net board could take 48-72 hours
- Consider renting bigger GPU instead
- Or reduce grid resolution in ORP file
Optimization Tips
1. Choose Right GPU for Your Board
| Board Size | Nets | VRAM Needed | Recommended GPU | Cost/hr |
|---|---|---|---|---|
| Small | <500 | 8 GB | RTX 3080 | $0.25 |
| Medium | 500-2K | 16 GB | RTX 4090 | $0.40 |
| Large | 2K-6K | 24 GB | RTX 4090 | $0.40 |
| Huge | 6K-10K | 48 GB | RTX 6000 Ada | $0.80 |
| Massive | 10K+ | 80 GB | A100 80GB | $1.50 |
2. Batch Multiple Boards
# Route multiple boards in one session:
python3 main.py headless Board1.ORP
python3 main.py headless Board2.ORP
python3 main.py headless Board3.ORP
# Or in parallel (if enough VRAM):
python3 main.py headless Board1.ORP &
python3 main.py headless Board2.ORP &
wait
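A sequential batch can also be driven from Python, which makes it easy to stop at the first failure instead of continuing to burn GPU time. That stop-on-failure behavior is a design choice of this sketch, not something the shell commands above do. `route_boards` is a hypothetical helper, and the `run` parameter exists only so the function can be exercised without a GPU.

```python
import subprocess


def route_boards(orp_files, run=subprocess.run):
    """Route boards one after another; stop after the first non-zero exit code."""
    results = {}
    for orp_file in orp_files:
        completed = run(["python3", "main.py", "headless", orp_file])
        results[orp_file] = completed.returncode
        if completed.returncode != 0:
            break  # later boards are skipped and left out of the results
    return results


# Example (run from /workspace/OrthoRoute inside tmux):
#   route_boards(["Board1.ORP", "Board2.ORP", "Board3.ORP"])
```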
3. Auto-shutdown When Done
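# Add to end of routing script:
python3 main.py headless board.ORP && shutdown -h now
# Instance stops automatically when complete, which minimizes GPU billing
# Note: shutdown may not be available inside container-based instances;
# download the ORS file before destroying the instance either way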
Quick Reference Card
Setup:
ssh -p PORT root@ssh.vast.ai
cd /workspace
git clone https://github.com/user/OrthoRoute.git
cd OrthoRoute
pip3 install cupy-cuda12x numpy scipy
Upload file:
# From local machine:
scp -P PORT board.ORP root@ssh.vast.ai:/workspace/OrthoRoute/
Run routing:
tmux new -s routing
python3 main.py headless board.ORP
# Ctrl+b, d to detach
Monitor:
tail -f logs/run_*.log | grep "ITER.*nets="
nvidia-smi -l 5
Download result:
# From local machine:
scp -P PORT root@ssh.vast.ai:/workspace/OrthoRoute/board.ORS ./
Import to KiCad:
Ctrl+I → select board.ORS → Apply to KiCad
Expected Timeline (8K Net Board)
00:00 - Start instance, SSH in
00:05 - Clone repo, install dependencies
00:10 - Upload ORP file (depends on internet speed)
00:15 - Start routing in tmux
02:30 - Iteration 1 completes (greedy routing)
04:00 - Iteration 20 completes
08:00 - Iteration 50 completes
12:00 - Iteration 75 completes
14:00 - Convergence! (iteration 85-95)
14:05 - Download ORS file
14:10 - Destroy instance
Total: ~14 hours runtime, ~$12-15 cost
Vast.ai Specific Notes
Instance States
- Loading: Starting up (1-2 min)
- Running: Active and billable
- Stopped: Paused (GPU time not billed, but storage still is; running jobs are lost)
- Destroyed: Terminated (stops billing)
Billing
- Billed per second of runtime
- Continues billing until you Destroy instance
- Check dashboard frequently when job completes
Data Persistence
- /workspace directory persists across stops
- ~/.ssh, /tmp do NOT persist
- Always destroy when done (or you keep paying)
Port Forwarding
SSH command includes port forwarding:
ssh -p 12345 root@ssh.vast.ai -L 8080:localhost:8080
You can ignore the -L 8080:localhost:8080 part for headless routing.
Other Cloud Providers
RunPod
Similar setup:
# SSH command from RunPod dashboard
ssh root@X.X.X.X -p 22
# Rest is identical to Vast.ai
Differences:
- Easier UI
- Slightly more expensive (~$0.50/hr for RTX 4090)
- Better reliability
- Jupyter notebook support (not needed for headless)
Lambda Labs
Setup:
ssh ubuntu@instance.lambdalabs.com
sudo apt-get install python3-pip
# Rest same as Vast.ai
Differences:
- More expensive (~$1.10/hr for A100)
- Very reliable
- Better for production workloads
- Fixed pricing (no bidding)
Security Notes
Protect Your ORP Files
ORP files contain your entire board design:
- Pad positions
- Net connectivity
- Design rules
Don't:
- Upload to public GitHub
- Share ORP files publicly
- Leave on instance after destroying
Do:
- Use private repositories
- Delete ORP/ORS from instance before destroying:
rm /workspace/OrthoRoute/*.ORP
rm /workspace/OrthoRoute/*.ORS
- Download and backup ORS files locally
SSH Key Security
Generate unique key for cloud instances:
ssh-keygen -t ed25519 -f ~/.ssh/vast_key
# Use ~/.ssh/vast_key instead of default key
# If compromised, only affects cloud instances
Post-Processing
After Downloading ORS
1. Verify file:
ls -lh MainController.ORS
# Should be ~500KB - 5MB depending on board size
2. Import to KiCad:
- Ctrl+I in OrthoRoute plugin
- Select ORS file
- Review in preview
3. Run DRC:
- Check for violations
- Expect ~300-500 via barrel conflicts (known limitation)
- Zero trace-trace violations (should be clean)
4. Manual cleanup (if needed):
- Fix barrel conflicts by moving vias 0.1-0.2mm
- Typically 30-60 minutes for large boards
FAQ
Q: Can I close my laptop while routing?
A: Yes, if using tmux! Routing continues on the cloud.
Q: How do I know when it's done?
A: Check tmux session or log files. Or set up email notification (advanced).
Q: What if I run out of money mid-routing?
A: Vast.ai stops instance, routing lost. Add credits before starting.
Q: Can I pause and resume?
A: Not currently. Checkpointing is a planned feature but not implemented.
Q: GPU seems idle during routing?
A: Check nvidia-smi. If 0%, CuPy isn't working. Use --cpu-only as fallback.
Q: Can I route multiple boards in parallel?
A: Yes, if enough VRAM. 2 small boards on 1 GPU works. Large boards need dedicated GPU.
Last Updated: November 15, 2025
Tested On: Vast.ai, RunPod, Lambda Labs
GPU Tested: RTX 4090, RTX 6000 Ada, A100 80GB
Status: Production-ready