mirror of
https://github.com/bbenchoff/OrthoRoute.git
synced 2026-01-11 14:01:27 +00:00
600 lines
15 KiB
Markdown
600 lines
15 KiB
Markdown
# PathFinder Parameter Tuning Guide
|
||
|
||
This guide explains all the tunable parameters in OrthoRoute's PathFinder algorithm and how to adjust them for different board types and convergence issues.
|
||
|
||
## Quick Reference Table
|
||
|
||
| Parameter | Default | Range | Impact |
|
||
|-----------|---------|-------|--------|
|
||
| `pres_fac_mult` | 1.10-1.35 | 1.05-2.0 | How fast overused edges become expensive |
|
||
| `pres_fac_max` | 6.0-16.0 | 4.0-64.0 | Maximum pressure on overused edges |
|
||
| `hist_gain` | 0.15-0.20 | 0.05-0.35 | How fast history accumulates |
|
||
| `hist_cost_weight` | 10.0-16.0 | 5.0-25.0 | Relative importance of history vs present |
|
||
| `via_cost` | 0.5-0.7 | 0.3-2.0 | Penalty for using vias |
|
||
| `max_iterations` | 40-100 | 20-200 | When to give up |
|
||
| `BARREL_PHASE_1_ITERS` | 10-50 | 0-100 | When barrel penalties stop |
|
||
|
||
---
|
||
|
||
## Core PathFinder Parameters
|
||
|
||
### 1. Present Factor Multiplier (`pres_fac_mult`)
|
||
|
||
**What it does:** Controls how aggressively PathFinder escalates costs on overused edges each iteration.
|
||
|
||
**Formula:**
|
||
```
|
||
pres_fac(iter) = pres_fac_init × (pres_fac_mult)^(iter-1)
|
||
|
||
Example with mult=1.35:
|
||
Iter 1: pres_fac = 1.0
|
||
Iter 5: pres_fac = 3.3
|
||
Iter 10: pres_fac = 18.9
|
||
Iter 15: pres_fac = 107.9 (hits max)
|
||
```
|
||
|
||
**How to tune:**
|
||
|
||
| Value | Board Type | Effect |
|
||
|-------|-----------|---------|
|
||
| 1.05-1.10 | Sparse (ρ < 0.5) | Gentle, slow convergence |
|
||
| 1.15-1.25 | Normal (ρ 0.5-0.8) | Balanced |
|
||
| **1.30-1.40** | **Tight (ρ 0.8-1.0)** | **Aggressive, forces rerouting** |
|
||
| 1.50-2.00 | Over-congested (ρ > 1.0) | Very aggressive (may oscillate) |
|
||
|
||
**When to adjust:**
|
||
- **Diverging overuse?** Increase (try 1.35-1.50)
|
||
- **Oscillating?** Decrease (try 1.10-1.15)
|
||
- **Slow convergence?** Increase slightly
|
||
|
||
---
|
||
|
||
### 2. Present Factor Maximum (`pres_fac_max`)
|
||
|
||
**What it does:** Caps how high `pres_fac` can grow, preventing infinite costs.
|
||
|
||
**Default:** 6.0-16.0 depending on board density
|
||
|
||
**How to tune:**
|
||
|
||
| Value | Use Case |
|
||
|-------|----------|
|
||
| 4.0-6.0 | Sparse boards - don't need extreme pressure |
|
||
| 8.0-16.0 | Tight boards - need strong rerouting signals |
|
||
| 32.0-64.0 | Desperate - board barely routable |
|
||
|
||
**Trade-off:**
|
||
- **Too low:** PathFinder gives up early, accepts overuse
|
||
- **Too high:** Routes become extreme detours, slow convergence
|
||
|
||
**Rule of thumb:** Set to `pres_fac_mult^N` where N = expected iterations to convergence
|
||
|
||
---
|
||
|
||
### 3. History Gain (`hist_gain`)
|
||
|
||
**What it does:** Controls how fast PathFinder "remembers" overused edges across iterations.
|
||
|
||
**Formula:**
|
||
```
|
||
history[edge] += hist_gain × overuse[edge] (each iteration)
|
||
total_cost[edge] = present[edge] + history[edge]
|
||
```
|
||
|
||
**The balance problem:**
|
||
- **Too HIGH** (> 0.25): History dominates, fights present costs → oscillation
|
||
- **Too LOW** (< 0.10): Present dominates, no learning → repeated mistakes
|
||
- **Just right** (0.12-0.18): Balanced
|
||
|
||
**How to tune:**
|
||
|
||
| Value | Effect | When to Use |
|
||
|-------|--------|-------------|
|
||
| 0.05-0.10 | Minimal memory | Very responsive, but forgets quickly |
|
||
| **0.12-0.18** | **Balanced** | **Most boards** |
|
||
| 0.20-0.25 | Strong memory | Dense boards, but risks oscillation |
|
||
| 0.30-0.35 | Very strong | Rarely needed, often causes problems |
|
||
|
||
**Diagnostic:** Check `hist/pres ratio` in logs:
|
||
```
|
||
hist/pres ratio: 0.994 ← Good (history ~ present)
|
||
hist/pres ratio: 0.05 ← BAD: hist_gain too low!
|
||
hist/pres ratio: 10.0 ← BAD: hist_gain too high!
|
||
```
|
||
|
||
**Target ratio:** 0.5 - 2.0
|
||
|
||
---
|
||
|
||
### 4. History Cost Weight (`hist_cost_weight`)
|
||
|
||
**What it does:** Multiplies history before adding to total cost.
|
||
|
||
**Formula:**
|
||
```
|
||
total_cost = present + (hist_cost_weight × history)
|
||
```
|
||
|
||
**Board-adaptive formula:**
|
||
```python
|
||
hist_cost_weight = 7.0 # Base
|
||
+ (1.0 per 4 layers) # Layer bonus
|
||
+ (6.0 × (ρ - 0.7)) # Congestion bonus
|
||
```
|
||
|
||
**How to tune:**
|
||
- **Higher weight** (15-25): Emphasizes history, good for dense boards
|
||
- **Lower weight** (5-10): Emphasizes present, good for sparse boards
|
||
- **Default (10-15)**: Usually fine, auto-tuned based on ρ
|
||
|
||
**Rule:** If hist/pres ratio is off, adjust `hist_gain` first, not weight.
|
||
|
||
---
|
||
|
||
## Convergence Tuning
|
||
|
||
### 5. Maximum Iterations (`max_iterations`)
|
||
|
||
**What it does:** Stops routing after N iterations even if not converged.
|
||
|
||
**Board-adaptive defaults:**
|
||
```
|
||
ρ < 0.5: 40 iterations (converges fast)
|
||
ρ 0.5-0.8: 60 iterations (moderate)
|
||
ρ 0.8-1.0: 100 iterations (tight, needs time)
|
||
ρ > 1.0: 150+ iterations (may never converge)
|
||
```
|
||
|
||
**How to tune:**
|
||
- Check when test board converges (e.g., 75 iterations)
|
||
- Set max to 1.5× that value (safety margin)
|
||
- For production: Lower to save time if partial routing acceptable
|
||
|
||
---
|
||
|
||
### 6. Stagnation Patience (`stagnation_patience`)
|
||
|
||
**What it does:** Stops early if overuse doesn't improve for N iterations.
|
||
|
||
**Default:** 5 iterations
|
||
|
||
**Trade-off:**
|
||
- **Too low** (2-3): May quit before convergence
|
||
- **Too high** (10+): Wastes time on hopeless boards
|
||
|
||
**Example:**
|
||
```
|
||
If overuse is: 1000 → 980 → 975 → 977 → 978 → 979
|
||
After 5 iterations without significant improvement, stop.
|
||
```
|
||
|
||
---
|
||
|
||
## Via Parameters
|
||
|
||
### 7. Via Cost (`via_cost`)
|
||
|
||
**What it does:** Base penalty for using a via (layer change).
|
||
|
||
**Default:** 0.5-0.7 (half a grid step)
|
||
|
||
**How to tune:**
|
||
|
||
| Value | Effect |
|
||
|-------|--------|
|
||
| 0.3-0.4 | Encourage vias (more layer changes) |
|
||
| 0.5-0.7 | Balanced |
|
||
| 0.8-1.5 | Discourage vias (prefer planar routing) |
|
||
| 2.0+ | Strongly avoid vias (may cause congestion) |
|
||
|
||
**When to adjust:**
|
||
- **Too many vias?** Increase to 0.8-1.0
|
||
- **Routes not using layers?** Decrease to 0.4-0.5
|
||
- **Via barrel conflicts high?** Increase (makes vias less attractive)
|
||
|
||
---
|
||
|
||
## Barrel Conflict Parameters
|
||
|
||
### 8. Barrel Phase 1 Duration (`BARREL_PHASE_1_ITERS`)
|
||
|
||
**What it does:** How many iterations to apply barrel conflict penalties before disabling them.
|
||
|
||
**Location:** `unified_pathfinder.py` line ~3757
|
||
|
||
**How to tune:**
|
||
|
||
| Iterations | Board Size | Rationale |
|
||
|------------|-----------|-----------|
|
||
| 0 | Any | Disable barrel penalties entirely |
|
||
| 5-10 | Large (8k+ nets) | Get to Phase 2 fast |
|
||
| 20-30 | Medium (2k-5k nets) | Balanced approach |
|
||
| 40-50 | Small (< 1k nets) | Can afford longer Phase 1 |
|
||
|
||
**Phase 2 (penalties OFF) allows PathFinder to optimize without barrel constraint fighting convergence.**
|
||
|
||
**From today's experience:**
|
||
- Test board (512 nets): Phase 1 = 50 iterations worked perfectly
|
||
- Large board (8,192 nets): Phase 1 = 10 iterations better (less divergence)
|
||
|
||
---
|
||
|
||
### 9. Barrel Conflict Penalty Strength
|
||
|
||
**What it does:** How much to penalize edges passing through other nets' via barrels.
|
||
|
||
**Location:** `unified_pathfinder.py` line ~3765
|
||
|
||
**Formula:**
|
||
```python
|
||
conflict_penalty = min(multiplier × pres_fac, cap)
|
||
```
|
||
|
||
**Current settings:**
|
||
```python
|
||
# Phase 1
|
||
multiplier = 5.0 # Scale with pres_fac
|
||
cap = 50.0 # Maximum penalty
|
||
```
|
||
|
||
**How to tune:**
|
||
|
||
| Multiplier | Cap | Effect |
|
||
|------------|-----|--------|
|
||
| 2.0-5.0 | 30-50 | Light (large boards) |
|
||
| 10.0 | 100 | Medium (default) |
|
||
| 20.0 | 200+ | Aggressive (small boards) |
|
||
|
||
**Trade-off:**
|
||
- **Too strong:** Forces divergence (routes avoid barrels at all costs)
|
||
- **Too weak:** Doesn't prevent barrel conflicts
|
||
|
||
**Diagnostic:** If overuse grows in Phase 1, reduce multiplier/cap.
|
||
|
||
---
|
||
|
||
## Advanced Tuning
|
||
|
||
### 10. Layer Bias Alpha (`layer_bias` alpha)
|
||
|
||
**What it does:** Adjusts costs to balance load across layers dynamically.
|
||
|
||
**Location:** `_compute_layer_bias()` - alpha parameter
|
||
|
||
**Default:** 0.88 (strong smoothing)
|
||
|
||
**Range:** 0.5-0.95
|
||
|
||
- **Lower (0.5-0.7):** Quickly shifts routes to cooler layers
|
||
- **Higher (0.9-0.95):** Gradual shifts, prevents whiplash
|
||
|
||
---
|
||
|
||
### 11. ROI Extraction Threshold
|
||
|
||
**What it does:** Distance threshold for using focused ROI vs full-graph routing.
|
||
|
||
**Location:** `_route_all()` line ~4470
|
||
|
||
**Default:** 125 steps (50mm @ 0.4mm grid)
|
||
|
||
```python
|
||
ROI_THRESHOLD_STEPS = 125
|
||
```
|
||
|
||
**How to tune:**
|
||
- **Smaller (50-100):** More nets use ROI = faster but may fail long routes
|
||
- **Larger (150-200):** More nets use full graph = slower but more reliable
|
||
|
||
---
|
||
|
||
## Diagnostic Tuning Workflow
|
||
|
||
### Step 1: Check Congestion Ratio
|
||
|
||
```
|
||
ρ < 1.0 → Board is routable, tune for speed
|
||
ρ > 1.0 → Board is impossible, add layers FIRST
|
||
```
|
||
|
||
### Step 2: Run 10-20 Iterations and Check Trend
|
||
|
||
**If overuse is DECREASING:**
|
||
✅ Parameters are good, let it run
|
||
|
||
**If overuse is INCREASING:**
|
||
❌ Too gentle - increase `pres_fac_mult` to 1.3-1.5
|
||
|
||
**If overuse OSCILLATES (up/down/up/down):**
|
||
⚠️ Balance issue:
|
||
- Reduce `hist_gain` (try 0.12-0.15)
|
||
- Reduce barrel penalty if in Phase 1
|
||
|
||
### Step 3: Monitor Iteration Timing
|
||
|
||
With all GPU optimizations (from today):
|
||
|
||
| Board Size | Iter 1 | Iter 2+ | What's Slow? |
|
||
|------------|--------|---------|--------------|
|
||
| 512 nets | 2 min | 20 sec | Normal |
|
||
| 8,192 nets | 100 min | 2-3 min | Iteration 1 is inherently slow |
|
||
|
||
**If iterations 2+ are slow (>10 min):**
|
||
- Check if GPU optimizations are active
|
||
- Look for "BARREL-CONFLICT.*minutes" in logs
|
||
- Ensure Python cache is cleared
|
||
|
||
---
|
||
|
||
## Parameter Sets for Common Scenarios
|
||
|
||
### Scenario 1: Sparse Board (ρ < 0.5)
|
||
|
||
**Goal:** Fast routing
|
||
|
||
```python
|
||
pres_fac_mult = 1.15
|
||
pres_fac_max = 6.0
|
||
hist_gain = 0.12
|
||
max_iterations = 40
|
||
BARREL_PHASE_1_ITERS = 30
|
||
```
|
||
|
||
### Scenario 2: Normal Board (ρ 0.5-0.8)
|
||
|
||
**Goal:** Reliable convergence
|
||
|
||
```python
|
||
pres_fac_mult = 1.20
|
||
pres_fac_max = 10.0
|
||
hist_gain = 0.15
|
||
max_iterations = 60
|
||
BARREL_PHASE_1_ITERS = 30
|
||
```
|
||
|
||
### Scenario 3: Tight Board (ρ 0.8-1.0) ← YOUR BOARD
|
||
|
||
**Goal:** Force convergence without divergence
|
||
|
||
```python
|
||
pres_fac_mult = 1.35 # Aggressive!
|
||
pres_fac_max = 16.0 # High ceiling
|
||
hist_gain = 0.15 # Balanced
|
||
max_iterations = 100
|
||
BARREL_PHASE_1_ITERS = 10 # Short Phase 1
|
||
barrel_penalty_mult = 5.0 # Light penalties
|
||
barrel_penalty_cap = 50.0
|
||
```
|
||
|
||
### Scenario 4: Over-Congested (ρ > 1.0)
|
||
|
||
**Goal:** Add layers first, then tune
|
||
|
||
```
|
||
⚠️ STOP! Add layers until ρ < 1.0
|
||
No amount of parameter tuning will make impossible boards routable.
|
||
```
|
||
|
||
---
|
||
|
||
## Two-Phase Barrel Conflict Strategy
|
||
|
||
**Phase 1:** Apply barrel conflict penalties
|
||
- Reduces barrel conflicts quickly
|
||
- May increase overall congestion temporarily
|
||
- **Keep short** (10-20 iterations for large boards)
|
||
|
||
**Phase 2:** Remove penalties, let PathFinder optimize
|
||
- Converges to minimal overuse
|
||
- Barrel conflicts may increase slightly but stay manageable
|
||
- **Where actual convergence happens**
|
||
|
||
**Example from test board (512 nets):**
|
||
```
|
||
Phase 1 (iters 1-50):
|
||
Barrel conflicts: 2,611 → 290 (89% reduction!)
|
||
|
||
Phase 2 (iters 51-75):
|
||
Barrel conflicts: stable at ~300
|
||
Overall overuse: 25,000 → 0 (CONVERGED!)
|
||
```
|
||
|
||
**Example from large board (8,192 nets):**
|
||
```
|
||
Phase 1 (iters 1-10): Short to avoid divergence
|
||
Phase 2 (iters 11+): Where convergence should happen
|
||
```
|
||
|
||
---
|
||
|
||
## Advanced: History vs Present Balance
|
||
|
||
The **hist/pres ratio** in logs shows the balance:
|
||
|
||
```
|
||
[HISTORY-DEBUG] Iter 10:
|
||
hist/pres ratio: 0.994 ← GOOD
|
||
```
|
||
|
||
**Target: 0.5 - 2.0**
|
||
|
||
**If ratio < 0.1** (history too weak):
|
||
- Increase `hist_gain` (try +0.05)
|
||
- OR increase `hist_cost_weight` (try +3.0)
|
||
|
||
**If ratio > 5.0** (history too strong):
|
||
- Decrease `hist_gain` (try -0.05)
|
||
- OR decrease `hist_cost_weight` (try -3.0)
|
||
|
||
**If ratio oscillates wildly:**
|
||
- History and present are fighting
|
||
- Reduce `hist_gain` to 0.10-0.12
|
||
- May need to reduce `pres_fac_mult` slightly
|
||
|
||
---
|
||
|
||
## GPU vs CPU Performance Tuning
|
||
|
||
### Memory vs Speed Trade-offs
|
||
|
||
**22 layers on 8,192-net board:**
|
||
- 181M edges
|
||
- ~10 GB GPU memory needed
|
||
- If GPU OOM: Reduce layers or use CPU fallback
|
||
|
||
**CPU fallback implications:**
|
||
- Routing still works
|
||
- ~2-3× slower
|
||
- Via pooling penalties disabled (minor)
|
||
|
||
### When GPU Memory is Tight
|
||
|
||
1. **Reduce layers** (20 → 18 saves ~2 GB)
|
||
2. **Disable via pooling** (already done for >100M edges)
|
||
3. **Use smaller grid pitch** (0.4mm → 0.5mm = fewer nodes)
|
||
|
||
---
|
||
|
||
## Common Issues and Fixes
|
||
|
||
### Issue: Overuse Growing Each Iteration
|
||
|
||
**Symptoms:**
|
||
```
|
||
Iter 1: 2.4M overuse
|
||
Iter 5: 4.4M overuse (getting worse!)
|
||
```
|
||
|
||
**Fixes:**
|
||
1. **Increase `pres_fac_mult`** to 1.35-1.50
|
||
2. **Reduce barrel penalty** if in Phase 1
|
||
3. Check if ρ > 1.0 (may be impossible to route)
|
||
|
||
---
|
||
|
||
### Issue: Oscillation (Overuse Ping-Pongs)
|
||
|
||
**Symptoms:**
|
||
```
|
||
Iter 10: 5,000
|
||
Iter 11: 2,000
|
||
Iter 12: 6,000
|
||
Iter 13: 1,500
|
||
(never settles)
|
||
```
|
||
|
||
**Fixes:**
|
||
1. **Reduce `hist_gain`** to 0.10-0.12
|
||
2. **Reduce `pres_fac_mult`** to 1.15-1.20
|
||
3. Check if in Phase 1 - may need to reach Phase 2
|
||
|
||
---
|
||
|
||
### Issue: Slow Iterations (>10 min for iter 2+)
|
||
|
||
**Symptoms:**
|
||
- Iteration 2+ taking 30-60 minutes
|
||
- Should be 2-5 minutes with GPU
|
||
|
||
**Fixes:**
|
||
1. Clear Python cache: `find . -name "*.pyc" -delete`
|
||
2. Check GPU optimizations are active (look for "GPU" in logs)
|
||
3. Restart Python (clears memory pools)
|
||
4. Check barrel conflict timing - should be <1 second
|
||
|
||
---
|
||
|
||
## Testing Your Changes
|
||
|
||
After tuning parameters:
|
||
|
||
1. **Test on small board first** (512 nets, 18 layers)
|
||
- Should converge in < 2 hours
|
||
- Validates parameters work
|
||
|
||
2. **Then try large board** (8,192 nets)
|
||
- Expect ~4-8 hours
|
||
- Monitor first 10 iterations
|
||
|
||
3. **Check convergence trend:**
|
||
```bash
|
||
grep "\[ITER.*\] routed" logs/latest.log
|
||
```
|
||
|
||
Look for **decreasing overuse**:
|
||
```
|
||
Iter 1: 2,451,307
|
||
Iter 5: 1,800,000 ✓ (decreasing = good!)
|
||
Iter 10: 950,000 ✓
|
||
```
|
||
|
||
---
|
||
|
||
## Emergency: Board Won't Converge
|
||
|
||
If you've tried everything and it still won't converge:
|
||
|
||
### Option 1: Check Physical Constraints
|
||
```bash
|
||
grep "Congestion ratio" logs/latest.log
|
||
```
|
||
|
||
If ρ > 1.0: **Add more layers** (not a parameter problem!)
|
||
|
||
### Option 2: Disable Barrel Penalties Entirely
|
||
```python
|
||
BARREL_PHASE_1_ITERS = 0 # Skip Phase 1 completely
|
||
```
|
||
|
||
This may leave barrel conflicts but allows other routing to converge.
|
||
|
||
### Option 3: Reduce Grid Pitch
|
||
Coarser grid = fewer nodes = easier routing:
|
||
```
|
||
grid_pitch: 0.4mm → 0.5mm or 0.6mm
|
||
```
|
||
|
||
Trade-off: Less routing precision.
|
||
|
||
---
|
||
|
||
## File Locations for Parameter Changes
|
||
|
||
### Automatic (board-adaptive):
|
||
- `orthoroute/algorithms/manhattan/parameter_derivation.py`
|
||
- Lines 70-95: pres_fac_mult by congestion
|
||
- Lines 128-130: hist_gain
|
||
- Lines 120-126: hist_cost_weight
|
||
|
||
### Manual overrides:
|
||
- `orthoroute/algorithms/manhattan/unified_pathfinder.py`
|
||
- Line 3757: BARREL_PHASE_1_ITERS
|
||
- Line 3765: Barrel penalty formula
|
||
|
||
### Config file (future):
|
||
- Could add to `orthoroute.json` for per-project tuning
|
||
|
||
---
|
||
|
||
## References
|
||
|
||
- McMurchie & Ebeling, "PathFinder: A Negotiation-Based Performance-Driven Router for FPGAs" (1995)
|
||
- Betz & Rose, "VPR: A New Packing, Placement and Routing Tool for FPGA Research" (1997)
|
||
- OrthoRoute extends these FPGA concepts to heterogeneous PCB routing
|
||
|
||
---
|
||
|
||
**Pro Tip:** Document your parameter changes in git commits. It's easy to forget what you tried!
|
||
|
||
**Example commit message:**
|
||
```
|
||
Tune for 8k-net board convergence
|
||
|
||
- pres_fac_mult: 1.10 → 1.35 (aggressive rerouting)
|
||
- hist_gain: 0.20 → 0.15 (reduce history dominance)
|
||
- BARREL_PHASE_1_ITERS: 50 → 10 (faster to Phase 2)
|
||
|
||
Board: ρ=0.915, 22 layers, 8,192 nets
|
||
Expected: Convergence in ~50-75 iterations
|
||
```
|