Performance Troubleshooting

This guide helps identify and resolve performance issues with Bifrost Proxy.

Before troubleshooting, gather baseline performance data.

# Single request timing
time curl -x http://localhost:7080 https://httpbin.org/ip -o /dev/null -s
# Detailed timing breakdown
curl -x http://localhost:7080 https://httpbin.org/ip -o /dev/null -s -w \
"DNS: %{time_namelookup}s\nConnect: %{time_connect}s\nTLS: %{time_appconnect}s\nStart: %{time_starttransfer}s\nTotal: %{time_total}s\n"
# Multiple requests average
for i in {1..10}; do
curl -x http://localhost:7080 https://httpbin.org/ip -o /dev/null -s -w "%{time_total}\n"
done | awk '{sum+=$1} END {print "Average:", sum/NR, "seconds"}'
# Overall statistics
curl -s http://localhost:7082/api/v1/stats | jq
# Backend latency
curl -s http://localhost:7082/api/v1/backends | jq '.[].stats'
# Active connections
curl -s http://localhost:7082/api/v1/stats | jq '.active_connections'
# CPU and memory usage (pgrep -d, joins multiple PIDs with commas, as top -p expects)
top -p "$(pgrep -d, bifrost-server)"
# Detailed process stats ([b]ifrost keeps grep from matching its own process)
ps aux | grep '[b]ifrost'
# Memory usage (RSS) over time
watch -n 5 'ps aux | grep "[b]ifrost" | awk "{print \$6/1024\" MB\"}"'
# Goroutine count (if metrics enabled)
curl -s http://localhost:7090/metrics | grep bifrost_goroutines
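
The averaging loop above reports only a mean, which can hide outliers. As a minimal sketch, the helper below reads one `%{time_total}` value per line and also reports the minimum and maximum. The name `summarize_timings` is hypothetical and not part of Bifrost.

```shell
# summarize_timings (hypothetical helper): read one duration in seconds
# per line on stdin and print the sample count, minimum, average, and maximum.
summarize_timings() {
  awk '
    NF == 0 { next }                       # skip blank lines
    {
      v = $1 + 0
      if (n == 0 || v < min) min = v
      if (n == 0 || v > max) max = v
      sum += v
      n++
    }
    END {
      if (n == 0) { print "no samples"; exit 1 }
      printf "n=%d min=%.3f avg=%.3f max=%.3f\n", n, min, sum / n, max
    }'
}
```

To use it, pipe the ten-request loop into `summarize_timings` in place of the `awk` one-liner.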

Symptoms: Requests take longer than expected to complete.

Cause: The upstream backend is slow to respond.

Diagnosis:

# Compare proxy vs direct timing
echo "Via proxy:"
time curl -x http://localhost:7080 https://example.com -o /dev/null -s
echo "Direct:"
time curl https://example.com -o /dev/null -s
# Check backend health and latency
curl -s http://localhost:7082/api/v1/backends | jq '.[] | {name, healthy, latency: .stats.avg_latency_ms}'
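
To put a number on the proxy-versus-direct comparison, the two totals can be subtracted. This is a sketch that assumes both inputs are `%{time_total}` values from curl, in seconds. `proxy_overhead` is a hypothetical helper name.

```shell
# proxy_overhead DIRECT PROXIED (hypothetical helper): print the extra time
# spent when going through the proxy, in seconds and as a percentage of the
# direct request time.
proxy_overhead() {
  awk -v d="$1" -v p="$2" 'BEGIN {
    if (d + 0 <= 0) { print "direct time must be positive"; exit 1 }
    printf "overhead=%.3fs (%.0f%%)\n", p - d, (p - d) / d * 100
  }'
}
```

A single pair of requests is noisy, so it may be worth averaging several runs of each before comparing.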

Solution:

  1. Switch to a faster backend:

    routes:
      - domains: ["*"]
        backends:
          - fast-backend
  2. Use load balancing with health checks:

    routes:
      - domains: ["*"]
        backends:
          - backend1
          - backend2
        load_balance: least_conn # Route to the backend with the fewest active connections
  3. Enable caching for repeated requests:

    cache:
      enabled: true
      memory:
        max_size: "256MB"

Cause: DNS lookups are adding latency.

Diagnosis:

# Measure DNS lookup time
time nslookup example.com
# Check DNS timing in curl
curl -x http://localhost:7080 https://example.com -o /dev/null -s -w "DNS: %{time_namelookup}s\n"

Solution:

  1. Use faster DNS servers:

    backends:
      - name: wg-vpn
        type: wireguard
        config:
          dns:
            - "1.1.1.1" # Cloudflare (typically fast)
            - "8.8.8.8" # Google
  2. Enable DNS caching:

    vpn:
      dns:
        enabled: true
        cache_ttl: "5m"

Cause: Each request opens a new connection instead of reusing an existing one.

Diagnosis:

# Check if keep-alive is working
curl -x http://localhost:7080 https://example.com -v 2>&1 | grep -i keep-alive
# Check connection reuse
curl -x http://localhost:7080 https://example.com https://example.com/path -v 2>&1 | grep "Re-using"

Solution:

Enable and tune keep-alive settings:

server:
  http:
    idle_timeout: "120s" # Keep connections alive longer
    max_idle_conns_per_host: 100

Cause: TLS negotiation adds latency to each new connection.

Diagnosis:

# Time from request start until the TLS handshake completes (cumulative, not the handshake alone)
curl -x http://localhost:7080 https://example.com -o /dev/null -s -w "TLS complete: %{time_appconnect}s\n"

Solution:

  1. Enable connection keep-alive (reduces handshakes)
  2. Use TLS session resumption (automatic in Go)
  3. Consider HTTP/2 for multiplexed connections
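
curl's `time_appconnect` is measured from the start of the request, so the handshake cost itself is the difference between `time_appconnect` and `time_connect`. The sketch below does that subtraction. `tls_handshake_time` is a hypothetical helper name. When the request goes through a proxy, `time_connect` covers the connection to the proxy, so the difference also includes setting up the tunnel.

```shell
# tls_handshake_time CONNECT APPCONNECT (hypothetical helper): both arguments
# are cumulative curl timings in seconds (%{time_connect} and
# %{time_appconnect}); the result is the time spent between them.
tls_handshake_time() {
  awk -v c="$1" -v a="$2" 'BEGIN { printf "%.3f\n", a - c }'
}

# Example invocation (needs network access, so it is left commented out):
#   tls_handshake_time $(curl -x http://localhost:7080 https://example.com \
#     -o /dev/null -s -w "%{time_connect} %{time_appconnect}")
```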

Symptoms: The Bifrost process consumes excessive CPU.

Cause: A high connection count requires more processing.

Diagnosis:

# Check connection count
curl -s http://localhost:7082/api/v1/stats | jq '.active_connections'
# Check the request counter (sample it twice and divide the difference by the interval to get requests per second)
curl -s http://localhost:7090/metrics | grep bifrost_requests_total
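
`bifrost_requests_total` is a running total, so a rate has to be derived from two samples. The following is a minimal sketch under two assumptions: the endpoint emits Prometheus text format without timestamps, and the counter may be split across label sets. `sum_counter` and `rate_per_second` are hypothetical helper names.

```shell
# sum_counter METRIC (hypothetical helper): read Prometheus text exposition
# on stdin and sum every sample of METRIC across its label sets.
# Assumes the value is the last field on the line (no trailing timestamp).
sum_counter() {
  awk -v m="$1" '
    /^#/ { next }                          # skip HELP/TYPE comment lines
    {
      name = $1
      sub(/[{].*/, "", name)               # drop the {label="..."} part
      if (name == m) total += $NF
    }
    END { printf "%d\n", total }'
}

# rate_per_second PREV CUR INTERVAL_SECONDS (hypothetical helper)
rate_per_second() {
  awk -v p="$1" -v c="$2" -v s="$3" 'BEGIN { printf "%.1f\n", (c - p) / s }'
}

# Example invocation (needs a running instance, so it is left commented out):
#   a=$(curl -s http://localhost:7090/metrics | sum_counter bifrost_requests_total)
#   sleep 10
#   b=$(curl -s http://localhost:7090/metrics | sum_counter bifrost_requests_total)
#   rate_per_second "$a" "$b" 10
```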

Solution:

  1. Implement rate limiting:

    rate_limit:
      enabled: true
      requests_per_second: 100
      burst: 200
  2. Use connection limits:

    server:
      http:
        max_connections: 10000

Cause: Debug logging can be CPU-intensive.

Diagnosis:

# Check current log level
grep -i "level" /etc/bifrost/config.yaml
# Monitor log output rate
tail -f /var/log/bifrost/server.log | pv -l > /dev/null

Solution:

Reduce log level in production:

logging:
  level: warn # Or 'error' for minimal logging
  format: json # More efficient than text

Cause: Routing rules use complex regex patterns that are expensive to evaluate.

Diagnosis:

Check for regex patterns in routes:

# Potentially expensive
routes:
  - domains: ["*.complex-pattern-.*\\.example\\.com"]

Solution:

Simplify routing patterns:

# More efficient
routes:
  - domains: ["*.example.com"]

Cause: WireGuard or other encryption is consuming CPU.

Diagnosis:

# Check backend-specific CPU usage
# Compare latency with encryption vs direct
# Test direct backend
curl -x http://localhost:7080 -H "X-Backend: direct" https://example.com -o /dev/null -s -w "%{time_total}\n"

Solution:

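  1. Use hardware-accelerated encryption if available
  2. For AES-based ciphers (for example TLS with AES-GCM), prefer CPUs with AES-NI support; WireGuard uses ChaCha20-Poly1305, which AES-NI does not accelerate
  3. For high-throughput scenarios, consider the direct backend for non-sensitive traffic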

Symptoms: Memory consumption grows over time or is excessive.

Cause: A large request log consumes memory.

Diagnosis:

# Check request log size
curl -s http://localhost:7082/api/v1/requests | jq 'length'

Solution:

api:
  request_log_size: 500 # Reduce from default 1000
  # Or disable entirely
  enable_request_log: false

Cause: The in-memory cache is consuming too much RAM.

Diagnosis:

# Check cache statistics
curl -s http://localhost:7082/api/v1/cache/stats | jq
# Check via metrics
curl -s http://localhost:7090/metrics | grep bifrost_cache

Solution:

Limit cache memory usage:

cache:
  memory:
    max_size: "128MB" # Limit memory cache
    max_items: 10000
  # Use disk cache for larger storage
  disk:
    enabled: true
    path: "/var/cache/bifrost"
    max_size: "1GB"

Cause: Connection pools are growing without bound.

Solution:

Configure connection pool limits:

server:
  http:
    max_idle_conns: 100
    max_idle_conns_per_host: 10
    idle_conn_timeout: "90s"

Cause: Memory grows gradually and is never released.

Diagnosis:

# Monitor memory over time (strftime is a gawk extension and may be missing in other awk implementations)
while true; do
  ps aux | grep '[b]ifrost' | awk '{print strftime("%H:%M:%S"), $6/1024, "MB"}'
  sleep 60
done
# Check goroutine count for leaks
curl -s http://localhost:7090/metrics | grep bifrost_goroutines

Solution:

  1. Restart the service as a temporary fix
  2. Upgrade to the latest version, which may already include a fix
  3. Report the issue and attach memory profiles
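
A growth rate can make a suspected leak easier to describe in a report than raw samples. The sketch below assumes a log of `epoch_seconds rss_kb` pairs, one per line, and derives the rate from the first and last sample. The helper name `rss_growth_per_hour` and the log format are illustrative choices, not something Bifrost produces.

```shell
# rss_growth_per_hour (hypothetical helper): read "epoch_seconds rss_kb"
# lines on stdin and print the average RSS growth between the first and
# last sample in MB per hour.
rss_growth_per_hour() {
  awk '
    NF < 2 { next }                        # ignore blank or malformed lines
    n == 0 { t0 = $1; r0 = $2 }            # remember the first sample
    { t1 = $1; r1 = $2; n++ }              # track the most recent sample
    END {
      if (n < 2 || t1 == t0) { print "need at least two samples"; exit 1 }
      printf "%.1f MB/hour\n", ((r1 - r0) / 1024) / ((t1 - t0) / 3600)
    }'
}

# One way to collect samples (left commented out; assumes a single
# bifrost-server process):
#   while true; do
#     echo "$(date +%s) $(ps -o rss= -p "$(pgrep -o bifrost-server)")"
#     sleep 60
#   done >> rss.log
#   rss_growth_per_hour < rss.log
```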

Symptoms: Network throughput is lower than expected.

Cause: Packets are being fragmented, which reduces throughput.

Diagnosis:

# Test with different packet sizes (-M do forbids fragmentation; -c 3 stops after three probes)
ping -c 3 -M do -s 1400 example.com
ping -c 3 -M do -s 1200 example.com
# Check for fragmentation in netstat
netstat -s | grep -i fragment

Solution:

# For WireGuard
backends:
  - name: wg-vpn
    type: wireguard
    config:
      mtu: 1280 # Conservative value
# For VPN mode
vpn:
  mtu: 1280
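
The arithmetic behind these values can be sketched as follows. The `-s` argument to `ping` is the ICMP payload only, so the packet on the wire is 28 bytes larger over IPv4 (a 20-byte IP header plus an 8-byte ICMP header). WireGuard adds 60 bytes per packet over IPv4 and 80 over IPv6. Both helper names are hypothetical.

```shell
# ping_payload_for_mtu MTU (hypothetical helper): largest `ping -s` value
# that fits in a single unfragmented IPv4 packet of the given MTU.
ping_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# wg_inner_mtu OUTER_MTU [4|6] (hypothetical helper): MTU left for traffic
# inside a WireGuard tunnel whose outer transport is IPv4 (default) or IPv6.
# Overhead: IP header (20 or 40) + UDP header (8) + WireGuard framing (32).
wg_inner_mtu() {
  if [ "${2:-4}" = "6" ]; then
    echo $(( $1 - 80 ))
  else
    echo $(( $1 - 60 ))
  fi
}
```

For example, `ping -M do -s 1400` sends a 1428-byte packet, and `ping_payload_for_mtu 1280` gives the payload size to use when checking whether a 1280-byte MTU passes without fragmentation.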

Cause: Network buffers are limiting throughput.

Solution:

Increase system buffer sizes:

# Linux
sudo sysctl -w net.core.rmem_max=26214400
sudo sysctl -w net.core.wmem_max=26214400
sudo sysctl -w net.ipv4.tcp_rmem="4096 87380 26214400"
sudo sysctl -w net.ipv4.tcp_wmem="4096 65536 26214400"
# Make permanent in /etc/sysctl.conf
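
One common way to judge whether a buffer limit is large enough is the bandwidth-delay product: a single TCP connection needs roughly bandwidth times round-trip time of buffer to keep the link full. The sketch below computes it for integer inputs. `bdp_bytes` is a hypothetical helper name, and the link speeds in the note are illustrative.

```shell
# bdp_bytes MBIT_PER_SEC RTT_MS (hypothetical helper): bandwidth-delay
# product in bytes. 1 Mbit/s is 125000 bytes/s and 1 ms is 1/1000 s,
# so the product simplifies to mbit * ms * 125.
bdp_bytes() {
  echo $(( $1 * $2 * 125 ))
}
```

For instance, `bdp_bytes 1000 100` (1 Gbit/s at 100 ms) gives 12500000 bytes, which fits within the 26214400-byte (25 MiB) maximum set above.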

Cause: The backend connection has limited bandwidth.

Diagnosis:

# Speed test through proxy (the URL is quoted so the shell does not treat ? as a glob)
curl -x http://localhost:7080 "https://speed.cloudflare.com/__down?bytes=100000000" -o /dev/null -s -w "Speed: %{speed_download} bytes/sec\n"
# Compare direct
curl "https://speed.cloudflare.com/__down?bytes=100000000" -o /dev/null -s -w "Speed: %{speed_download} bytes/sec\n"

Solution:

Use multiple backends with load balancing for higher aggregate throughput.
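
curl reports `speed_download` in bytes per second, while link capacity is usually quoted in megabits per second. A small conversion sketch follows. `to_mbps` is a hypothetical helper name.

```shell
# to_mbps BYTES_PER_SEC (hypothetical helper): convert curl's
# %{speed_download} value to megabits per second (1 Mbit = 1,000,000 bits).
to_mbps() {
  awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 8 / 1000000 }'
}
```

For example, a reported speed of 12500000 bytes/sec corresponds to 100.0 Mbit/s.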


# Increase file descriptor limits (affects only the current shell and processes started from it)
ulimit -n 65536
# Optimize TCP settings
sudo sysctl -w net.core.somaxconn=65535
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=65535
sudo sysctl -w net.ipv4.tcp_fin_timeout=30
sudo sysctl -w net.ipv4.tcp_tw_reuse=1
# Increase buffer sizes
sudo sysctl -w net.core.rmem_max=26214400
sudo sysctl -w net.core.wmem_max=26214400

# Optimized production configuration
logging:
  level: warn
  format: json
server:
  http:
    listen: ":7080"
    read_timeout: "30s"
    write_timeout: "30s"
    idle_timeout: "120s"
    max_connections: 50000
api:
  enable_request_log: false # Disable for performance
cache:
  enabled: true
  memory:
    max_size: "256MB"
  disk:
    enabled: true
    path: "/var/cache/bifrost"
    max_size: "2GB"

metrics:
  enabled: true
  listen: ":7090"
# Key metrics to watch:
# - bifrost_request_duration_seconds
# - bifrost_connections_active
# - bifrost_backend_latency_seconds
# - bifrost_goroutines
# - bifrost_memory_bytes

| Metric | Healthy Value | Warning Threshold |
| --- | --- | --- |
| Request latency (p95) | < 100ms | > 500ms |
| Active connections | < 10,000 | > 50,000 |
| Error rate | < 0.1% | > 1% |
| CPU usage | < 50% | > 80% |
| Memory usage | < 512MB | > 2GB |
| Goroutine count | < 10,000 | > 50,000 |
# Quick performance check
curl -s http://localhost:7090/metrics | grep -E "(bifrost_request_duration|bifrost_connections_active|bifrost_memory)"
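
The thresholds in the table above can be applied mechanically once a value has been extracted. The sketch below is one possible reading: the table does not name the range between the healthy value and the warning threshold, so the label `elevated` is an assumption, as is the helper name `classify`.

```shell
# classify VALUE HEALTHY_BELOW WARN_ABOVE (hypothetical helper): compare a
# numeric reading against a healthy upper bound and a warning lower bound.
# "elevated" is an assumed label for readings that fall between the two.
classify() {
  awk -v v="$1" -v h="$2" -v w="$3" 'BEGIN {
    if (v + 0 < h + 0)      print "healthy"
    else if (v + 0 > w + 0) print "warning"
    else                    print "elevated"
  }'
}
```

For example, `classify 250 100 500` rates a p95 latency of 250 ms against the 100 ms and 500 ms limits from the table.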