Performance Tuning
Overview
The Hyperliquid Trading Agent is designed for startup-scale operations (1-4 engineers) with efficient resource usage and minimal infrastructure overhead. This guide covers performance optimization strategies for signal collection, caching, API usage, and concurrent request handling.
The system uses SQLite for caching (no Redis required), async I/O for concurrent data fetching, and configurable timeouts to balance latency with reliability.
Cache TTL Tuning Guidelines
Understanding TTL Trade-offs
Cache Time-To-Live (TTL) values control how long data remains valid before requiring a fresh fetch. Tuning TTL involves balancing:
Shorter TTL (more frequent updates):
- ✅ Fresher data for better decision accuracy
- ✅ Reduced risk of stale data during volatile markets
- ❌ Higher API call volume and costs
- ❌ Increased latency from more network requests
- ❌ Higher rate limit risk
Longer TTL (less frequent updates):
- ✅ Lower API call volume and costs
- ✅ Reduced latency from cache hits
- ✅ Better rate limit compliance
- ❌ Staler data may reduce decision quality
- ❌ Slower response to market changes
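The cost side of this trade-off is easy to quantify: the worst case for one signal type is one fetch per coin every time the cached entry expires, i.e. 3600/TTL calls per coin per hour. The helper below is hypothetical (not part of the agent) and just makes the arithmetic concrete:

```python
def calls_per_hour(ttl_seconds: int, num_coins: int) -> int:
    """Upper bound on hourly API calls for one continuously polled signal type."""
    return (3600 // ttl_seconds) * num_coins

# A 10-coin watchlist with order_book_ttl = 5 costs up to 7200 calls/hour;
# raising the TTL to 30 seconds cuts that to 1200.
print(calls_per_hour(5, 10))   # 7200
print(calls_per_hour(30, 10))  # 1200
```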
Default TTL Values
The system uses signal-type-appropriate TTL values:
# Fast signals - execution-level data (changes rapidly)
order_book_ttl = 5 # 5 seconds
mid_price_ttl = 10 # 10 seconds
spread_ttl = 5 # 5 seconds
# Medium signals - tactical data (changes moderately)
candles_ttl = 300 # 5 minutes
funding_ttl = 600 # 10 minutes
open_interest_ttl = 600 # 10 minutes
technical_indicators_ttl = 300 # 5 minutes
# Slow signals - macro data (changes slowly)
macro_events_ttl = 3600 # 1 hour
correlation_ttl = 3600 # 1 hour
sentiment_ttl = 1800 # 30 minutes
token_unlocks_ttl = 86400 # 24 hours
Tuning Recommendations by Use Case
High-Frequency Trading (HFT)
For strategies requiring sub-second execution:
# Aggressive freshness - prioritize data quality over cost
order_book_ttl = 2 # 2 seconds
mid_price_ttl = 5 # 5 seconds
candles_ttl = 60 # 1 minute
funding_ttl = 300 # 5 minutes
Trade-offs: 5-10x higher API costs, requires higher rate limits
Medium-Frequency Trading (MFT)
For strategies with 5-15 minute decision cycles:
# Balanced approach - default values work well
order_book_ttl = 10 # 10 seconds
mid_price_ttl = 30 # 30 seconds
candles_ttl = 600 # 10 minutes
funding_ttl = 1800 # 30 minutes
Trade-offs: Moderate API costs, good data freshness
Low-Frequency Trading (LFT)
For strategies with hourly or daily decision cycles:
# Cost-optimized - prioritize cache hits
order_book_ttl = 30 # 30 seconds
mid_price_ttl = 60 # 1 minute
candles_ttl = 1800 # 30 minutes
funding_ttl = 3600 # 1 hour
Trade-offs: Lowest API costs, acceptable staleness for slow strategies
Monitoring Cache Performance
Track cache metrics to optimize TTL values:
from hyperliquid_agent.signals.orchestrator import SignalOrchestrator
orchestrator = SignalOrchestrator(config)
metrics = orchestrator.get_cache_metrics()
print(f"Hit rate: {metrics['hit_rate_percent']}%")
print(f"Total entries: {metrics['total_entries']}")
print(f"Avg age: {metrics['avg_age_seconds']}s")
print(f"Expired entries: {metrics['expired_entries']}")
Target Metrics:
- Hit rate: 60-80% for well-tuned TTL values
- Avg age: Should be < 50% of TTL (indicates good cache utilization)
- Expired entries: < 100 (indicates timely cleanup)
Optimization Guidelines:
- Hit rate < 40%: TTL too short, increase by 50-100%
- Hit rate > 90%: TTL too long, data may be stale, decrease by 30-50%
- Avg age > 80% of TTL: Cache underutilized, consider shorter TTL
- Expired entries > 500: Increase cleanup frequency
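The hit-rate thresholds above can be turned into a small adjustment routine; `suggest_ttl` is a hypothetical helper, not an agent API:

```python
def suggest_ttl(current_ttl: int, hit_rate_percent: float) -> int:
    """Apply the optimization guidelines above to a single TTL value."""
    if hit_rate_percent < 40:
        return round(current_ttl * 1.5)  # too many misses: lengthen by 50%
    if hit_rate_percent > 90:
        return round(current_ttl * 0.7)  # likely stale: shorten by 30%
    return current_ttl                   # in or near the 60-80% sweet spot

print(suggest_ttl(300, 35))  # 450
print(suggest_ttl(300, 95))  # 210
print(suggest_ttl(300, 70))  # 300
```

Running it against each signal type's measured hit rate after a trading session gives a starting point for the next tuning pass.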
Dynamic TTL Adjustment
For advanced users, implement dynamic TTL based on market conditions:
def get_adaptive_ttl(base_ttl: int, volatility: float) -> int:
    """Adjust TTL based on market volatility.

    Args:
        base_ttl: Base TTL in seconds
        volatility: Realized volatility (0.0 to 1.0+)

    Returns:
        Adjusted TTL in seconds
    """
    # High volatility = shorter TTL for fresher data
    if volatility > 0.05:  # > 5% volatility
        return int(base_ttl * 0.5)  # 50% shorter
    elif volatility > 0.03:  # > 3% volatility
        return int(base_ttl * 0.75)  # 25% shorter
    else:
        return base_ttl  # Normal TTL
Cache Cleanup Configuration
Configure periodic cleanup to remove expired entries:
[signals.cache]
cleanup_interval_seconds = 3600 # Run cleanup every hour
vacuum_on_startup = true # Optimize DB on startup
max_size_mb = 100 # Alert if DB exceeds 100MB
Cleanup Frequency Guidelines:
- High-frequency trading: 1800s (30 minutes) - prevents DB bloat
- Medium-frequency trading: 3600s (1 hour) - default, good balance
- Low-frequency trading: 7200s (2 hours) - minimal overhead
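One way to wire these intervals up is a background task that invokes cleanup on a timer. This is a sketch under assumptions: `cache.cleanup_expired()` is an assumed method name, and `max_runs` exists only so the loop can be exercised in tests:

```python
import asyncio

async def run_periodic_cleanup(cache, interval_seconds=3600.0, max_runs=None):
    """Invoke the cache's expired-entry cleanup on a fixed interval.

    `cache.cleanup_expired()` is an assumed method; pass max_runs=None
    to run until the task is cancelled at shutdown.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        cache.cleanup_expired()
        runs += 1
        await asyncio.sleep(interval_seconds)
    return runs
```

In production this would be started once with `asyncio.create_task(run_periodic_cleanup(cache))` and cancelled on shutdown.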
Concurrent Request Optimization
Understanding max_concurrent_requests
The signal orchestrator uses asyncio.gather() to fetch data concurrently, spawning multiple async tasks in parallel. This dramatically reduces latency compared to sequential fetching.
Example: Fetching order books for 10 positions
- Sequential: 10 positions × 200ms = 2000ms total
- Concurrent: max(200ms) = 200ms total (10x faster)
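The speedup is easy to reproduce with mocked fetches; the snippet below stands in `asyncio.sleep` for network latency, so the exact timings are illustrative rather than measured against the real API:

```python
import asyncio
import time

async def mock_fetch(coin: str, latency: float = 0.05) -> str:
    """Stand-in for an order book request with fixed network latency."""
    await asyncio.sleep(latency)
    return coin

async def main() -> None:
    coins = [f"COIN{i}" for i in range(10)]

    # Sequential: latencies add up (~10 x 0.05s)
    start = time.perf_counter()
    for coin in coins:
        await mock_fetch(coin)
    sequential = time.perf_counter() - start

    # Concurrent: latencies overlap (~0.05s total)
    start = time.perf_counter()
    await asyncio.gather(*[mock_fetch(c) for c in coins])
    concurrent = time.perf_counter() - start

    print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")

asyncio.run(main())
```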
Default Concurrency Settings
The system uses Python's asyncio with no hard limit on concurrent tasks:
# Fast loop - concurrent order book fetching
tasks = [self.fetch_order_book(coin) for coin in coins]
results = await asyncio.gather(*tasks, return_exceptions=True)
# Medium loop - concurrent funding + OI + candles
funding_tasks = [self.fetch_funding_history(coin, ...) for coin in coins]
oi_tasks = [self.fetch_open_interest(coin) for coin in coins]
candles_tasks = [self.fetch_candles(coin, ...) for coin in coins]
results = await asyncio.gather(
    *funding_tasks, *oi_tasks, *candles_tasks,
    return_exceptions=True
)
Tuning Guidelines
Conservative (Low Rate Limits)
For accounts with strict rate limits or shared API keys:
# Implement semaphore-based rate limiting
import asyncio
class RateLimitedProvider:
    def __init__(self, max_concurrent: int = 5):
        self.semaphore = asyncio.Semaphore(max_concurrent)

    async def fetch_with_limit(self, fetch_func, *args):
        async with self.semaphore:
            return await fetch_func(*args)

# Usage
provider = RateLimitedProvider(max_concurrent=5)
tasks = [provider.fetch_with_limit(self.fetch_order_book, coin) for coin in coins]
results = await asyncio.gather(*tasks)
Settings: max_concurrent=5 limits to 5 parallel requests
Balanced (Default)
For typical usage with standard rate limits:
# No explicit limit - rely on asyncio's natural concurrency
# Hyperliquid API typically handles 10-20 concurrent requests well
tasks = [self.fetch_order_book(coin) for coin in coins]
results = await asyncio.gather(*tasks, return_exceptions=True)
Settings: No limit, typically 10-20 concurrent requests
Aggressive (High Rate Limits)
For premium accounts or dedicated infrastructure:
# Batch large request sets with controlled parallelism
async def fetch_in_batches(tasks, batch_size=50):
    results = []
    for i in range(0, len(tasks), batch_size):
        batch = tasks[i:i + batch_size]
        batch_results = await asyncio.gather(*batch, return_exceptions=True)
        results.extend(batch_results)
    return results

# Usage for 100+ positions
tasks = [self.fetch_order_book(coin) for coin in coins]
results = await fetch_in_batches(tasks, batch_size=50)
Settings: Batch size of 50-100 for large portfolios
Load Testing Examples
Test your configuration under realistic load:
import asyncio
import time
from hyperliquid_agent.signals.orchestrator import SignalOrchestrator
async def load_test_concurrent_fetching():
    """Test concurrent signal collection performance."""
    orchestrator = SignalOrchestrator()
    # Simulate 20 positions
    test_coins = ["BTC", "ETH", "SOL", "AVAX", "MATIC",
                  "ARB", "OP", "DOGE", "SHIB", "PEPE",
                  "WIF", "BONK", "JTO", "JUP", "PYTH",
                  "TIA", "DYM", "ALT", "STRK", "W"]
    # Test fast signal collection
    start = time.time()
    tasks = [orchestrator.hl_provider.fetch_order_book(coin) for coin in test_coins]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    duration = time.time() - start
    success_count = sum(1 for r in results if not isinstance(r, Exception))
    print(f"Fetched {success_count}/{len(test_coins)} order books in {duration:.2f}s")
    print(f"Avg latency: {duration / len(test_coins) * 1000:.0f}ms per request")
    print(f"Effective concurrency: {len(test_coins) / duration:.1f}x")

# Run test
asyncio.run(load_test_concurrent_fetching())
Expected Results:
- Good: 20 order books in 0.5-1.0s (20-40x speedup)
- Acceptable: 20 order books in 1.0-2.0s (10-20x speedup)
- Poor: 20 order books in > 2.0s (< 10x speedup, check network/API)
Timeout Configuration
Configure timeouts per signal type to prevent hanging requests:
[signals]
timeout_seconds = 30.0 # Global timeout
fast_timeout_seconds = 5.0 # Fast loop timeout
medium_timeout_seconds = 15.0 # Medium loop timeout
slow_timeout_seconds = 30.0 # Slow loop timeout
Tuning Guidelines:
- Fast signals: 3-10s (execution-critical, fail fast)
- Medium signals: 10-20s (tactical, allow retries)
- Slow signals: 20-60s (macro, comprehensive data)
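These budgets can be enforced with `asyncio.wait_for`. The wrapper below is a sketch (the orchestrator's actual timeout handling may differ); it returns None on timeout so the caller can fail fast and fall back to cached data:

```python
import asyncio

async def fetch_with_timeout(coro, timeout: float):
    """Cancel a signal fetch that exceeds its loop's timeout budget."""
    try:
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError:
        # Let the caller fall back to the last cached value
        return None

async def demo():
    async def slow_fetch():
        await asyncio.sleep(0.2)
        return "data"

    ok = await fetch_with_timeout(slow_fetch(), timeout=1.0)    # completes
    timed_out = await fetch_with_timeout(slow_fetch(), timeout=0.05)  # cancelled
    print(ok, timed_out)  # data None

asyncio.run(demo())
```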
API Call Minimization Strategies
Caching Best Practices
Maximize cache hit rate to reduce API calls:
1. Enable Caching Globally
[signals]
caching_enabled = true
db_path = "state/signal_cache.db"
2. Use Appropriate TTL Values
Match TTL to data update frequency:
# Data that changes every second
order_book_ttl = 5 # Cache for 5 seconds
# Data that changes every minute
funding_ttl = 600 # Cache for 10 minutes
# Data that changes daily
macro_events_ttl = 3600 # Cache for 1 hour
3. Leverage Cache Warming
Pre-populate cache before trading starts:
async def warm_cache(orchestrator, watchlist):
    """Pre-fetch data for watchlist to populate cache."""
    tasks = []
    # Warm order books
    for coin in watchlist:
        tasks.append(orchestrator.hl_provider.fetch_order_book(coin))
    # Warm funding history
    for coin in watchlist:
        tasks.append(orchestrator.hl_provider.fetch_funding_history(coin, ...))
    # Execute all warming tasks
    await asyncio.gather(*tasks, return_exceptions=True)
    print(f"Cache warmed for {len(watchlist)} coins")
4. Monitor Cache Metrics
Track hit rate and adjust TTL:
metrics = orchestrator.get_cache_metrics()
if metrics['hit_rate_percent'] < 50:
    print("⚠️ Low cache hit rate - consider increasing TTL")
Batch Request Patterns
Group related requests to minimize round trips:
1. Batch Order Book Fetching
# ❌ Bad: Sequential fetching
for coin in coins:
    order_book = await fetch_order_book(coin)
    process(order_book)

# ✅ Good: Concurrent batch fetching
tasks = [fetch_order_book(coin) for coin in coins]
order_books = await asyncio.gather(*tasks)
for order_book in order_books:
    process(order_book)
2. Batch Historical Data Fetching
# Fetch all historical data types concurrently
async def fetch_all_history(coin):
    funding_task = fetch_funding_history(coin, ...)
    oi_task = fetch_open_interest(coin)
    candles_task = fetch_candles(coin, ...)
    funding, oi, candles = await asyncio.gather(
        funding_task, oi_task, candles_task
    )
    return {"funding": funding, "oi": oi, "candles": candles}
3. Reuse Shared Data
Avoid fetching the same data multiple times:
# ❌ Bad: Fetch BTC price multiple times
btc_price_1 = await fetch_mid_price("BTC")
# ... later ...
btc_price_2 = await fetch_mid_price("BTC")
# ✅ Good: Fetch once, reuse
btc_price = await fetch_mid_price("BTC")
# ... use btc_price everywhere ...
Cost Reduction Tips
1. Disable Unused Signal Providers
Turn off providers you don't need:
[signals.onchain]
enabled = false # Disable if not using on-chain data
[signals.external_market]
enabled = false # Disable if not using external market data
[signals.sentiment]
enabled = false # Disable if not using sentiment data
2. Reduce Slow Loop Frequency
Slow signals change infrequently:
[governance]
slow_loop_interval_hours = 24 # Run once per day instead of hourly
3. Use Free Data Sources
Prefer free providers when available:
[signals.external_market]
use_yfinance = true # FREE - no API key required
use_coingecko = true # FREE tier available
coingecko_api_key = "" # Optional for higher limits
[signals.sentiment]
use_fear_greed_index = true # FREE - no API key required
4. Implement Request Deduplication
Avoid duplicate requests within short time windows:
class RequestDeduplicator:
    def __init__(self, window_seconds=1.0):
        self.pending = {}
        self.window = window_seconds

    async def fetch_deduplicated(self, key, fetch_func):
        # If request already pending, wait for it
        if key in self.pending:
            return await self.pending[key]
        # Create new request
        task = asyncio.create_task(fetch_func())
        self.pending[key] = task
        try:
            return await task
        finally:
            # Schedule cleanup after the window expires; awaiting the
            # sleep inline here would delay every caller by the full window
            asyncio.create_task(self._expire(key))

    async def _expire(self, key):
        await asyncio.sleep(self.window)
        self.pending.pop(key, None)
Performance Metrics to Track
Signal Collection Latency
Monitor how long signal collection takes:
import time
async def collect_with_timing(orchestrator, signal_type, account_state):
    start = time.time()
    response = await orchestrator.collect_signals(
        SignalRequest(signal_type=signal_type, account_state=account_state)
    )
    duration = time.time() - start
    print(f"{signal_type} signals collected in {duration:.2f}s")
    print(f"  Confidence: {response.signals.metadata.confidence:.2f}")
    print(f"  Sources: {response.signals.metadata.sources}")
    print(f"  Cached: {response.signals.metadata.is_cached}")
    return response, duration
Target Latencies:
- Fast signals: < 1s (execution-critical)
- Medium signals: < 5s (tactical planning)
- Slow signals: < 15s (macro analysis)
Decision Engine Latency
Track LLM response times:
import time
async def make_decision_with_timing(agent, state):
    start = time.time()
    decision = await agent.make_decision(state)
    duration = time.time() - start
    print(f"Decision made in {duration:.2f}s")
    print(f"  Token usage: {decision.token_usage}")
    print(f"  Cost: ${decision.cost:.4f}")
    return decision, duration
Target Latencies:
- Fast loop: < 2s (quick decisions)
- Medium loop: < 10s (tactical planning)
- Slow loop: < 30s (comprehensive analysis)
Execution Latency
Monitor trade execution speed:
import time
async def execute_with_timing(executor, action):
    start = time.time()
    result = await executor.execute_action(action)
    duration = time.time() - start
    print(f"Trade executed in {duration:.2f}s")
    print(f"  Success: {result.success}")
    print(f"  Fill price: {result.fill_price}")
    return result, duration
Target Latencies:
- Market orders: < 1s (immediate execution)
- Limit orders: < 2s (order placement)
- Cancellations: < 1s (risk management)
End-to-End Latency
Track complete decision cycle:
async def full_cycle_timing(agent):
    start = time.time()
    # 1. Collect signals
    t1 = time.time()
    state = await agent.monitor.get_current_state_with_signals("fast")
    signal_time = time.time() - t1
    # 2. Make decision
    t2 = time.time()
    decision = await agent.make_decision(state)
    decision_time = time.time() - t2
    # 3. Execute trades
    t3 = time.time()
    results = await agent.execute_trades(decision.actions)
    execution_time = time.time() - t3
    total_time = time.time() - start
    print(f"Full cycle: {total_time:.2f}s")
    print(f"  Signal collection: {signal_time:.2f}s ({signal_time/total_time*100:.0f}%)")
    print(f"  Decision making: {decision_time:.2f}s ({decision_time/total_time*100:.0f}%)")
    print(f"  Trade execution: {execution_time:.2f}s ({execution_time/total_time*100:.0f}%)")
Target Breakdown:
- Signal collection: 30-40% of total time
- Decision making: 40-50% of total time
- Trade execution: 10-20% of total time
Cache Performance Metrics
Monitor cache effectiveness:
def log_cache_metrics(orchestrator):
    metrics = orchestrator.get_cache_metrics()
    print("Cache Performance:")
    print(f"  Hit rate: {metrics['hit_rate_percent']:.1f}%")
    print(f"  Total entries: {metrics['total_entries']}")
    print(f"  Total hits: {metrics['total_hits']}")
    print(f"  Total misses: {metrics['total_misses']}")
    print(f"  Avg age: {metrics['avg_age_seconds']:.0f}s")
    print(f"  Expired entries: {metrics['expired_entries']}")
Target Metrics:
- Hit rate: 60-80% (well-tuned TTL)
- Avg age: < 50% of TTL (good utilization)
- Expired entries: < 100 (timely cleanup)
Load Testing Procedures
Test Scenarios
Scenario 1: High-Frequency Fast Loop
Test rapid signal collection:
async def test_high_frequency_fast_loop():
    """Simulate high-frequency trading with fast loop signals."""
    orchestrator = SignalOrchestrator()
    account_state = get_test_account_state()
    iterations = 100
    durations = []
    for i in range(iterations):
        start = time.time()
        response = await orchestrator.collect_signals(
            SignalRequest(signal_type="fast", account_state=account_state)
        )
        duration = time.time() - start
        durations.append(duration)
        if i % 10 == 0:
            print(f"Iteration {i}: {duration:.3f}s")
        await asyncio.sleep(0.1)  # 10 Hz frequency
    print(f"\nResults over {iterations} iterations:")
    print(f"  Mean: {sum(durations)/len(durations):.3f}s")
    print(f"  Min: {min(durations):.3f}s")
    print(f"  Max: {max(durations):.3f}s")
    print(f"  P95: {sorted(durations)[int(len(durations)*0.95)]:.3f}s")
Success Criteria:
- Mean < 0.5s
- P95 < 1.0s
- No timeouts
Scenario 2: Concurrent Multi-Loop Collection
Test collecting all signal types simultaneously:
async def test_concurrent_multi_loop():
    """Test concurrent collection of fast, medium, and slow signals."""
    orchestrator = SignalOrchestrator()
    account_state = get_test_account_state()
    start = time.time()
    # Collect all signal types concurrently
    requests = [
        SignalRequest(signal_type="fast", account_state=account_state),
        SignalRequest(signal_type="medium", account_state=account_state),
        SignalRequest(signal_type="slow", account_state=account_state),
    ]
    responses = await orchestrator.collect_concurrent(requests)
    duration = time.time() - start
    print(f"Concurrent collection: {duration:.2f}s")
    for response in responses:
        print(f"  {response.signal_type}: confidence={response.signals.metadata.confidence:.2f}")
Success Criteria:
- Total time < 15s (concurrent speedup)
- All signals collected successfully
- Confidence > 0.7 for all signals
Scenario 3: Large Portfolio Stress Test
Test with many positions:
async def test_large_portfolio():
    """Test signal collection for large portfolio (50+ positions)."""
    orchestrator = SignalOrchestrator()
    # Create account state with 50 positions
    positions = [
        Position(coin=f"COIN{i}", size=1.0, entry_price=100.0, ...)
        for i in range(50)
    ]
    account_state = AccountState(positions=positions, ...)
    start = time.time()
    response = await orchestrator.collect_signals(
        SignalRequest(signal_type="medium", account_state=account_state)
    )
    duration = time.time() - start
    print(f"Large portfolio ({len(positions)} positions): {duration:.2f}s")
    print(f"  Confidence: {response.signals.metadata.confidence:.2f}")
Success Criteria:
- Duration < 10s for 50 positions
- Confidence > 0.6
- No timeouts
Benchmarking Tools
Built-in Profiling
Use Python's cProfile for detailed profiling:
python -m cProfile -o profile.stats -m hyperliquid_agent.cli start
# Analyze results
python -c "
import pstats
p = pstats.Stats('profile.stats')
p.sort_stats('cumulative')
p.print_stats(20)
"
Custom Benchmarking Script
Create a comprehensive benchmark:
# scripts/benchmark_performance.py
import asyncio
import time
from hyperliquid_agent.signals.orchestrator import SignalOrchestrator
async def run_benchmarks():
    orchestrator = SignalOrchestrator()
    print("=== Performance Benchmarks ===\n")
    # Benchmark 1: Cache hit rate
    print("1. Cache Performance")
    metrics = orchestrator.get_cache_metrics()
    print(f"   Hit rate: {metrics['hit_rate_percent']:.1f}%")
    print(f"   Total entries: {metrics['total_entries']}")
    # Benchmark 2: Fast signal latency
    print("\n2. Fast Signal Latency")
    durations = []
    for _ in range(10):
        start = time.time()
        await orchestrator.collect_signals(...)
        durations.append(time.time() - start)
    print(f"   Mean: {sum(durations)/len(durations)*1000:.0f}ms")
    print(f"   P95: {sorted(durations)[9]*1000:.0f}ms")
    # Benchmark 3: Concurrent fetching
    print("\n3. Concurrent Fetching")
    test_coins = ["BTC", "ETH", "SOL", "AVAX", "MATIC"]  # example watchlist
    start = time.time()
    tasks = [orchestrator.hl_provider.fetch_order_book(coin) for coin in test_coins]
    await asyncio.gather(*tasks)
    duration = time.time() - start
    print(f"   {len(test_coins)} order books: {duration:.2f}s")
    print(f"   Effective concurrency: {len(test_coins)/duration:.1f}x")

if __name__ == "__main__":
    asyncio.run(run_benchmarks())
Run benchmarks:
uv run python scripts/benchmark_performance.py
Performance Baselines
Establish baseline metrics for your environment:
| Metric | Target | Good | Acceptable | Poor |
|---|---|---|---|---|
| Fast signal latency | < 500ms | < 1s | < 2s | > 2s |
| Medium signal latency | < 3s | < 5s | < 10s | > 10s |
| Slow signal latency | < 10s | < 15s | < 30s | > 30s |
| Cache hit rate | > 70% | > 60% | > 40% | < 40% |
| Concurrent speedup | > 15x | > 10x | > 5x | < 5x |
| Decision latency | < 5s | < 10s | < 20s | > 20s |
| Execution latency | < 1s | < 2s | < 5s | > 5s |
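For the "lower is better" rows such as latency, a small grader can turn a measurement into one of the table's columns. `classify` is a hypothetical helper; the hit-rate and speedup rows (where higher is better) would need the comparisons inverted:

```python
def classify(value: float, target: float, good: float, acceptable: float) -> str:
    """Grade a lower-is-better measurement against the baseline columns."""
    if value < target:
        return "target"
    if value < good:
        return "good"
    if value < acceptable:
        return "acceptable"
    return "poor"

# Fast signal latency row: < 500ms / < 1s / < 2s
print(classify(0.4, 0.5, 1.0, 2.0))  # target
print(classify(1.5, 0.5, 1.0, 2.0))  # acceptable
```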
Troubleshooting Performance Issues
High Latency
Symptoms: Signal collection takes > 5s consistently
Diagnosis:
# Check cache hit rate
metrics = orchestrator.get_cache_metrics()
if metrics['hit_rate_percent'] < 40:
    print("Low cache hit rate - increase TTL")

# Check network latency
start = time.time()
await orchestrator.hl_provider.fetch_order_book("BTC")
latency = time.time() - start
if latency > 1.0:
    print("High network latency - check connection")
Solutions:
- Increase cache TTL values
- Enable caching if disabled
- Check network connectivity
- Reduce concurrent request count
Low Cache Hit Rate
Symptoms: Hit rate < 40%
Diagnosis:
metrics = orchestrator.get_cache_metrics()
print(f"Hit rate: {metrics['hit_rate_percent']}%")
print(f"Avg age: {metrics['avg_age_seconds']}s")
Solutions:
- Increase TTL values by 50-100%
- Verify caching is enabled
- Check for cache invalidation issues
- Warm cache before trading
High API Costs
Symptoms: Excessive API calls, high costs
Diagnosis:
# Track API call count
call_count = 0

async def tracked_fetch(*args):
    global call_count
    call_count += 1
    return await original_fetch(*args)

# Monitor for 1 hour
print(f"API calls in 1 hour: {call_count}")
Solutions:
- Increase cache TTL values
- Disable unused signal providers
- Reduce loop frequencies
- Implement request deduplication
Memory Issues
Symptoms: High memory usage, OOM errors
Diagnosis:
# Check cache database size
ls -lh state/signal_cache.db
# Check Python memory usage
import psutil
process = psutil.Process()
print(f"Memory: {process.memory_info().rss / 1024 / 1024:.0f} MB")
Solutions:
- Reduce cache max_size_mb
- Increase cleanup frequency
- Run VACUUM more frequently
- Limit historical data retention
Related Documentation
- Monitoring Architecture - Signal collection system
- Configuration Guide - TTL and timeout settings
- Troubleshooting - Common performance issues