Logic
Scaling Policies
Deep dive into target calculations, latency percentile equations, and cooldown algorithms.
Autoscaling Core Equation
LogStrata uses a modified target tracking algorithm to determine the desired replica count. At each controller tick:
DesiredReplicas = ceil( CurrentReplicas * ( CurrentMetricValue / TargetThreshold ) )
For example, if the current replica count is 4, the current P95 response latency is 1200 ms, and the target threshold is configured at 800 ms:
DesiredReplicas = ceil( 4 * ( 1200 / 800 ) ) = ceil( 4 * 1.5 ) = 6 replicas
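The equation above can be sketched in a few lines of Python. This is a minimal illustration of the stated formula only; the function name `desired_replicas` and its parameter names are hypothetical and not part of LogStrata, and any additional adjustments the controller may apply (such as `scaleFactor` or replica bounds) are not modeled here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_threshold: float) -> int:
    """Apply DesiredReplicas = ceil(CurrentReplicas * (CurrentMetricValue / TargetThreshold)).

    Hypothetical helper for illustration; not a LogStrata API.
    """
    if target_threshold <= 0:
        # A zero or negative threshold would make the ratio undefined or meaningless.
        raise ValueError("target_threshold must be positive")
    ratio = current_metric / target_threshold
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Worked example from the text: 4 replicas, P95 = 1200 ms, target = 800 ms.
    print(desired_replicas(4, 1200.0, 800.0))  # prints 6
```

Note that the same formula also yields a smaller replica count when the metric is below the target: with 4 replicas and a P95 of 400 ms against an 800 ms target, the result is 2.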
Percentile vs Average Metrics
Relying on average latencies often masks extreme outliers (e.g. 5% of users experiencing 10-second wait times). LogStrata recommends configuring P95 or P99 percentiles in the aggregation queries to capture microservice choke points accurately:
# P95 response latency trigger configuration snippet
trigger:
  metricName: p95_latency
  threshold: 800.0   # Milliseconds
  scaleFactor: 1.5   # Boost multiplier when violated
Cooldown Restraints
To maintain infrastructure stability, two cooldown windows exist:
- Scale-Up Cooldown: Default 15s. Prevents the controller from spinning up further pods while the previously ordered pods are in the ContainerCreating phase.
- Scale-Down Cooldown: Default 90s. Prevents deleting healthy pods too quickly during minor traffic fluctuations, avoiding pod spin-up overhead.
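One way to picture the two windows is as a gate that the controller consults before acting on a new desired replica count. The sketch below uses the default values quoted above; the `CooldownGate` class, its method names, and the choice to measure both windows from the most recent scaling action in either direction are assumptions made for illustration, not a description of LogStrata's internal implementation.

```python
from dataclasses import dataclass
from typing import Optional

SCALE_UP_COOLDOWN_S = 15.0    # default scale-up cooldown from the text
SCALE_DOWN_COOLDOWN_S = 90.0  # default scale-down cooldown from the text


@dataclass
class CooldownGate:
    """Hypothetical gate that decides whether a scaling action may proceed."""

    # Timestamp (seconds) of the last scaling action; None if none has happened.
    # Assumption: both windows are measured from this single timestamp.
    last_scale_time: Optional[float] = None

    def allows(self, current: int, desired: int, now: float) -> bool:
        """Return True if changing from `current` to `desired` is permitted at `now`."""
        if desired == current:
            return False  # nothing to do
        if self.last_scale_time is None:
            return True   # no prior action, so no cooldown applies
        scaling_up = desired > current
        window = SCALE_UP_COOLDOWN_S if scaling_up else SCALE_DOWN_COOLDOWN_S
        return now - self.last_scale_time >= window

    def record(self, now: float) -> None:
        """Remember that a scaling action was carried out at `now`."""
        self.last_scale_time = now


if __name__ == "__main__":
    gate = CooldownGate()
    print(gate.allows(4, 6, now=0.0))    # first action is always allowed
    gate.record(0.0)
    print(gate.allows(6, 8, now=10.0))   # still inside the 15 s scale-up window
    print(gate.allows(6, 4, now=60.0))   # still inside the 90 s scale-down window
```

The asymmetry (a short scale-up window, a longer scale-down window) reflects the rationale given above: react quickly to rising load, but hold on to healthy pods through brief dips.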