Logic

Scaling Policies

Deep dive into target calculations, latency percentile equations, and cooldown algorithms.

Autoscaling Core Equation

LogStrata uses a modified target tracking algorithm to determine the desired replica count. At each controller tick:

DesiredReplicas = ceil( CurrentReplicas * ( CurrentMetricValue / TargetThreshold ) )

For example, with a current replica count of 4, a current P95 response latency of 1200 ms, and a target threshold configured at 800 ms:

DesiredReplicas = ceil( 4 * ( 1200 / 800 ) ) = ceil( 4 * 1.5 ) = 6 replicas
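
The calculation above can be sketched in a few lines of Python. This is a minimal illustration of the formula only: the function name and its input checks are hypothetical, and whatever modifications LogStrata applies on top of plain target tracking are not modeled here.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_threshold: float) -> int:
    """Apply DesiredReplicas = ceil(CurrentReplicas * (CurrentMetricValue / TargetThreshold))."""
    if current_replicas < 0:
        raise ValueError("current_replicas must not be negative")
    if target_threshold <= 0:
        raise ValueError("target_threshold must be positive")
    # The ratio is above 1 when the metric exceeds the target (scale up)
    # and below 1 when the metric is under the target (scale down).
    ratio = current_metric / target_threshold
    return math.ceil(current_replicas * ratio)


# Worked example from the text: 4 replicas, P95 of 1200 ms, target of 800 ms.
print(desired_replicas(4, 1200.0, 800.0))  # 6
```

The same function also yields a smaller count when the metric falls below the target: with 4 replicas and a P95 of 400 ms against the 800 ms target, the result is 2.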

Percentile vs Average Metrics

Average latency often masks extreme outliers (e.g. 5% of users experiencing 10-second wait times), because a large number of fast requests pulls the mean down. LogStrata recommends aggregating on the P95 or P99 latency in the aggregation queries so that the scaling metric reflects the slow tail of requests:

# P95 response latency trigger configuration snippet
trigger:
  metricName: p95_latency
  threshold: 800.0  # Milliseconds
  scaleFactor: 1.5  # Boost multiplier when violated
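
To illustrate why the average can hide a slow tail, the sketch below compares the mean with the P95 on a made-up sample. The latency values, the 6% share of slow requests, and the nearest-rank percentile helper are all assumptions chosen for illustration; they do not describe how LogStrata aggregates metrics internally.

```python
import math
from statistics import mean


def nearest_rank_percentile(samples, p):
    """Return the p-th percentile (integer p, 1..100) using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[rank - 1]


# Hypothetical sample: 94 fast requests at 100 ms and 6 slow ones at 10 s.
latencies_ms = [100] * 94 + [10_000] * 6
TARGET_MS = 800.0  # same threshold as the trigger snippet above

avg = mean(latencies_ms)
p95 = nearest_rank_percentile(latencies_ms, 95)

print(avg, avg > TARGET_MS)  # 694 False -> an average-based trigger stays quiet
print(p95, p95 > TARGET_MS)  # 10000 True -> a P95-based trigger fires
```

With this sample the average sits under the 800 ms threshold even though 6 in every 100 requests wait ten seconds, while the P95 exposes the problem immediately.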

Cooldown Restraints

To maintain infrastructure stability, two cooldown windows exist:

Scale-Up Cooldown: Default 15s. Prevents the controller from requesting further pods while the previously requested pods are still in the ContainerCreating state and not yet serving traffic.
Scale-Down Cooldown: Default 90s. Prevents the controller from deleting healthy pods too quickly during minor traffic fluctuations, which avoids the overhead of spinning them up again when traffic returns.
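
The two windows can be pictured as a gate placed between the computed replica count and the count that is actually applied. The sketch below is one possible reading, not LogStrata's implementation: the class and field names are hypothetical, and it assumes the scale-up window is measured from the last scale-up while the scale-down window is measured from the last scaling action in either direction.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CooldownGate:
    scale_up_cooldown_s: float = 15.0    # default from the text
    scale_down_cooldown_s: float = 90.0  # default from the text
    last_scale_up_at: Optional[float] = None  # time of the last applied scale-up
    last_scale_at: Optional[float] = None     # time of the last applied change (assumption)

    def decide(self, current: int, desired: int, now: float) -> int:
        """Return the replica count to apply at time `now` (seconds)."""
        if desired > current:
            since = self.last_scale_up_at
            if since is not None and now - since < self.scale_up_cooldown_s:
                return current  # earlier pods may still be starting
            self.last_scale_up_at = now
            self.last_scale_at = now
            return desired
        if desired < current:
            since = self.last_scale_at
            if since is not None and now - since < self.scale_down_cooldown_s:
                return current  # hold healthy pods through a short dip
            self.last_scale_at = now
            return desired
        return current


gate = CooldownGate()
print(gate.decide(4, 6, now=0.0))    # 6: first scale-up is applied
print(gate.decide(6, 8, now=10.0))   # 6: blocked, only 10 s since the last scale-up
print(gate.decide(6, 8, now=15.0))   # 8: scale-up window has elapsed
print(gate.decide(8, 5, now=60.0))   # 8: blocked, only 45 s since the last change
print(gate.decide(8, 5, now=105.0))  # 5: 90 s have passed since the last change
```

Passing `now` explicitly keeps the gate deterministic and easy to test; a real controller would more likely read a monotonic clock on each tick.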