Function fit_cost_model 

pub(crate) fn fit_cost_model(data: &[DataPoint]) -> Option<(f64, f64)>

Fits peak = a * (batch * seq) + b * (batch * seq^2) by ordinary least squares. The model has no intercept because the workspace at batch=0 is 0 by definition.

The design matrix X has columns [batch*seq, batch*seq^2] and the response y is rss_delta for each observation.
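To make the column layout concrete, here is a minimal sketch of how one observation maps to a row of X and a response value. The `Obs` struct and its field names are assumptions for illustration; the real `DataPoint` definition is not shown on this page.

```rust
/// Hypothetical stand-in for `DataPoint`; the field names are assumed.
#[derive(Debug, Clone, Copy)]
struct Obs {
    batch: f64,
    seq: f64,
    rss_delta: f64,
}

/// Returns one design-matrix row [batch*seq, batch*seq^2] and its response y.
fn design_row(p: &Obs) -> ([f64; 2], f64) {
    let x1 = p.batch * p.seq;
    let x2 = p.batch * p.seq * p.seq;
    ([x1, x2], p.rss_delta)
}

/// Evaluates the fitted model: peak = a*(batch*seq) + b*(batch*seq^2).
/// There is no intercept term, so batch = 0 always predicts 0.
fn predict(a: f64, b: f64, batch: f64, seq: f64) -> f64 {
    a * (batch * seq) + b * (batch * seq * seq)
}
```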

Normalization: columns are scaled to [0, 1] before solving (ξ1 = x1 / max(x1), ξ2 = x2 / max(x2)). Without this, x2 = seq * x1, so at max_seq=8192 the x2 column is up to 8192× larger than x1. The unscaled Gram matrix then looks effectively rank-1 to a naïve determinant threshold, and the fit silently falls back to conservative defaults despite valid data.
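The effect of the scaling can be checked numerically. The sketch below compares the relative determinant of the Gram matrix (det divided by the squared largest diagonal entry, which is one reading of the threshold used here) for raw and normalized columns. The helper names and the sample sweep (batch = 1, seq doubling from 512 to 8192) are illustrative assumptions, not taken from the crate.

```rust
/// Relative determinant of the 2x2 Gram matrix X^T X: det / max(g11, g22)^2.
fn relative_det(cols: &[(f64, f64)]) -> f64 {
    let (mut g11, mut g12, mut g22) = (0.0_f64, 0.0_f64, 0.0_f64);
    for &(x1, x2) in cols {
        g11 += x1 * x1;
        g12 += x1 * x2;
        g22 += x2 * x2;
    }
    let det = g11 * g22 - g12 * g12;
    let max_diag = g11.max(g22);
    det / (max_diag * max_diag)
}

/// Scales each column by its maximum so both lie in [0, 1].
/// The caller must ensure neither maximum is zero.
fn normalize(cols: &[(f64, f64)]) -> Vec<(f64, f64)> {
    let x1_max = cols.iter().map(|c| c.0).fold(0.0_f64, f64::max);
    let x2_max = cols.iter().map(|c| c.1).fold(0.0_f64, f64::max);
    cols.iter().map(|&(x1, x2)| (x1 / x1_max, x2 / x2_max)).collect()
}

/// Illustrative sweep (assumed data): batch = 1, seq = 512, 1024, ..., 8192.
fn sample_columns() -> Vec<(f64, f64)> {
    [512.0_f64, 1024.0, 2048.0, 4096.0, 8192.0]
        .iter()
        .map(|&seq| (seq, seq * seq))
        .collect()
}
```

For this sweep the raw relative determinant is on the order of 1e-9, below a 1e-6 threshold, while the normalized one is a few percent, so the same data passes once the columns are scaled.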

The normal equations are solved in normalized space for (α, β) using the closed-form 2×2 matrix inverse (Cramer’s rule), and the coefficients are then unscaled: a = α / x1_max, b = β / x2_max.
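As a sketch of that step, the symmetric 2×2 system G·[α, β]ᵀ = r (with G = ΞᵀΞ and r = Ξᵀy) has a closed-form solution. The function names `solve_2x2` and `unscale` are hypothetical, and the exact-zero determinant check here is a simplification of the relative threshold the real function applies.

```rust
/// Solves [[g11, g12], [g12, g22]] * [alpha, beta]^T = [r1, r2]^T by Cramer's rule.
/// Returns None only when the determinant is exactly zero.
fn solve_2x2(g11: f64, g12: f64, g22: f64, r1: f64, r2: f64) -> Option<(f64, f64)> {
    let det = g11 * g22 - g12 * g12;
    if det == 0.0 {
        return None;
    }
    // Replace the first column by r for alpha, the second column for beta.
    let alpha = (r1 * g22 - g12 * r2) / det;
    let beta = (g11 * r2 - g12 * r1) / det;
    Some((alpha, beta))
}

/// Maps normalized coefficients back to the original column scale.
fn unscale(alpha: f64, beta: f64, x1_max: f64, x2_max: f64) -> (f64, f64) {
    (alpha / x1_max, beta / x2_max)
}
```

Because ξ1 = x1 / x1_max, a term α·ξ1 equals (α / x1_max)·x1, which is why dividing by the column maxima recovers a and b.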

Returns None when:

  • Fewer than 2 data points (under-determined system).
  • x1_max or x2_max is zero (degenerate data).
  • The normalized Gram matrix is nearly singular (det < 1e-6 × (max diagonal)²).
  • Either coefficient is negative (physically impossible workspace).
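
The four conditions above can be read as guards placed around the normalized solve. The following is a minimal end-to-end sketch under stated assumptions, not the crate's implementation: `Sample` stands in for `DataPoint` with assumed field names, and the singularity test assumes the threshold means det < 1e-6 × (max diagonal)².

```rust
/// Hypothetical stand-in for `DataPoint`; the field names are assumed.
struct Sample {
    batch: f64,
    seq: f64,
    rss_delta: f64,
}

/// Sketch of the fit: returns (a, b), or None in the four documented cases.
fn fit_sketch(data: &[Sample]) -> Option<(f64, f64)> {
    // Guard 1: two unknowns need at least two observations.
    if data.len() < 2 {
        return None;
    }
    let x1 = |p: &Sample| p.batch * p.seq;
    let x2 = |p: &Sample| p.batch * p.seq * p.seq;
    let x1_max = data.iter().map(x1).fold(0.0_f64, f64::max);
    let x2_max = data.iter().map(x2).fold(0.0_f64, f64::max);
    // Guard 2: a zero column maximum means degenerate data.
    if x1_max == 0.0 || x2_max == 0.0 {
        return None;
    }
    // Accumulate the Gram matrix and right-hand side in normalized space.
    let (mut g11, mut g12, mut g22, mut r1, mut r2) = (0.0, 0.0, 0.0, 0.0, 0.0);
    for p in data {
        let (xi1, xi2) = (x1(p) / x1_max, x2(p) / x2_max);
        g11 += xi1 * xi1;
        g12 += xi1 * xi2;
        g22 += xi2 * xi2;
        r1 += xi1 * p.rss_delta;
        r2 += xi2 * p.rss_delta;
    }
    // Guard 3: nearly singular relative to the largest diagonal entry.
    let det = g11 * g22 - g12 * g12;
    let max_diag = f64::max(g11, g22);
    if det < 1e-6 * max_diag * max_diag {
        return None;
    }
    // Cramer's rule, then undo the column scaling.
    let alpha = (r1 * g22 - g12 * r2) / det;
    let beta = (g11 * r2 - g12 * r1) / det;
    let (a, b) = (alpha / x1_max, beta / x2_max);
    // Guard 4: negative coefficients are not physically meaningful.
    if a < 0.0 || b < 0.0 {
        return None;
    }
    Some((a, b))
}
```

Note that observations sharing a single seq value make the two columns proportional, so they trip the singularity guard no matter how many points are supplied.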