pub(crate) async fn run_probe(
    pool: &EmbedPool,
    max_seq: usize,
    rss_ceiling: usize,
    cgroup_limit_bytes: usize,
) -> (f64, f64)
Runs the startup probe on the already-warmed leader worker.
§Arguments
pool: the EmbedPool whose leader worker has already loaded models.
max_seq: the configured BGE_M3_MAX_SEQ_LENGTH (determines the topmost probe shape). The dynamic (1, max_seq) capability check has been removed; see trim_probe_shapes in the change log.
rss_ceiling: the per-worker workspace budget computed from sysinfo. Shapes estimated to exceed this are skipped to avoid OOM mid-probe (the conservative-model guard, unchanged).
cgroup_limit_bytes: the actual kernel memory ceiling (cgroup limit or host RAM, whichever was detected first). Used by the absolute-RSS guard: before each shape the current process RSS is measured and the shape is skipped if rss + 4 × estimated_cost > cgroup_limit × 87.5%. This prevents ORT session-arena retention from accumulating past the kernel ceiling across successive probe shapes.
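The two skip conditions described above can be sketched as follows. This is a minimal illustration, not the crate's actual code: the function names exceeds_absolute_rss_guard and should_skip_shape are hypothetical, the comparison against rss_ceiling is assumed to be a plain "estimated cost greater than budget" check, and 87.5% is computed with integer arithmetic, so the threshold may differ from the exact value by up to one byte.

```rust
/// Hypothetical helper: absolute-RSS guard.
/// Returns true when `rss + 4 * estimated_cost` exceeds 87.5% of the kernel ceiling.
fn exceeds_absolute_rss_guard(
    current_rss: usize,
    estimated_cost: usize,
    cgroup_limit_bytes: usize,
) -> bool {
    // 87.5% of the limit, computed as limit - limit/8 to stay in integers.
    let threshold = cgroup_limit_bytes - cgroup_limit_bytes / 8;
    // Saturating arithmetic so an oversized estimate cannot wrap around.
    let projected = current_rss.saturating_add(estimated_cost.saturating_mul(4));
    projected > threshold
}

/// Hypothetical helper combining both guards from the argument descriptions.
/// ASSUMPTION: the conservative-model guard is a simple `>` comparison.
fn should_skip_shape(
    estimated_cost: usize,
    rss_ceiling: usize,
    current_rss: usize,
    cgroup_limit_bytes: usize,
) -> bool {
    // Conservative-model guard: the shape alone exceeds the per-worker budget.
    let over_budget = estimated_cost > rss_ceiling;
    // Absolute-RSS guard: accumulated RSS plus headroom exceeds the kernel ceiling.
    over_budget || exceeds_absolute_rss_guard(current_rss, estimated_cost, cgroup_limit_bytes)
}
```

With an 8 GiB ceiling the threshold is 7 GiB. A process at 2 GiB RSS can still probe a shape estimated at 1 GiB (2 + 4 = 6 GiB), but a 1.5 GiB shape is skipped (2 + 6 = 8 GiB).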
§Returns
(a, b) where a and b are the fitted cost-model coefficients.
Returns conservative defaults and logs a warning on any failure.