Function run_probe 

pub(crate) async fn run_probe(
    pool: &EmbedPool,
    max_seq: usize,
    rss_ceiling: usize,
    cgroup_limit_bytes: usize,
) -> (f64, f64)

Runs the startup probe on the already-warmed leader worker.

§Arguments

  • pool: the EmbedPool whose leader worker has already loaded models.
  • max_seq: the configured BGE_M3_MAX_SEQ_LENGTH, which determines the topmost probe shape. The dynamic (1, max_seq) capability check has been removed; see trim_probe_shapes in the change log.
  • rss_ceiling: the per-worker workspace budget computed from sysinfo. Shapes estimated to exceed this are skipped to avoid OOM mid-probe (the conservative-model guard, unchanged).
  • cgroup_limit_bytes: the actual kernel memory ceiling (cgroup limit or host RAM, whichever was detected first). Used by the absolute-RSS guard: before each shape, the current process RSS is measured, and the shape is skipped if rss + 4 × estimated_cost > cgroup_limit_bytes × 87.5%. This prevents ORT session-arena retention from accumulating past the kernel ceiling across successive probe shapes.
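
The absolute-RSS guard described for cgroup_limit_bytes can be sketched as a small predicate. This is an illustration only, not the crate's code: the function name exceeds_rss_guard and its parameter names are hypothetical, and the sketch assumes that all three quantities are byte counts and that 87.5% is computed as exactly 7/8 of the limit.

```rust
/// Hypothetical helper (not part of the crate): returns `true` when a probe
/// shape should be skipped under the absolute-RSS guard, i.e. when
/// `rss + 4 * estimated_cost > cgroup_limit_bytes * 87.5%`.
fn exceeds_rss_guard(
    current_rss: usize,
    estimated_cost: usize,
    cgroup_limit_bytes: usize,
) -> bool {
    // Projected footprint: current RSS plus four times the shape's estimated
    // cost. Widen to u128 so the arithmetic cannot overflow.
    let projected = current_rss as u128 + 4 * estimated_cost as u128;

    // 87.5% of the ceiling is exactly 7/8 (assumption: integer division
    // rounding down is acceptable for the threshold).
    let threshold = cgroup_limit_bytes as u128 * 7 / 8;

    // Strict comparison, matching the inequality in the description.
    projected > threshold
}
```

With an 8 GiB ceiling the threshold is 7 GiB, so a process at 6 GiB RSS would skip a shape estimated at 512 MiB (6 GiB + 4 × 0.5 GiB = 8 GiB), while a process at 1 GiB RSS would run a shape estimated at 256 MiB (projected 2 GiB).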

§Returns

(a, b) where a and b are the fitted cost-model coefficients. Returns conservative defaults and logs a warning on any failure.