pub(super) fn load_session(
    model_path: &Path,
    eps: Vec<ExecutionProviderDispatch>,
    intra_threads: usize,
) -> Result<Session>
Builds an ORT session from the ONNX model file at model_path, using the given execution providers (eps).

intra_threads sets the intra-op parallelism used by matmul / attention kernels
within a single session.run() call. The default (1) keeps per-worker RSS
predictable for the workspace probe. On under-utilized hosts, raise it to
floor(num_cpus / workers) to recover CPU headroom. See
crate::config::Config::intra_threads for the operator-facing knob.
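
As a rough illustration of the floor(num_cpus / workers) sizing rule, the sketch below computes a candidate intra_threads value using only the standard library. The helpers suggested_intra_threads and detected_cpus are hypothetical names introduced for this example and are not part of this crate. Clamping the result to at least 1 and treating zero workers as one are assumptions added here, not documented behavior.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical helper (not part of this crate): floor(num_cpus / workers),
/// clamped so the result is never below 1.
fn suggested_intra_threads(num_cpus: usize, workers: usize) -> usize {
    // Assumption: treat a worker count of zero as one to avoid dividing by zero.
    let workers = workers.max(1);
    // Unsigned integer division already rounds down, which gives the floor.
    (num_cpus / workers).max(1)
}

/// Hypothetical helper: logical CPU count reported by the standard library,
/// falling back to 1 when the platform cannot report it.
fn detected_cpus() -> usize {
    thread::available_parallelism()
        .map(NonZeroUsize::get)
        .unwrap_or(1)
}
```

With 16 logical CPUs and 4 workers this yields 4 threads per session. With more workers than CPUs it falls back to 1, which matches the documented default.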