Public DTOs and the internal EmbedRequest enum exchanged between the
pool and the worker threads.
Structs
- DualEmbedding - Paired dense + sparse embeddings produced from a single forward pass.
- EmbedStats - Per-request diagnostic statistics captured inside the worker and forwarded to the handler layer for inclusion in the completion log event.
- ProbeResult 🔒 - Result of a single probe session.run() call.
- SparseEmbedding - Sparse embedding output from the BGE-M3 sparse-linear projection layer.
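The field layout of these structs is not shown on this page. As a rough illustration only, a dense + sparse pair could be modeled as below. The type names `DualSketch` and `SparseSketch`, their fields, and the rule of dropping non-positive weights are all assumptions made for this sketch, not the crate's actual definitions.

```rust
/// Hypothetical sparse embedding: parallel vectors of token ids and weights.
/// (Assumed layout; the real `SparseEmbedding` fields are not documented here.)
#[derive(Debug, Clone, PartialEq)]
pub struct SparseSketch {
    pub indices: Vec<u32>,
    pub values: Vec<f32>,
}

/// Hypothetical pairing of a dense vector with a sparse one, as might be
/// produced together from a single forward pass.
#[derive(Debug, Clone, PartialEq)]
pub struct DualSketch {
    pub dense: Vec<f32>,
    pub sparse: SparseSketch,
}

impl SparseSketch {
    /// Build from (token id, weight) pairs, keeping only strictly positive
    /// weights. Filtering zeros is an assumption of this sketch.
    pub fn from_weights(weights: &[(u32, f32)]) -> Self {
        let mut indices = Vec::new();
        let mut values = Vec::new();
        for &(id, w) in weights {
            if w > 0.0 {
                indices.push(id);
                values.push(w);
            }
        }
        Self { indices, values }
    }

    /// Number of non-zero entries retained.
    pub fn len(&self) -> usize {
        self.indices.len()
    }

    /// True when no entries were retained.
    pub fn is_empty(&self) -> bool {
        self.indices.is_empty()
    }
}
```

Storing indices and values as parallel vectors keeps the sparse part compact and easy to serialize; a map keyed by token id would be an equally plausible choice.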
Enums
- EmbedRequest 🔒
Constants
- OS_HEADROOM_BYTES 🔒 - OS headroom reserved for kernel, stack, ORT arena, and other non-model allocations. Subtracted from available memory before computing per-worker workspace.
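The headroom arithmetic described above can be sketched as follows. This is a minimal illustration, not the crate's code: the constant value (512 MiB), the name `SKETCH_HEADROOM_BYTES`, the function `per_worker_workspace`, and the even split across workers are all assumptions, since this page shows neither the real value of `OS_HEADROOM_BYTES` nor how the workspace is divided.

```rust
/// Hypothetical headroom value (512 MiB). The real `OS_HEADROOM_BYTES`
/// value is not shown on this page.
const SKETCH_HEADROOM_BYTES: u64 = 512 * 1024 * 1024;

/// Subtract the reserved headroom from available memory, then split the
/// remainder evenly across workers (the even split is an assumption).
/// Returns `None` when there are no workers to size.
pub fn per_worker_workspace(available_bytes: u64, workers: u64) -> Option<u64> {
    if workers == 0 {
        return None;
    }
    // saturating_sub avoids underflow when less memory is available
    // than the headroom itself; the usable amount is then zero.
    let usable = available_bytes.saturating_sub(SKETCH_HEADROOM_BYTES);
    Some(usable / workers)
}
```

Under these assumed numbers, a host with 4 GiB available and two workers would get (4096 MiB - 512 MiB) / 2 = 1792 MiB per worker.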