Module embedder


Worker-pool–driven BGE-M3 embedding service.

Submodules:

  • types: public DTOs and the internal EmbedRequest enum.
  • error: small ort::Error → anyhow::Error adapter.
  • model_files: hf-hub download / cache layout for the ONNX model files.
  • tokenize: tokenizer load + no-pad tokenization + chunk-array build.
  • session: ORT execution-provider config and session loading.
  • math: pure dense/sparse math helpers (testable without ORT).
  • dense: dense embedding pipeline.
  • sparse: BGE-M3 SPLADE-style sparse embedding pipeline.
  • dual: paired dense + sparse embedding pipeline (one forward pass per chunk).
  • worker: blocking worker thread, request dispatch, probe wiring.
  • pool: EmbedPool async wrapper and test helpers.
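
The math submodule is described as pure helpers that can be tested without ORT. As a rough illustration of what such helpers could look like, the sketch below shows L2 normalization for a dense vector and a max-per-token reduction for sparse weights. The function names, signatures, and the choice to drop non-positive weights are assumptions made for this example; they are not taken from the crate.

```rust
use std::collections::HashMap;

/// Hypothetical helper: scale `v` to unit L2 norm in place.
/// An all-zero vector is left untouched to avoid dividing by zero.
pub fn l2_normalize(v: &mut [f32]) {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
}

/// Hypothetical helper: collapse per-token weights into a sparse map keyed
/// by token id, keeping the largest weight seen for each id.
/// Non-positive weights are skipped (an assumption of this sketch).
pub fn sparse_max_pool(token_ids: &[u32], weights: &[f32]) -> HashMap<u32, f32> {
    let mut out: HashMap<u32, f32> = HashMap::new();
    for (&id, &w) in token_ids.iter().zip(weights) {
        if w <= 0.0 {
            continue;
        }
        let slot = out.entry(id).or_insert(w);
        if w > *slot {
            *slot = w;
        }
    }
    out
}
```

Because both functions take plain slices, they can be unit-tested without loading a model or a tokenizer.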

Modules

dense 🔒
Dense embedding pipeline.
dual 🔒
Paired dense + sparse embedding pipeline (one forward pass per chunk).
error 🔒
Error helpers for the embedder.
math 🔒
Pure dense/sparse math helpers (testable without ORT).
model_files 🔒
HuggingFace Hub download + cache-layout helpers for the BGE-M3 model files.
pool 🔒
EmbedPool async wrapper around the worker thread pool.
session 🔒
ORT execution-provider configuration and session loading.
sparse 🔒
BGE-M3 SPLADE-style sparse embedding pipeline.
tokenize 🔒
Tokenizer load + no-pad tokenization + chunk-array build helpers.
types 🔒
Public DTOs and the internal EmbedRequest enum exchanged between the pool and the worker threads.
worker 🔒
Blocking worker thread, request dispatch, and probe wiring.

Structs

EmbedPool
Async handle to the embedding worker thread pool.
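
The general shape of a handle that forwards requests to a blocking worker thread can be sketched with the standard library alone. Everything below is hypothetical: `SketchRequest`, `SketchPool`, and their methods are illustrative names, the "embedding" is a stand-in value, and the call blocks on the reply instead of being async as the documented EmbedPool is. It shows only the request/reply channel pattern, not the crate's actual API.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

/// Hypothetical request type; the real `EmbedRequest` enum is internal
/// and its variants are not shown on this page.
pub enum SketchRequest {
    /// Ask for a stand-in embedding of `text`, answered on `reply`.
    Embed { text: String, reply: Sender<Vec<f32>> },
}

/// Hypothetical handle that owns the sending side of the request queue.
pub struct SketchPool {
    tx: Sender<SketchRequest>,
}

impl SketchPool {
    /// Start one worker thread that serves requests until every sender is dropped.
    pub fn spawn() -> Self {
        let (tx, rx) = channel::<SketchRequest>();
        thread::spawn(move || {
            for req in rx {
                match req {
                    SketchRequest::Embed { text, reply } => {
                        // Stand-in for model inference: a single value,
                        // the byte length of the input text.
                        let _ = reply.send(vec![text.len() as f32]);
                    }
                }
            }
        });
        SketchPool { tx }
    }

    /// Send a request and block until the worker replies.
    /// Returns `None` if the worker thread is gone.
    pub fn embed(&self, text: &str) -> Option<Vec<f32>> {
        let (reply, result) = channel();
        self.tx
            .send(SketchRequest::Embed { text: text.to_owned(), reply })
            .ok()?;
        result.recv().ok()
    }
}
```

Carrying the reply sender inside each request keeps the worker loop free of shared state: the worker only needs the receiving end of one queue, and each caller gets its answer on a private channel.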